WorldWideScience

Sample records for bioinformatics analysis identifies

  1. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter; Nielsen, Sofie V; Lindorff-Larsen, Kresten;

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  2. Shared Genetic Etiology between Type 2 Diabetes and Alzheimer's Disease Identified by Bioinformatics Analysis.

    Science.gov (United States)

    Gao, Lei; Cui, Zhen; Shen, Liang; Ji, Hong-Fang

    2015-01-01

    Type 2 diabetes (T2D) and Alzheimer's disease (AD) are two major health issues, and increasing evidence in recent years supports a close connection between the two diseases. The present study aimed to explore the shared genetic etiology underlying T2D and AD based on the genome-wide association study (GWAS) data available through August 2014. We performed bioinformatics analyses of the T2D and AD GWAS data at the single nucleotide polymorphism (SNP), gene, and pathway levels. Six SNPs (rs111789331, rs12721046, rs12721051, rs4420638, rs56131196, and rs66626994) were identified for the first time as genetic factors shared between T2D and AD. Further functional enrichment analysis found lipid metabolism-related pathways to be common to the two disorders. The findings may have important implications for future mechanistic and interventional studies of T2D and AD. PMID:26639962

  3. Transcriptome bioinformatic analysis identifies potential therapeutic mechanism of pentylenetetrazole in down syndrome

    Directory of Open Access Journals (Sweden)

    Sharma Abhay

    2010-10-01

    Background: Pentylenetetrazole (PTZ) has recently been found to ameliorate cognitive impairment in rodent models of Down syndrome (DS). The mechanism underlying PTZ's therapeutic effect in DS is, however, not clear. Microarray profiling has previously reported differential expression, both up- and down-regulation, of genes in DS. Given this, transcriptomic data related to PTZ treatment, if available, could be used to understand the drug's therapeutic mechanism in DS. No such mammalian data exist, however. Nevertheless, a Drosophila model inspired by PTZ-induced kindling plasticity in rodents has recently been described. Microarray profiling has shown PTZ's downregulatory effect on gene expression in the fly heads. Methods: In a comparative transcriptomics approach, I analyzed the available microarray data in order to identify a potential therapeutic mechanism of PTZ in DS. The analysis used summary data of up- and down-regulated genes reported in human DS studies and of down-regulated genes reported in the Drosophila model. Results: I find that the transcriptomic correlate of chronic PTZ in Drosophila counteracts that of DS. Genes downregulated by PTZ significantly over-represent genes upregulated in DS and under-represent genes downregulated in DS. Further, the genes common to the PTZ-downregulated and DS-upregulated sets show enrichment for the MAP kinase pathway. Conclusion: My analysis suggests that downregulation of the MAP kinase pathway may mediate the therapeutic effect of PTZ in DS. Existing evidence implicating the MAP kinase pathway in DS supports this observation.
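
    The over- and under-representation of one gene list in another, as described in this abstract, is commonly tested with a hypergeometric (one-sided Fisher) calculation. The abstract does not say which test the author used, so the following is only a minimal sketch under that assumption. The gene identifiers, the universe size, and the function name overlap_tail_probabilities are hypothetical and chosen purely for illustration.

```python
from math import comb


def overlap_tail_probabilities(universe, n_marked, n_drawn, overlap):
    """Return (P(X >= overlap), P(X <= overlap)) under a hypergeometric null.

    universe : number of genes measured on both platforms (assumed known)
    n_marked : size of the first gene set (e.g. upregulated in DS)
    n_drawn  : size of the second gene set (e.g. downregulated by PTZ)
    overlap  : number of genes present in both sets
    """
    total = comb(universe, n_drawn)

    def pmf(k):
        # Probability that exactly k of the drawn genes are marked.
        return comb(n_marked, k) * comb(universe - n_marked, n_drawn - k) / total

    k_min = max(0, n_drawn - (universe - n_marked))
    k_max = min(n_marked, n_drawn)
    upper = sum(pmf(k) for k in range(max(overlap, k_min), k_max + 1))
    lower = sum(pmf(k) for k in range(k_min, min(overlap, k_max) + 1))
    return upper, lower


# Toy data (hypothetical gene labels, not taken from the study).
universe_genes = {f"g{i}" for i in range(1, 11)}
ds_up = {"g1", "g2", "g3", "g7"}
ptz_down = {"g1", "g2", "g3", "g9"}
shared = ds_up & ptz_down

p_over, p_under = overlap_tail_probabilities(
    len(universe_genes), len(ds_up), len(ptz_down), len(shared)
)
```

    A small p_over indicates over-representation of the overlap and a small p_under indicates under-representation. With real data, the choice of gene universe strongly affects both values.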

  4. Hypothetical granulin-like molecule from Fasciola hepatica identified by bioinformatics analysis.

    Science.gov (United States)

    Machicado, Claudia; Marcos, Luis A; Zimic, Mirko

    2016-01-01

    Fasciola hepatica is considered an emergent human pathogen, causing liver fibrosis or cirrhosis, conditions that are known to be direct causes of cancer. Some parasites, such as Opisthorchis viverrini, a relative of F. hepatica, have been categorized by the WHO as carcinogenic agents. Although these two parasites are from the same class (Trematoda), the role of F. hepatica in carcinogenesis is unclear. We hypothesized that F. hepatica might share some features with O. viverrini and be responsible for inducing proliferation of host cells. We analyzed the recently released genome of F. hepatica, looking for a gene coding for a granulin-like growth factor, a protein secreted by O. viverrini (Ov-GRN-1) that is a potent stimulator of host cell proliferation. Using computational biology tools, we identified a granulin-like molecule in F. hepatica, here termed FhGLM, which has a high level of sequence identity to Ov-GRN-1 and human progranulin. We found evidence of an upstream promoter compatible with the expression of FhGLM. The FhGLM architecture was found to have five granulin domains, one of which, domain 3, is homologous to Ov-GRN-1 and human GRNC. The structure of FhGLM granulin domain 3 was found to have the overall fold of its homologue, human GRNC. Our findings show the presence in F. hepatica of a homologue of a potent modulator of cell growth that might, like other granulins, have a proliferative action on host cells during fascioliasis. Future experimental assays demonstrating the presence of FhGLM in F. hepatica are needed to confirm our hypothesis. PMID:27386259

  6. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Bernd Timmermann

    BACKGROUND: Colorectal cancer (CRC) is, with approximately 1 million cases, the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope of improving early cancer diagnosis and treatment. In this direction, recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large number of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent unaffected normal colonic tissue from microsatellite stable (MSS) and microsatellite instable (MSI) colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. In accordance with predictions based on MSS and MSI pathomechanisms, we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS cancers, and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the number of most significant mutations, those with predicted altered protein functions, to 359 for MSI and 45 for MSS CRCs. In both MSI and MSS CRCs, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A (BMPR1A), a gene in which germline mutations have so far been associated with juvenile polyposis syndrome, and show that the mutations impair the protein's function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.

  7. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    ABSTRACT: The traditional methods for mining foods for bioactive peptides are tedious and long. As in the drug industry, identifying and delivering a commercial health ingredient that reduces disease symptoms can take anything between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology and efficient genome mining, has been emerging as the long-awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives. The absence of food-specific bioinformatic mining tools, the slow integration of experimental mining and bioinformatics, and the important differences between experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery. In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, thereby achieving a higher success rate.

  8. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  9. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5.

    Science.gov (United States)

    Mefford, Megan E; Kunstman, Kevin; Wolinsky, Steven M; Gabuzda, Dana

    2015-07-01

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120-CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues.

  10. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    Energy Technology Data Exchange (ETDEWEB)

    Mefford, Megan E., E-mail: megan_mefford@hms.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Kunstman, Kevin, E-mail: kunstman@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Wolinsky, Steven M., E-mail: s-wolinsky@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Gabuzda, Dana, E-mail: dana_gabuzda@dfci.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Department of Neurology (Microbiology and Immunobiology), Harvard Medical School, Boston, MA (United States)]

    2015-07-15

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.

  11. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    International Nuclear Information System (INIS)

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions

  12. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    The drastic increase in the number of coronaviruses discovered and coronavirus genomes sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera exist: Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci estimated the time of the most recent common ancestor of human/civet SARS-related coronavirus to be 1999-2002, with an estimated substitution rate of 4 × 10⁻⁴ to 2 × 10⁻² substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG-suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.
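
    The G + C content quoted in this abstract is a simple proportion: the number of G and C bases divided by the number of unambiguous bases. The following is a minimal sketch of that arithmetic. The function name gc_content and the example sequence are hypothetical and do not come from the cited work.

```python
def gc_content(sequence):
    """Return the G + C content of a nucleotide sequence as a percentage.

    Ambiguous symbols such as N and any gap characters are ignored, and
    both DNA (T) and RNA (U) alphabets are accepted.
    """
    bases = [base for base in sequence.upper() if base in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous nucleotides")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


# Hypothetical fragment used only to exercise the function.
example_fragment = "ATGCGC"
example_gc_percent = gc_content(example_fragment)
```

    Applied to a whole genome sequence, the same calculation gives the genome-wide figure. Codon usage analyses typically also look at GC content at individual codon positions, which this sketch does not cover.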

  13. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    OpenAIRE

    Nora Khaldi

    2012-01-01

    ABSTRACT: The traditional methods for mining foods for bioactive peptides are tedious and long. As in the drug industry, identifying and delivering a commercial health ingredient that reduces disease symptoms can take anything between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational b...

  14. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  15. Bioinformatic analysis of the nucleolus

    DEFF Research Database (Denmark)

    Leung, Anthony K L; Andersen, Jens S; Mann, Matthias;

    2003-01-01

    The nucleolus is a plurifunctional, nuclear organelle, which is responsible for ribosome biogenesis and many other functions in eukaryotes, including RNA processing, viral replication and tumour suppression. Our knowledge of the human nucleolar proteome has been expanded dramatically by the two...... novel or uncharacterized proteins. This review focuses on how to apply the derived knowledge of this newly recognized nucleolar proteome, such as their amino acid/peptide composition and their homologies across species, to explore the function and dynamics of the nucleolus, and suggests ways to identify......, in silico, possible functions of the novel/uncharacterized proteins and potential interaction networks within the human nucleolus, or between the nucleolus and other nuclear organelles, by drawing resources from the public domain....

  16. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    Next generation sequencing (NGS) has revolutionized the field of genomics and its wide range of applications has resulted in the genome-wide analysis of hundreds of species and the development of thousands of computational tools. This thesis represents my work on NGS analysis of four species, Lotus...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and microrrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  17. Bioinformatic analysis of patient-derived ASPS gene expressions and ASPL-TFE3 fusion transcript levels identify potential therapeutic targets.

    Directory of Open Access Journals (Sweden)

    David G Covell

    Gene expression data, collected from ASPS tumors of seven different patients and from one immortalized ASPS cell line (ASPS-1), were analyzed jointly with patient ASPL-TFE3 (t(X;17)(p11;q25)) fusion transcript data to identify disease-specific pathways and their component genes. Data analysis of the pooled patient and ASPS-1 gene expression data, using conventional clustering methods, revealed a relatively small set of pathways and genes characterizing the biology of ASPS. These results could be largely recapitulated using only the gene expression data collected from patient tumor samples. The concordance between expression measures derived from ASPS-1 and both pooled and individual patient tumor data provided a rationale for extending the analysis to include patient ASPL-TFE3 fusion transcript data. A novel linear model was used to link gene expression to fusion transcript data and to identify a small set of ASPS-specific pathways and their gene expression. Cellular pathways that appear aberrantly regulated in response to the t(X;17)(p11;q25) translocation include the cell cycle and cell adhesion. The identification of pathways and gene subsets characteristic of ASPS supports current therapeutic strategies that target FLT1 and MET, while also proposing additional targeting of genes found in pathways involved in the cell cycle (CHK1), cell adhesion (ARHGD1A), cell division (CDC6), control of meiosis (RAD51L3) and mitosis (BIRC5), and chemokine-related protein tyrosine kinase activity (CCL4).

  18. Bioinformatics analysis of estrogen-responsive genes

    Science.gov (United States)

    Handel, Adam E.

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter will discuss some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  20. Bioinformatics analysis of metastasis-related proteins in hepatocellular carcinoma

    Institute of Scientific and Technical Information of China (English)

    Pei-Ming Song; Yang Zhang; Yu-Fei He; Hui-Min Bao; Jian-Hua Luo; Yin-Kun Liu; Peng-Yuan Yang; Xian Chen

    2008-01-01

    AIM: To analyze the metastasis-related proteins in hepatocellular carcinoma (HCC) and discover biomarker candidates for diagnosis and therapeutic intervention of HCC metastasis with bioinformatics tools. METHODS: Metastasis-related proteins were determined by stable isotope labeling and MS analysis and analyzed with bioinformatics resources, including Phobius, Kyoto Encyclopedia of Genes and Genomes (KEGG), Online Mendelian Inheritance in Man (OMIM) and Human Protein Reference Database (HPRD). RESULTS: All the metastasis-related proteins were linked to 83 pathways in KEGG, including MAPK and p53 signal pathways. The protein-protein interaction network showed that all the metastasis-related proteins were categorized into 19 function groups, including cell cycle, apoptosis and signal transduction. OMIM analysis linked these proteins to 186 OMIM entries. CONCLUSION: Metastasis-related proteins provide HCC cells with biological advantages in cell proliferation, migration and angiogenesis, and facilitate metastasis of HCC cells. The bird's eye view can reveal a global characteristic of metastasis-related proteins, and many differentially expressed proteins can be identified as candidates for diagnosis and treatment of HCC.

  1. [Expression of bioinformatically identified genes in skin of psoriasis patients].

    Science.gov (United States)

    2013-10-01

    Gene expression analysis for EPHA2 (EPH receptor A2), EPHB2 (EPH receptor B2), S100A9 (S100 calcium binding protein A9), PBEF (nicotinamide phosphoribosyltransferase), LILRB2 (leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 2), PLAUR (plasminogen activator, urokinase receptor), LTB (lymphotoxin beta (TNF superfamily, member 3)), and WNT5A (wingless-type MMTV integration site family, member 5A) was conducted using real-time polymerase chain reaction in psoriasis-affected versus visually intact skin specimens from 18 patients. The expression of the nine examined genes was upregulated in psoriasis-affected skin compared with visually intact skin. The highest expression was observed for the S100A9, S100A8, PBEF, WNT5A, and EPHB2 genes. S100A9 and S100A8 gene expression in psoriasis-affected skin was 100-fold higher than in visually intact skin, while PBEF, WNT5A, and EPHB2 gene expression was upregulated more than five-fold. We suggest that the high expression of these genes might be associated with the state of the pathological process in psoriasis. Moreover, the transcriptional activity of these genes might serve as a molecular indicator of the efficacy of treatment in psoriasis. PMID:25508677

  2. A Complementary Bioinformatics Approach to Identify Potential Plant Cell Wall Glycosyltransferase-Encoding Genes

    DEFF Research Database (Denmark)

    Egelund, Jack; Skjøt, Michael; Geshi, Naomi;

    2004-01-01

    . Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes-all classified in the CAZy system-have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics...

  3. Bioinformatics analysis of disordered proteins in prokaryotes

    Directory of Open Access Journals (Sweden)

    Malkov Saša N

    2011-03-01

    Background: A significant number of proteins have been shown to be intrinsically disordered, meaning that they lack a fixed 3D structure or contain regions that do not possess a well-defined 3D structure. It has also been shown that a protein's disorder content is related to its function. We have performed an exhaustive analysis and comparison of the disorder content of proteins from prokaryotic organisms (i.e., superkingdoms Archaea and Bacteria) with respect to the functional categories they belong to, i.e., Clusters of Orthologous Groups of proteins (COGs) and groups of COGs: Cellular processes (Cp), Information storage and processing (Isp), Metabolism (Me) and Poorly characterized (Pc). We also analyzed the disorder content of proteins with respect to various genomic, metabolic and ecological characteristics of the organism they belong to. We used correlations and association rule mining in order to identify the most confident associations between specific modalities of the characteristics considered and disorder content. Results: Bacteria are shown to have a somewhat higher level of protein disorder than archaea, except for proteins in the Me functional group. It is demonstrated that the Isp and Cp functional groups in particular (specifically, the L-repair function and N-cell motility and secretion COGs) possess the highest disorder content, while Me proteins, in general, possess the lowest. Disorder fractions have been confirmed to have the lowest level for the so-called order-promoting amino acids and the highest level for the so-called disorder promoters. For each pair of organism characteristics, specific modalities are identified with the maximum disorder proteins in the corresponding organisms, e.g., high genome size-high GC content organisms, facultative anaerobic-low GC content organisms, aerobic-high genome size organisms, etc. Maximum disorder in archaea is observed for high GC content-low genome size organisms, high GC content

  4. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Mitogen‐activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  5. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula.

    Science.gov (United States)

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing-data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  6. Whale song analyses using bioinformatics sequence analysis approaches

    Science.gov (United States)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools (DNA/protein sequence alignment and alignment-free methods) are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure, and the standardized Euclidean distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.]
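    Two of the metrics named in this record can be sketched briefly. The code below is a minimal illustration, assuming each theme is encoded as a string of unit symbols: a plain Smith-Waterman local alignment score (without the "repeated procedure" mentioned in the abstract) and an angle between word-frequency vectors. The scoring values, the word length k=2, and the function names are assumptions for illustration only.

```python
import math
from collections import Counter


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def word_counts(seq, k=2):
    """Counts of overlapping words (k-mers) of length k in seq."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def angle_distance(a, b, k=2):
    """Angle in radians between the k-word frequency vectors of a and b."""
    ca, cb = word_counts(a, k), word_counts(b, k)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("sequence shorter than word length k")
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)
```

    The alignment score rewards shared runs of units in order, while the angle metric ignores order beyond the word length, which is why the two can rank theme pairs differently.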

  7. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data. PMID:25863787

  8. Biochip microsystem for bioinformatics recognition and analysis

    Science.gov (United States)

    Lue, Jaw-Chyng (Inventor); Fang, Wai-Chi (Inventor)

    2011-01-01

    A system with applications in pattern recognition, or classification, of DNA assay samples. Because DNA reference and sample material in wells of an assay may be caused to fluoresce depending upon dye added to the material, the resulting light may be imaged onto an embodiment comprising an array of photodetectors and an adaptive neural network, with applications to DNA analysis. Other embodiments are described and claimed.

  9. Bioinformatics Analysis of Zinc Transporter from Baoding Alfalfa

    Institute of Scientific and Technical Information of China (English)

    Haibo WANG; Junyun GUO

    2012-01-01

    [Objective] This study aimed to perform the bioinformatics analysis of Zinc transporter (ZnT) from Baoding Alfalfa. [Method] Based on the amino acid sequence, the physical and chemical properties, hydrophilicity/hydrophobicity, and secondary structure of ZnT from Baoding alfalfa were predicted by a series of bioinformatics software. The transmembrane domains were predicted by using different online tools. [Result] ZnT is a hydrophobic protein containing 408 amino acids with the theoretical pI of 5.94, and it has 7 potential transmembrane hydrophobic regions. In the secondary structure, α-helix (Hh) accounted for 48.04%, extended strand (Ee) for 9.56%, and random coil (Cc) for 42.40%, which accorded with the characteristics of a transmembrane protein. [Conclusion] mZnT is a member of CDF family, responsible for transporting Zn^2+ out of the cell membrane to reduce the concentration and toxicity of Zn^2+.
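    The secondary-structure percentages in this record are simple proportions of the 408 residues. As a worked check, the sketch below uses residue counts of 196, 39 and 173; these counts are back-calculated assumptions that happen to be consistent with the reported length and percentages, and they are not stated in the abstract.

```python
def composition_percent(counts):
    """Percentage of residues in each secondary-structure state, to 2 decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues counted")
    return {state: round(100 * n / total, 2) for state, n in counts.items()}


# Assumed counts (not from the source) that sum to the reported 408 residues.
counts = {"Hh": 196, "Ee": 39, "Cc": 173}
percent = composition_percent(counts)
```

    With these assumed counts the helix, strand and coil shares come out as 48.04, 9.56 and 42.4 percent, matching the figures quoted in the record.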

  10. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    Directory of Open Access Journals (Sweden)

    Deodutta Roy

    2015-10-01

    Full Text Available We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure of environmental chemicals with estrogenic activity; epidemiologic association between endocrine disrupting chemical (EDC) and health effects, such as breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected classes of environmental chemicals: polychlorinated biphenyls (PCBs), bisphenols (BPs), and phthalates. Meta-analysis showed that PCBs exposure, not Bisphenol A (BPA) and phthalates, increased the summary odds ratio for breast cancer and endometriosis. Bioinformatics analysis of gene-EDC interactions and disease associations identified several hundred genes that were altered by exposure to PCBs, phthalate or BPA. EDCs-modified genes in breast neoplasms and endometriosis are part of steroid hormone signaling and inflammation pathways. All three EDCs (PCB 153, phthalates, and BPA) influenced five common genes (CYP19A1, EGFR, ESR2, FOS, and IGF1) in breast cancer as well as in endometriosis. These genes are environmentally and estrogen responsive, altered in human breast and uterine tumors and endometriosis lesions, and part of Mitogen Activated Protein Kinase (MAPK) signaling pathways in cancer. Our findings suggest that breast cancer and endometriosis share some common environmental and molecular risk factors.

  11. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Directory of Open Access Journals (Sweden)

    Xiao Liu

    Full Text Available Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  12. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    Science.gov (United States)

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
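    The two sequence signatures described in this record (Ser followed by three to five Pro, and Tyr-X-Tyr) lend themselves to a simple pattern scan. The sketch below is only an illustration using regular expressions on an unmodified one-letter amino acid sequence; the function name and the toy sequence are hypothetical, and the abstract does not give the thresholds the authors used to classify a protein as an EXT.

```python
import re

# Ser followed by three to five Pro, e.g. SPPP, SPPPP, SPPPPP.
SP_REPEAT = re.compile(r"SP{3,5}")
# Tyr-X-Tyr; a lookahead is used so overlapping motifs (YAYAY) are all counted.
YXY = re.compile(r"(?=Y[A-Z]Y)")


def count_ext_motifs(seq):
    """Count SP(3-5) repeats and Tyr-X-Tyr motifs in a protein sequence."""
    seq = seq.upper()
    return {
        "SPn": len(SP_REPEAT.findall(seq)),
        "YXY": len(YXY.findall(seq)),
    }


# Hypothetical toy sequence, not a real extensin.
toy = "MSPPPPKYVYSPPPYHYSPPA"
motifs = count_ext_motifs(toy)
```

    A scan like this only flags candidates; separating classical, short and chimeric classes would need further criteria that the record does not spell out.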

  14. Secretome Analysis of Lipid-Induced Insulin Resistance in Skeletal Muscle Cells by a Combined Experimental and Bioinformatics Workflow

    DEFF Research Database (Denmark)

    Deshmukh, Atul S; Cox, Juergen; Jensen, Lars Juhl;

    2015-01-01

    , in principle, allows an unbiased and comprehensive analysis of cellular secretomes; however, the distinction of bona fide secreted proteins from proteins released upon lysis of a small fraction of dying cells remains challenging. Here we applied highly sensitive MS and streamlined bioinformatics to analyze......-resistant conditions. Our study demonstrates an efficient combined experimental and bioinformatics workflow to identify putative secreted proteins from insulin-resistant skeletal muscle cells, which could easily be adapted to other cellular models....

  15. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphical User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous
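    The idea of defining workflow operators once and reusing them as generic components can be sketched in a few lines. The code below is a hypothetical illustration in Python, not GPIPE's XML format or the PISE interface: the names `sequence`, `iterate` and `conditional` and the toy steps are assumptions made for this example.

```python
def sequence(*steps):
    """Compose steps so that the output of one becomes the input of the next."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


def iterate(step):
    """Apply one step to every item of a collection."""
    def run(items):
        return [step(item) for item in items]
    return run


def conditional(predicate, if_true, if_false):
    """Choose between two steps based on a test of the current data."""
    def run(data):
        return if_true(data) if predicate(data) else if_false(data)
    return run


# Toy steps standing in for real analysis methods.
clean = iterate(str.upper)


def keep_long(seqs):
    return [s for s in seqs if len(s) >= 4]


report = conditional(
    lambda seqs: len(seqs) > 0,
    lambda seqs: ",".join(seqs),
    lambda seqs: "no sequences",
)

pipeline = sequence(clean, keep_long, report)
```

    Because each operator returns an ordinary callable, a pipeline built this way can itself be nested inside another `sequence` or `iterate`, which is the kind of re-use the record argues for.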

  16. Proof of concept: A bioinformatic and serological screening method for identifying new peptide antigens for Chlamydia trachomatis related sequelae in women

    OpenAIRE

    Stansfield, Scott H.; Patel, Pooja; Debattista, Joseph; Charles W Armitage; Cunningham, Kelly; Timms, Peter; Allan, John; Mittal, Aruna; Huston, Wilhelmina M.

    2013-01-01

    This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof of concept approach which could be used to develop an epitope-based serological diagnostic for C. trachomatis related infertility in women. A bioinformatics analysis was conducted examining several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants ...

  17. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs) are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA) has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV) held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  18. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    OpenAIRE

    Deodutta Roy; Marisa Morgan; Changwon Yoo; Alok Deoraj; Sandhya Roy; Vijay Kumar Yadav; Mohannad Garoub; Hamza Assaggaf; Mayur Doke

    2015-01-01

    We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure of environmental chemicals with estrogenic activity; epidemiologic association between endocrine disrupting chemical (EDC) and health effects, such as, breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected class of environmental chemicals, polychlorinated biphenyls (PC...

  19. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    OpenAIRE

    Alva, V.; Nam, S.; Söding, J.; Lupas, A.

    2016-01-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been av...

  20. Proof of concept: A bioinformatic and serological screening method for identifying new peptide antigens for Chlamydia trachomatis related sequelae in women.

    Science.gov (United States)

    Stansfield, Scott H; Patel, Pooja; Debattista, Joseph; Armitage, Charles W; Cunningham, Kelly; Timms, Peter; Allan, John; Mittal, Aruna; Huston, Wilhelmina M

    2013-01-01

    This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof of concept approach which could be used to develop an epitope-based serological diagnostic for C. trachomatis related infertility in women. A bioinformatics analysis was conducted examining several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants (all female) were categorized into the following cohorts based on their infection and gynecological history; acute (single treated infection with C. trachomatis), multiple (more than one C. trachomatis infection, all treated), sequelae (PID or tubal infertility with a history of C. trachomatis infection), and infertile (no history of C. trachomatis infection and no detected tubal damage). The bioinformatics strategy identified several promising epitopes. Participants who reacted positively in the peptide 11 ELISA were found to have an increased likelihood of being in the sequelae cohort compared to the infertile cohort with an odds ratio of 16.3 (95% c.i. 1.65-160), with 95% specificity and 46% sensitivity (0.19-0.74). The peptide 11 ELISA has the potential to be further developed as a screening tool for use during the early IVF work up and provides proof of concept that there may be further peptide antigens which could be identified using bioinformatics and screening approaches. PMID:24600556
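    The odds ratio and confidence interval quoted in this record follow the standard 2x2-table calculation. The sketch below uses hypothetical cell counts (6 positive and 7 negative in the sequelae cohort, 1 positive and 19 negative in the infertile cohort) chosen because they are consistent with the reported 46% sensitivity, 95% specificity, odds ratio and interval; the actual counts are not given in the abstract.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    a, b: test-positive and test-negative counts in the case group.
    c, d: test-positive and test-negative counts in the comparison group.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, assumed for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(6, 7, 1, 19)
sensitivity = 6 / (6 + 7)
specificity = 19 / (1 + 19)
```

    The very wide interval comes mostly from the single positive in the comparison group, whose reciprocal dominates the standard error.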

  1. Proof of concept: A bioinformatic and serological screening method for identifying new peptide antigens for Chlamydia trachomatis related sequelae in women

    Science.gov (United States)

    Stansfield, Scott H.; Patel, Pooja; Debattista, Joseph; Armitage, Charles W.; Cunningham, Kelly; Timms, Peter; Allan, John; Mittal, Aruna; Huston, Wilhelmina M.

    2013-01-01

    This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof of concept approach which could be used to develop an epitope-based serological diagnostic for C. trachomatis related infertility in women. A bioinformatics analysis was conducted examining several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants (all female) were categorized into the following cohorts based on their infection and gynecological history; acute (single treated infection with C. trachomatis), multiple (more than one C. trachomatis infection, all treated), sequelae (PID or tubal infertility with a history of C. trachomatis infection), and infertile (no history of C. trachomatis infection and no detected tubal damage). The bioinformatics strategy identified several promising epitopes. Participants who reacted positively in the peptide 11 ELISA were found to have an increased likelihood of being in the sequelae cohort compared to the infertile cohort with an odds ratio of 16.3 (95% c.i. 1.65–160), with 95% specificity and 46% sensitivity (0.19–0.74). The peptide 11 ELISA has the potential to be further developed as a screening tool for use during the early IVF work up and provides proof of concept that there may be further peptide antigens which could be identified using bioinformatics and screening approaches. PMID:24600556

  2. Mutational and Bioinformatic Analysis of Haloarchaeal Lipobox-Containing Proteins

    Directory of Open Access Journals (Sweden)

    Stefanie Storf

    2010-01-01

    Full Text Available A conserved lipid-modified cysteine found in a protein motif commonly referred to as a lipobox mediates the membrane anchoring of a subset of proteins transported across the bacterial cytoplasmic membrane via the Sec pathway. Sequenced haloarchaeal genomes encode many putative lipoproteins and recent studies have confirmed the importance of the conserved lipobox cysteine for signal peptide processing of three lipobox-containing proteins in the model archaeon Haloferax volcanii. We have extended these in vivo analyses to additional Hfx. volcanii substrates, supporting our previous in silico predictions and confirming the diversity of predicted Hfx. volcanii lipoproteins. Moreover, using extensive comparative secretome analyses, we identified genes encoding putative lipoproteins across a wide range of archaeal species. While our in silico analyses, supported by in vivo data, indicate that most haloarchaeal lipoproteins are Tat substrates, these analyses also predict that many crenarchaeal species lack lipoproteins altogether and that other archaea, such as nonhalophilic euryarchaeal species, transport lipoproteins via the Sec pathway. To facilitate the identification of genes that encode potential haloarchaeal Tat-lipoproteins, we have developed TatLipo, a bioinformatic tool designed to detect lipoboxes in haloarchaeal Tat signal peptides. Our results provide a strong foundation for future studies aimed at identifying components of the archaeal lipoprotein biogenesis pathway.

  3. Bioinformatic Analysis of Structural Proteins of Paramyxovirus Tianjin Strain

    Institute of Scientific and Technical Information of China (English)

    Li-ying SHI; Mei LI; Xiao-mian LI; Li-jun YUAN; Qing WANG

    2008-01-01

    The amino acid sequences of the NP, P, M, F, HN and L proteins of the paramyxovirus Tianjin strain were analyzed by using bioinformatics methods. Phylogenetic analysis based on 6 structural proteins among the Tianjin strain and 25 paramyxoviruses showed that the Tianjin strain belonged to the genus Respirovirus, in the subfamily Paramyxovirinae, and was most closely related to Sendai virus (SeV). Phylogenetic analysis with 14 known SeVs showed that the Tianjin strain represented a new evolutionary lineage. Similarity comparisons indicated that the Tianjin strain P protein was poorly conserved, sharing only 78.7%-91.9% amino acid identity with the known SeVs, while the L protein was the most conserved, having 96.0%-98.0% amino acid identity with the known SeVs. Alignments of amino acid sequences of the 6 structural proteins clearly showed that the Tianjin strain possessed many unique amino acid substitutions in its protein sequences: 15 in NP, 29 in P, 6 in M, 13 in F, 18 in HN, and 29 in L. These results revealed that the Tianjin strain was most likely a new genotype of SeV. The presence of unique amino acid substitutions suggests that the Tianjin strain may differ significantly in biological, pathological, immunological, or epidemiological characteristics from the known SeVs.
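    The amino acid identity ranges in this record are pairwise percentages computed from aligned sequences. The sketch below shows one common definition, counting identical residues over alignment columns where neither sequence has a gap; the abstract does not state which denominator the authors used, so this choice and the toy sequences are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100 * matches / len(columns)


# Hypothetical toy alignment, not taken from the Tianjin strain data.
identity = percent_identity("MKT-LV", "MRTALV")
```

    Other conventions divide by the full alignment length or by the shorter sequence, which can shift the reported percentage by a few points for gapped alignments.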

  4. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Directory of Open Access Journals (Sweden)

    Michael Römer

    Full Text Available Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  6. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  7. Somatic populations of PGT135-137 HIV-1-neutralizing antibodies identified by 454 pyrosequencing and bioinformatics

    Directory of Open Access Journals (Sweden)

    Jiang Zhu

    2012-09-01

    Full Text Available Select HIV-1-infected individuals develop sera capable of neutralizing diverse viral strains. The molecular basis of this neutralization is currently being deciphered by the isolation of HIV-1-neutralizing antibodies. In one infected donor, three neutralizing antibodies, PGT135-137, were identified by assessment of neutralization from individually sorted B cells and found to recognize an epitope containing an N-linked glycan at residue 332 on HIV-1 gp120. Here we use deep sequencing and bioinformatics methods to interrogate the B cell record of this donor to gain a more complete understanding of the humoral immune response. PGT135-137-gene family-specific primers were used to amplify heavy and light chain-variable domain sequences. 454 pyrosequencing produced 141,298 heavy-chain sequences of IGHV4-39 origin and 87,229 light-chain sequences of IGKV3-15 origin. A number of heavy and light chain sequences of ~90% identity to PGT137, several to PGT136, and none of high identity to PGT135 were identified. After expansion of these sequences to include close phylogenetic relatives, a total of 202 heavy-chain sequences and 72 light-chain sequences were identified. These sequences were clustered into populations of 95% identity comprising 15 for heavy chain and 10 for light chain, and a select sequence from each population was synthesized and reconstituted with a PGT137-partner chain. Reconstituted antibodies showed varied neutralization phenotypes for HIV-1 clade A and D isolates. Sequence diversity of the antibody population represented by these tested sequences was notably higher than observed with a 454 pyrosequencing-control analysis on 10 antibodies of defined sequence, suggesting that this diversity results primarily from somatic maturation. Our results thus provide an example of how pathogens like HIV-1 are opposed by a varied humoral immune response, derived from intrinsic mechanisms of antibody development, and embodied by somatic populations
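
    The abstract above describes grouping sequences into populations of 95% identity. The authors' actual clustering procedure is not given in this record, so the following is only a minimal sketch of one common approach (greedy clustering against a cluster representative). It assumes the sequences are already aligned to equal length; the function names `identity` and `greedy_cluster` are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_cluster(seqs, threshold=0.95):
    """Assign each sequence to the first cluster whose representative
    (its first member) is at least `threshold` identical; otherwise
    start a new cluster. This is an illustrative assumption, not the
    method used in the study."""
    clusters = []
    for s in seqs:
        for cluster in clusters:
            if identity(s, cluster[0]) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return clusters
```

    Greedy clustering depends on input order, so sorting the sequences (for example by abundance) before clustering would be a reasonable design choice; one member per cluster can then be picked for synthesis, as the abstract describes.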

  8. Bioinformatics analysis of two-component regulatory systems in Staphylococcus epidermidis

    Institute of Scientific and Technical Information of China (English)

    QIN Zhiqiang; ZHONG Yang; ZHANG Jian; HE Youyu; WU Yang; JIANG Juan; CHEN Jiemin; LUO Xiaomin; QU Di

    2004-01-01

    Using bioinformatics analysis, sixteen pairs of two-component regulatory systems are identified in the genome of Staphylococcus epidermidis strain ATCC12228, which was newly sequenced by our laboratory for Medical Molecular Virology and the Chinese National Human Genome Center at Shanghai. Comparative analysis of the two-component regulatory systems in S. epidermidis with those of S. aureus and Bacillus subtilis shows that these systems may regulate some important biological functions, e.g. growth, biofilm formation, and expression of virulence factors in S. epidermidis. Two conserved domains, i.e. HATPase_c and REC domains, are found in all 16 pairs of two-component proteins. Homologous modelling analysis indicates that there are 4 similar HATPase_c domain structures of histidine kinases and 13 similar REC domain structures of response regulators, and there is one AMP-PNP binding pocket in the HATPase_c domain and three active aspartate residues in the REC domain. Preliminary experiments reveal that the bioinformatics analysis of the conserved domain structures in the two-component regulatory systems in S. epidermidis may provide useful information for the discovery of potential drug targets.

  9. Bioinformatic science and devices for computer analysis and visualization of macromolecules

    Directory of Open Access Journals (Sweden)

    Yu.B. Porozov

    2010-06-01

    Full Text Available The goals and objectives of bioinformatic science are presented in the article. The main methods and approaches used in computer biology are highlighted. Areas in which bioinformatic science can greatly facilitate and speed up the work of practical biologist and pharmacologist are revealed. The features of both the basic packages and software devices for complete, thorough analysis of macromolecules and for development and modeling of ligands and binding centers are described

  10. Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis and beyond.

    Science.gov (United States)

    Lee, Sungmin; Min, Hyeyoung; Yoon, Sungroh

    2016-07-01

    A wide variety of large-scale data have been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses because of factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but time-consuming bioinformatics programs. For an informative review, we perform in-depth profiling and analysis of 23 key bioinformatics programs using multiple types of devices. Based on the insight we obtain from this research, we further discuss issues related to design and optimize bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover traditional and emerging areas of importance, such as alignment, assembly, mapping, expression analysis, variant calling and metagenomics. We explain how acceleration by parallelization can be combined with SSDs for improved performance and also how using SSDs can expedite important bioinformatics pipelines, such as variant calling by the Genome Analysis Toolkit and transcriptome analysis using RNA sequencing. We hope that this review can provide useful directions and tips to accompany future bioinformatics algorithm design procedures that properly consider new generations of powerful storage devices. PMID:26330577

  11. Bioinformatics Analysis of the Duck Enteritis Virus UL54 Gene

    Directory of Open Access Journals (Sweden)

    Chaoyue Liu

    2014-04-01

    Full Text Available In this study, we analyze the Duck Enteritis Virus (DEV) UL54 gene, which was isolated and identified in our lab (GenBank accession no. EU071033), to support further research on DEV. DNA sequence analysis showed that the identified ORF, composed of 1377 bp, encodes 458 amino acids with a predicted Mr of 51.75 kDa. Multiple sequence alignment suggested that the UL54 gene was highly conserved in Alphaherpesvirinae and was similar to other herpesviral UL54 genes. Phylogenetic analysis of the DEV UL54 gene revealed that DEV had a close evolutionary relationship with Gallid Herpesvirus 2 (GaHV-2), Gallid Herpesvirus 3 (GaHV-3), and Meleagrid Herpesvirus 1 (MeHV-1), and should belong to a single cluster within the Alphaherpesvirinae subfamily.
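
    The ORF length and protein length reported above can be cross-checked with simple arithmetic: 1377 bp is 459 codons, and if the last codon is a stop codon (an assumption, since the record does not say so explicitly) the product has 458 residues. The sketch below shows that check; the helper name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_bp` nucleotides.

    Assumes the ORF length is a whole number of codons and, by default,
    that the final codon is a stop codon that adds no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Values taken from the abstract above.
print(orf_protein_length(1377))
```

    Under the stated assumption the result agrees with the 458 amino acids given in the abstract.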

  12. Cloning, expression, purification and bioinformatic analysis of 2-methylcitrate synthase from Mycobacterium tuberculosis

    Institute of Scientific and Technical Information of China (English)

    Kandasamy Eniyan; Urmi Bajpai

    2015-01-01

    Objective: To clone, express and purify the 2-methylcitrate synthase (Rv1131) gene of Mycobacterium tuberculosis (M. tuberculosis) and to study its structural characteristics using various bioinformatics tools. Methods: The Rv1131 gene was amplified by polymerase chain reaction using M. tuberculosis H37Rv genomic DNA, cloned into pGEM-T easy vector and sequenced. The gene was sub-cloned in pET28c vector, expressed in Escherichia coli BL21 (E. coli BL21) (DE3) cells and the recombinant protein was identified by Western blotting. The protein was purified using nickel affinity chromatography and the structural characteristics like sub-cellular localization, presence of transmembrane helices and secondary structure of the protein were predicted by bioinformatics tools. Tertiary structure of the protein and phylogenetic analysis were also established by in silico analysis. Results: The expression of the recombinant protein (Rv1131) was confirmed by Western blotting using anti-HIS antibodies and the protein was purified from the soluble fraction. In silico analysis showed that the protein contains no signal peptide and transmembrane helices. Active site prediction showed that the protein has histidine and aspartic acid residues at 242, 281 & 332 positions respectively. Phylogenetic analysis showed 100% homology with major mycobacterial species. Secondary structure prediction shows that 2-methylcitrate synthase contains 51.9% alpha-helix, 8.7% extended strand and 39.4% random coils. Tertiary structure of the protein was also established. Conclusions: The enzyme 2-methylcitrate synthase from M. tuberculosis H37Rv has been successfully expressed and purified. The purified protein will further be utilized to develop assay methods for screening new inhibitors.

  13. Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes

    Science.gov (United States)

    Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.

    2013-01-01

    Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469
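
    The record above does not state how Cake combines the output of its four callers. One simple way to integrate several variant callers, shown here purely as an illustrative assumption and not as Cake's actual rule, is to keep a variant when at least a minimum number of callers report it. The function name `consensus_variants`, the `min_callers` parameter and the (chromosome, position, ref, alt) tuple layout are all hypothetical.

```python
from collections import Counter


def consensus_variants(call_sets, min_callers=2):
    """Return variants reported by at least `min_callers` of the callers.

    `call_sets` is an iterable with one collection of variants per caller;
    each variant is a hashable tuple such as (chrom, pos, ref, alt).
    This voting rule is an assumption for illustration only.
    """
    counts = Counter()
    for calls in call_sets:
        # set() so that a caller reporting a variant twice counts once
        counts.update(set(calls))
    return sorted(v for v, n in counts.items() if n >= min_callers)


# Toy input for four hypothetical callers.
caller_a = [("chr1", 100, "A", "T"), ("chr1", 250, "G", "C")]
caller_b = [("chr1", 100, "A", "T")]
caller_c = [("chr1", 100, "A", "T"), ("chr2", 75, "C", "G")]
caller_d = [("chr1", 250, "G", "C")]
shared = consensus_variants([caller_a, caller_b, caller_c, caller_d], min_callers=2)
```

    Raising `min_callers` trades sensitivity for precision; the threshold that suits a given dataset would have to be chosen empirically.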

  14. Expression and bioinformatic analysis of lymphoma-associated novel gene KIAA0372

    Institute of Scientific and Technical Information of China (English)

    BAI Xiangyang; TANG Duozhuang; ZHU Tao; SUN Lishi; YAN Lingling; LU Yunping; ZHOU Jianfeng; MA Ding

    2007-01-01

    The purpose of this study was to explore the differentially expressed genes in lymph-node cells (LNC) of lymphomas and reactive lymph node hyperplasia, and to perform an initial bioinformatic analysis on a novel gene, KIAA0372, which is highly expressed in the LNC of lymphomas. mRNA extracted from LNC of lymphomas and reactive lymph node hyperplasia were respectively marked with biotin and hybridized with Gene Expression Chips, resulting in differentially expressed genes. Initial bioinformatic analysis was then performed on a novel gene named KIAA0372, whose function has not yet been explored. Its structure and genomic location, its product's physical and chemical properties, subcellular localization and functional domains, were also predicted. Further, a systematic evolution analysis was performed on similar proteins from among several species. Using Gene Expression Chips, many differentially expressed genes were uncovered. Efficient bioinformatic analysis has fundamentally determined that KIAA0372 is an extracellular protein which may be involved in TGF-β signaling. Microarray is an efficient and high throughput strategy for detection of differentially expressed genes. And KIAA0372 is thought to be a potential target for tumor research using bioinformatic analysis.

  15. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV), HAdV–C2 and HAdV–C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV–C2 and HAdV–C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  16. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    Science.gov (United States)

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  17. Bioinformatic Analysis of Putative Gene Products Encoded in SARS-HCoV Genome

    Institute of Scientific and Technical Information of China (English)

    赵心刚; 韩敬东; 宁元亨; 孟安明; 陈晔光

    2003-01-01

    The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus named as SARS-HCoV.Using bioinformatic methods, we have performed a detailed domain search.In addition to the viral structure proteins, we have found that several putative polypeptides share sequence similarity to known domains or proteins.This study may provide a basis for future studies on the infection and replication process of this notorious virus.

  18. Proteomic and bioinformatic analysis of epithelial tight junction reveals an unexpected cluster of synaptic molecules

    Directory of Open Access Journals (Sweden)

    Tang Vivian W

    2006-12-01

    Full Text Available Abstract Background Zonula occludens, also known as the tight junction, is a specialized cell-cell interaction characterized by membrane "kisses" between epithelial cells. A cytoplasmic plaque of ~100 nm corresponding to a meshwork of densely packed proteins underlies the tight junction membrane domain. Due to its enormous size and difficulties in obtaining a biochemically pure fraction, the molecular composition of the tight junction remains largely unknown. Results A novel biochemical purification protocol has been developed to isolate tight junction protein complexes from cultured human epithelial cells. After identification of proteins by mass spectroscopy and fingerprint analysis, candidate proteins are scored and assessed individually. A simple algorithm has been devised to incorporate transmembrane domains and protein modification sites for scoring membrane proteins. Using this new scoring system, a total of 912 proteins have been identified. These 912 hits are analyzed using a bioinformatics approach to bin the hits in 4 categories: configuration, molecular function, cellular function, and specialized process. Prominent clusters of proteins related to the cytoskeleton, cell adhesion, and vesicular traffic have been identified. Weaker clusters of proteins associated with cell growth, cell migration, translation, and transcription are also found. However, the strongest clusters belong to synaptic proteins and signaling molecules. Localization studies of key components of synaptic transmission have confirmed the presence of both presynaptic and postsynaptic proteins at the tight junction domain. To correlate proteomics data with structure, the tight junction has been examined using electron microscopy. This has revealed many novel structures including end-on cytoskeletal attachments, vesicles fusing/budding at the tight junction membrane domain, secreted substances encased between the tight junction kisses, endocytosis of tight junction

  19. Bioinformatics analysis of differentially expressed proteins in prostate cancer based on proteomics data

    Directory of Open Access Journals (Sweden)

    Chen C

    2016-03-01

    Full Text Available Chen Chen,1 Li-Guo Zhang,1 Jian Liu,1 Hui Han,1 Ning Chen,1 An-Liang Yao,1 Shao-San Kang,1 Wei-Xing Gao,1 Hong Shen,2 Long-Jun Zhang,1 Ya-Peng Li,1 Feng-Hong Cao,1 Zhi-Guo Li3 1Department of Urology, North China University of Science and Technology Affiliated Hospital, 2Department of Modern Technology and Education Center, 3Department of Medical Research Center, International Science and Technology Cooperation Base of Geriatric Medicine, North China University of Science and Technology, Tangshan, People’s Republic of China Abstract: We mined the literature for proteomics data to examine the occurrence and metastasis of prostate cancer (PCa) through a bioinformatics analysis. We divided the differentially expressed proteins (DEPs) into two groups: the group consisting of PCa and benign tissues (P&b) and the group presenting both high and low PCa metastatic tendencies (H&L). In the P&b group, we found 320 DEPs, 20 of which were reported more than three times, and DES was the most commonly reported. Among these DEPs, the expression levels of FGG, GSN, SERPINC1, TPM1, and TUBB4B have not yet been correlated with PCa. In the H&L group, we identified 353 DEPs, 13 of which were reported more than three times. Among these DEPs, MDH2 and MYH9 have not yet been correlated with PCa metastasis. We further confirmed that DES was differentially expressed between 30 cancer and 30 benign tissues. In addition, DEPs associated with protein transport, regulation of actin cytoskeleton, and the extracellular matrix (ECM)–receptor interaction pathway were prevalent in the H&L group and have not yet been studied in detail in this context. Proteins related to homeostasis, the wound-healing response, focal adhesions, and the complement and coagulation pathways were overrepresented in both groups. Our findings suggest that the repeatedly reported DEPs in the two groups may function as potential biomarkers for detecting PCa and predicting its aggressiveness. Furthermore

  20. Microarray-bioinformatics analysis of altered genomic expression profiles between human fetal and infant myocardium

    Institute of Scientific and Technical Information of China (English)

    KONG Bo; LIU Ying-long; LÜ Xiao-dong

    2008-01-01

    Background The physiological differences between fetal and postnatal heart have been well characterized at the cellular level. However, the genetic mechanisms governing and regulating these differences have only been partially elucidated. Elucidation of the differentially expressed genes profile before and after birth has never been systematically proposed and analyzed. Methods The human oligonucleotide microarray and bioinformatics analysis approaches were applied to isolate and classify the differentially expressed genes between fetal and infant cardiac tissue samples. Quantitative real-time PCR was used to confirm the results from the microarray. Results Two hundred and forty-two differentially expressed genes were discovered and classified into 13 categories, including genes related to energy metabolism, myocyte hyperplasia, development, muscle contraction, protein synthesis and degradation, extracellular matrix components, transcription factors, apoptosis, signal pathway molecules, organelle organization and several other biological processes. Moreover, 95 genes were identified which had not previously been reported to be expressed in the heart. Conclusions The study systematically analyzed the alteration of the gene expression profile between the human fetal and infant myocardium. A number of genes were discovered which had not been reported to be expressed in the heart. The data provided insight into the physical development mechanisms of the heart before and after birth. KONG Bo and LÜ Xiao-dong contributed equally to this study.

  1. R/parallel - speeding up bioinformatics analysis with R

    NARCIS (Netherlands)

    Vera, Gonzalo; Jansen, Ritsert C.; Suppi, Remo L.

    2008-01-01

    Background: R is the preferred tool for statistical analysis of many bioinformaticians due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are ge

  2. Deep Sequencing Analysis of Nucleolar Small RNAs: Bioinformatics.

    Science.gov (United States)

    Bai, Baoyan; Laiho, Marikki

    2016-01-01

    Small RNAs (size 20-30 nt) of various types have been actively investigated in recent years, and their subcellular compartmentalization and relative concentrations are likely to be of importance to their cellular and physiological functions. Comprehensive data on this subset of the transcriptome can only be obtained by application of high-throughput sequencing, which yields data that are inherently complex and multidimensional, as sequence composition, length, and abundance will all inform to the small RNA function. Subsequent data analysis, hypothesis testing, and presentation/visualization of the results are correspondingly challenging. We have constructed small RNA libraries derived from different cellular compartments, including the nucleolus, and asked whether small RNAs exist in the nucleolus and whether they are distinct from cytoplasmic and nuclear small RNAs, the miRNAs. Here, we present a workflow for analysis of small RNA sequencing data generated by the Ion Torrent PGM sequencer from samples derived from different cellular compartments. PMID:27576724

  3. Bioinformatics and biomarker discovery "Omic" data analysis for personalized medicine

    CERN Document Server

    Azuaje, Francisco

    2010-01-01

    This book is designed to introduce biologists, clinicians and computational researchers to fundamental data analysis principles, techniques and tools for supporting the discovery of biomarkers and the implementation of diagnostic/prognostic systems. The focus of the book is on how fundamental statistical and data mining approaches can support biomarker discovery and evaluation, emphasising applications based on different types of "omic" data. The book also discusses design factors, requirements and techniques for disease screening, diagnostic and prognostic applications. Readers are provided w

  4. Bioinformatic analysis of Entamoeba histolytica SINE1 elements

    Directory of Open Access Journals (Sweden)

    Butcher Sarah A

    2010-05-01

    Full Text Available Abstract Background Invasive amoebiasis, caused by infection with the human parasite Entamoeba histolytica, remains a major cause of morbidity and mortality in some less-developed countries. Genetically E. histolytica exhibits a number of unusual features including having approximately 20% of its genome comprised of repetitive elements. These include a number of families of SINEs - non-autonomous elements which can, however, move with the help of partner LINEs. In many eukaryotes SINE mobility has had a profound effect on gene expression; in this study we concentrated on one such element - EhSINE1, looking in particular for evidence of recent transposition. Results EhSINE1s were detected in the newly reassembled E. histolytica genome by searching with a Hidden Markov Model developed to encapsulate the key features of this element; 393 were detected. Examination of their sequences revealed that some had an internal structure showing one to four 26-27 nt repeats. Members of the different classes differ in a number of ways and in particular those with two internal repeats show the properties expected of fairly recently transposed SINEs - they are the most homogeneous in length and sequence, they have the longest (i.e. the least decayed) target site duplications and are the most likely to show evidence (in a cDNA library) of active transcription. Furthermore we were able to identify 15 EhSINE1s (6 pairs and one triplet) which appeared to be identical or very nearly so but inserted into different sites in the genome; these provide good evidence that if mobility has now ceased it has only done so very recently. Conclusions Of the many families of repetitive elements present in the genome of E. histolytica we have examined in detail just one - EhSINE1. We have shown that there is evidence for waves of transposition at different points in the past and no evidence that mobility has entirely ceased. There are many aspects of the biology of this parasite which

  5. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology.

    Science.gov (United States)

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-12-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  6. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology

    Directory of Open Access Journals (Sweden)

    Jonghoon Kang

    2015-08-01

    Full Text Available New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  7. Secretome Analysis of Lipid-Induced Insulin Resistance in Skeletal Muscle Cells by a Combined Experimental and Bioinformatics Workflow.

    Science.gov (United States)

    Deshmukh, Atul S; Cox, Juergen; Jensen, Lars Juhl; Meissner, Felix; Mann, Matthias

    2015-11-01

    Skeletal muscle has emerged as an important secretory organ that produces so-called myokines, regulating energy metabolism via autocrine, paracrine, and endocrine actions; however, the nature and extent of the muscle secretome has not been fully elucidated. Mass spectrometry (MS)-based proteomics, in principle, allows an unbiased and comprehensive analysis of cellular secretomes; however, the distinction of bona fide secreted proteins from proteins released upon lysis of a small fraction of dying cells remains challenging. Here we applied highly sensitive MS and streamlined bioinformatics to analyze the secretome of lipid-induced insulin-resistant skeletal muscle cells. Our workflow identified 1073 putative secreted proteins including 32 growth factors, 25 cytokines, and 29 metalloproteinases. In addition to previously reported proteins, we report hundreds of novel ones. Intriguingly, ∼40% of the secreted proteins were regulated under insulin-resistant conditions, including a protein family with signal peptide and EGF-like domain structure that had not yet been associated with insulin resistance. Finally, we report that secretion of IGF and IGF-binding proteins was down-regulated under insulin-resistant conditions. Our study demonstrates an efficient combined experimental and bioinformatics workflow to identify putative secreted proteins from insulin-resistant skeletal muscle cells, which could easily be adapted to other cellular models.

  8. [Cloning and bioinformatic analysis and expression analysis of beta-glucuronidase in Scutellaria baicalensis].

    Science.gov (United States)

    Guo, Shuang-shuang; Cheng, Lin; Yang, Li-min; Han, Mei

    2015-11-01

    The β-glucuronidase gene (sbGUS) cDNA was cloned for the first time from Scutellaria baicalensis leaf by RT-PCR (GenBank accession number KR364726). The full-length cDNA of sbGUS was 1,584 bp with an open reading frame (ORF) encoding an unstable protein of 527 amino acids. Bioinformatic analysis showed that the encoded protein had an isoelectric point (pI) of 5.55 and a calculated molecular weight of about 58.7248 kDa, with a transmembrane region and a signal peptide, and contained conserved domains of the glycoside hydrolase superfamily and an unintegrated trans-glycosidase catalytic structure. In the secondary structure, the percentages of alpha helix, extended strand, β-extended and random coil were 25.62%, 28.84%, 13.28% and 32.26%, respectively. Homology analysis indicated 98.93% nucleotide sequence similarity and 98.29% amino acid sequence similarity with S. baicalensis (BAA97804.1), with differences at nine positions. Real-time PCR analysis showed that the expression level of sbGUS was highest in root, followed by flower and stem, and the lowest was in stem. The results provide a foundation for exploring the molecular function of sbGUS in baicalein biosynthesis through a synthetic biology approach in S. baicalensis plants. PMID:27097409

  9. Introduction to Bioinformatics

    OpenAIRE

    Thampi, Sabu M.

    2009-01-01

    Bioinformatics is a new discipline that addresses the need to manage and interpret the data that in the past decade was massively generated by genomic research. This discipline represents the convergence of genomics, biotechnology and information technology, and encompasses analysis and interpretation of data, modeling of biological phenomena, and development of algorithms and statistics. This article presents an introduction to bioinformatics

  10. Bio-informatics Research Progress in the Post-genome Era Based on the Quantitative Analysis of SCIE

    Institute of Scientific and Technical Information of China (English)

    Yongqin; ZHAN; Min; YU

    2013-01-01

    SCIE paper output can reflect the status quo and trend of discipline research and 7 038 scientific articles concerning bioinformatics are retrieved in SCIE database during the years between 2008 and 2012. Quantitative analysis of paper output and citation frequency are conducted according to nations, institutions, publications, research direction as well as hot articles, which provides assistance for bioinformatics researchers to understand the present situation of this subject, carry out cooperative studies and display scientific research achievements.

  11. Bioinformatics analysis of differentially expressed proteins in prostate cancer based on proteomics data.

    Science.gov (United States)

    Chen, Chen; Zhang, Li-Guo; Liu, Jian; Han, Hui; Chen, Ning; Yao, An-Liang; Kang, Shao-San; Gao, Wei-Xing; Shen, Hong; Zhang, Long-Jun; Li, Ya-Peng; Cao, Feng-Hong; Li, Zhi-Guo

    2016-01-01

    We mined the literature for proteomics data to examine the occurrence and metastasis of prostate cancer (PCa) through a bioinformatics analysis. We divided the differentially expressed proteins (DEPs) into two groups: the group consisting of PCa and benign tissues (P&b) and the group presenting both high and low PCa metastatic tendencies (H&L). In the P&b group, we found 320 DEPs, 20 of which were reported more than three times, and DES was the most commonly reported. Among these DEPs, the expression levels of FGG, GSN, SERPINC1, TPM1, and TUBB4B have not yet been correlated with PCa. In the H&L group, we identified 353 DEPs, 13 of which were reported more than three times. Among these DEPs, MDH2 and MYH9 have not yet been correlated with PCa metastasis. We further confirmed that DES was differentially expressed between 30 cancer and 30 benign tissues. In addition, DEPs associated with protein transport, regulation of actin cytoskeleton, and the extracellular matrix (ECM)-receptor interaction pathway were prevalent in the H&L group and have not yet been studied in detail in this context. Proteins related to homeostasis, the wound-healing response, focal adhesions, and the complement and coagulation pathways were overrepresented in both groups. Our findings suggest that the repeatedly reported DEPs in the two groups may function as potential biomarkers for detecting PCa and predicting its aggressiveness. Furthermore, the implicated biological processes and signaling pathways may help elucidate the molecular mechanisms of PCa carcinogenesis and metastasis and provide new targets for clinical treatment. PMID:27051295

  13. Bioinformatics Identification of Modules of Transcription Factor Binding Sites in Alzheimer's Disease-Related Genes by In Silico Promoter Analysis and Microarrays

    Directory of Open Access Journals (Sweden)

    Regina Augustin

    2011-01-01

    The molecular mechanisms and genetic risk factors underlying Alzheimer's disease (AD) pathogenesis are only partly understood. To identify new factors, which may contribute to AD, different approaches are taken including proteomics, genetics, and functional genomics. Here, we used a bioinformatics approach and found that distinct AD-related genes share modules of transcription factor binding sites, suggesting a transcriptional coregulation. To detect additional coregulated genes, which may potentially contribute to AD, we established a new bioinformatics workflow with known multivariate methods like support vector machines, biclustering, and predicted transcription factor binding site modules by using in silico analysis and over 400 expression arrays from human and mouse. Two significant modules are composed of three transcription factor families: CTCF, SP1F, and EGRF/ZBPF, which are conserved between human and mouse APP promoter sequences. The specific combination of in silico promoter and multivariate analysis can identify regulation mechanisms of genes involved in multifactorial diseases.

  14. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  15. Expression and Bioinformatics Analysis of SPACA4 in Human and Mice

    Institute of Scientific and Technical Information of China (English)

    Ai-fa TANG; Zhen-dong YU; Yao-ting GUI; Xin GUO; Xian-xin LI; Wei-xiang LIU; Hui ZHU; Zhi-ming CAI

    2008-01-01

    Objective: To analyze the expression of SPACA4 in human and mice. Methods: Testis cRNA samples from Balb/c mice of different postnatal days were hybridized to a mouse Affymetrix chip to screen the expression of SPACA4 in mice. Sub-quantitative RT-PCR and bioinformatic tools were used to describe the expression profile of SPACA4 in mice and human. Results: Gene chip analysis indicated that expression of mSPACA4 in mouse testis began after postnatal d 35. Sub-quantitative RT-PCR showed that the SPACA4 gene was expressed exclusively in mouse and human testis, and that mouse mSPACA4 was expressed after postnatal d 35, which was consistent with the gene chip results. Bioinformatics analysis predicted that mSPACA4 is located in the cell membrane (34.8%) or plasma membrane (34.8%), with a signal peptide cleavage site between amino acids 19 and 20 and transmembrane regions at amino acids 2-20 and 101-126. Conclusion: mSPACA4 and hSPACA4 are testis-specific genes, and the expression of mSPACA4 begins after postnatal d 35 in mouse testis. SPACA4 is a candidate target for a sperm-based contraceptive vaccine.

  16. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    Science.gov (United States)

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  17. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
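
    The third module described above (finding sequences that contain microsatellites matching user-specified parameters) can be illustrated with a minimal Python sketch. This is not SSR_pipeline's actual algorithm or interface: the function name `find_microsatellites`, its parameters, and the regular-expression approach are assumptions made purely for illustration, and it only detects perfect (simple) repeats.

```python
import re

def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper for illustration; not part of SSR_pipeline.
    """
    # A lazily captured motif followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as "AAAAAAAA" reported with motif "AA".
        if len(set(motif)) == 1:
            continue
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits

# Made-up read containing an (AC)6 and a (GAT)5 repeat.
read = "GGTT" + "AC" * 6 + "GGCCAA" + "GAT" * 5 + "CC"
print(find_microsatellites(read))
```

    Because the motif group is lazy, the shortest repeating unit is reported, so a run of ACACACAC is described by the motif AC rather than ACAC. Compound microsatellites, which the abstract says SSR_pipeline also handles, would need additional logic that this sketch omits.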

  18. BioGPS descriptors for rational engineering of enzyme promiscuity and structure based bioinformatic analysis.

    Directory of Open Access Journals (Sweden)

    Valerio Ferrario

    A new bioinformatic methodology was developed founded on the Unsupervised Pattern Cognition Analysis of GRID-based BioGPS descriptors (Global Positioning System in Biological Space). The procedure relies entirely on three-dimensional structure analysis of enzymes and does not stem from sequence or structure alignment. The BioGPS descriptors account for chemical, geometrical and physical-chemical features of enzymes and are able to describe comprehensively the active site of enzymes in terms of "pre-organized environment" able to stabilize the transition state of a given reaction. The efficiency of this new bioinformatic strategy was demonstrated by the consistent clustering of four different Ser hydrolases classes, which are characterized by the same active site organization but able to catalyze different reactions. The method was validated by considering, as a case study, the engineering of amidase activity into the scaffold of a lipase. The BioGPS tool predicted correctly the properties of lipase variants, as demonstrated by the projection of mutants inside the BioGPS "roadmap".

  19. SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis.

    Science.gov (United States)

    Nasir, Waqas; Toledo, Alejandro Gomez; Noborn, Fredrik; Nilsson, Jonas; Wang, Mingxun; Bandeira, Nuno; Larson, Göran

    2016-08-01

    Glycoproteomics has rapidly become an independent analytical platform bridging the fields of glycomics and proteomics to address site-specific protein glycosylation and its impact in biology. Current glycopeptide characterization relies on time-consuming manual interpretations and demands high levels of personal expertise. Efficient data interpretation constitutes one of the major challenges to be overcome before true high-throughput glycopeptide analysis can be achieved. The development of new glyco-related bioinformatics tools is thus of crucial importance to fulfill this goal. Here we present SweetNET: a data-oriented bioinformatics workflow for efficient analysis of hundreds of thousands of glycopeptide MS/MS-spectra. We have analyzed MS data sets from two separate glycopeptide enrichment protocols targeting sialylated glycopeptides and chondroitin sulfate linkage region glycopeptides, respectively. Molecular networking was performed to organize the glycopeptide MS/MS data based on spectral similarities. The combination of spectral clustering, oxonium ion intensity profiles, and precursor ion m/z shift distributions provided typical signatures for the initial assignment of different N-, O- and CS-glycopeptide classes and their respective glycoforms. These signatures were further used to guide database searches leading to the identification and validation of a large number of glycopeptide variants including novel deoxyhexose (fucose) modifications in the linkage region of chondroitin sulfate proteoglycans. PMID:27399812

  20. Bioinformatics Analysis for Coding SNPs of the HLA-DQA1 Gene Involved in Susceptibility to Cervical Cancer

    Institute of Scientific and Technical Information of China (English)

    Yanyun Li; Jun Xing; Linsheng Zhao; Yanni Li; Yuchuan Wang; Weiming Zhang

    2006-01-01

    OBJECTIVE To analyze coding SNPs of the HLA-DQA1 gene involved in susceptibility to cervical cancer by a bioinformatics approach, and to choose some SNPs that may have an association with cervical cancer. METHODS Using the SNPper tool, we extracted SNPs from a public database (dbSNP), exporting them in FASTA format suitable for subsequent use. Then we used PARSESNP as a tool for the analysis of the cSNPs. RESULTS Among the cSNPs of the HLA-DQA1 gene, we found that rs9272693 and rs9272703 are missense mutations, which convert a codon for one amino acid into a codon for a different amino acid. We chose a PSSM Difference >10 as the lower threshold for the scores of changes predicted to be deleterious. CONCLUSION We used a bioinformatics approach for cSNP analysis of the HLA-DQA1 gene. This method can select the variants in a conserved region and give a PSSM Difference score. But the results need to be verified in cervical cancer patients and a control population.
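
    The notion of a missense change used above (a codon for one amino acid converted into a codon for a different amino acid) can be made concrete with a small Python sketch based on the standard genetic code. It is only an illustration under stated assumptions: the function `classify_coding_snp` is hypothetical, it is not how PARSESNP works, and the example codons are arbitrary rather than the actual contexts of rs9272693 or rs9272703.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base change in a codon (position is 0, 1 or 2).

    Hypothetical helper for illustration only.
    """
    ref_codon = codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"   # same amino acid encoded
    elif alt_aa == "*":
        kind = "nonsense"     # premature stop codon introduced
    elif ref_aa == "*":
        kind = "stop_lost"    # stop codon replaced by an amino acid
    else:
        kind = "missense"     # different amino acid encoded
    return kind, ref_aa, alt_aa

# Arbitrary example: GAA (Glu) -> AAA (Lys) is a missense change.
print(classify_coding_snp("GAA", 0, "A"))
```

    A real cSNP workflow would also need the reading frame and strand of the transcript to locate the affected codon, which this sketch takes as given.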

  2. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Bioinformatics has recently become a field of science in its own right, indispensable for analyzing the millions of nucleic acid sequences now deposited in international databases (public or private). These databases contain information on genes, RNA, ORFs, proteins, and intergenic regions, including the entire genomes of some species. Analyzing this information requires computer programs, which have been updated with new mathematical methods and with artificial intelligence, together with the continual creation of supercomputing units built to withstand the heavy workload of sequence analysis. However, innovation is still needed in platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.

  3. Predicting the Nuclear Localization Signals of 107 Types of HPV L1 Proteins by Bioinformatic Analysis

    Institute of Scientific and Technical Information of China (English)

    Jun Yang; Yi-Li Wang; Lü-Sheng Si

    2006-01-01

    In this study, 107 types of human papillomavirus (HPV) L1 protein sequences were obtained from available databases, and the nuclear localization signals (NLSs) of these HPV L1 proteins were analyzed and predicted by bioinformatic analysis. Out of the 107 types, the NLSs of 39 types were predicted by PredictNLS software (35 types of bipartite NLSs and 4 types of monopartite NLSs). The NLSs of the remaining HPV types were predicted according to the characteristics and the homology of the already predicted NLSs as well as the general rule of NLSs. According to the result, the NLSs of 107 types of HPV L1 proteins were classified into 15 categories. The different types of HPV L1 proteins in the same NLS category could share similar or identical nucleocytoplasmic transport pathways. They might be used as the same target to prevent and treat different types of HPV infection. The results also showed that bioinformatic technology could be used to analyze and predict NLSs of proteins.
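
    The distinction between monopartite and bipartite NLSs can be sketched with simple motif scans in Python. The patterns below are the commonly cited classical consensus forms (K-K/R-X-K/R for monopartite signals; two basic residues, a 10 to 12 residue spacer, then at least three basic residues out of five for bipartite signals). They are assumptions chosen for illustration and are not the rules used by PredictNLS or by the study; the function names are hypothetical, and the test sequences are the well-known SV40 large T antigen and nucleoplasmin signals, not HPV L1 sequences.

```python
import re

# Classical monopartite consensus: K, then K or R, any residue, then K or R.
MONOPARTITE = re.compile(r"K[KR].[KR]")
BASIC = "KR"

def find_monopartite(seq):
    """Return (start, matched_text) for non-overlapping monopartite matches."""
    return [(m.start(), m.group(0)) for m in MONOPARTITE.finditer(seq.upper())]

def find_bipartite(seq, spacers=(10, 11, 12)):
    """Return (start, spacer_length) for candidate bipartite signals.

    Looks for two basic residues, a spacer, then a window of five residues
    of which at least three are basic.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 1):
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            for spacer in spacers:
                tail = seq[i + 2 + spacer : i + 2 + spacer + 5]
                if len(tail) == 5 and sum(r in BASIC for r in tail) >= 3:
                    hits.append((i, spacer))
                    break  # report the shortest qualifying spacer only
    return hits

print(find_monopartite("PKKKRKV"))                   # SV40 large T antigen NLS
print(find_bipartite("MSKRPAATKKAGQAKKKKLDGS"))      # contains nucleoplasmin NLS
```

    Simple consensus scans like this produce false positives in any lysine- or arginine-rich region, which is one reason dedicated predictors and homology-based reasoning, as described in the abstract, are used in practice.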

  4. Bioinformatic Analysis of Differential Protein Expression in Calu-3 Cells Exposed to Carbon Nanotubes

    Directory of Open Access Journals (Sweden)

    Pin Li

    2013-10-01

    Carbon nanomaterials are widely produced and used in industry, medicine and scientific research. To examine the impact of exposure to nanoparticles on human health, the human airway epithelial cell line, Calu-3, was used to evaluate changes in the cellular proteome that could account for alterations in cellular function of airway epithelia after 24 h exposure to 10 μg/mL and 100 ng/mL of two common carbon nanoparticles, single- and multi-wall carbon nanotubes (SWCNT, MWCNT). After exposure to the nanoparticles, label-free quantitative mass spectrometry (LFQMS) was used to study the differential protein expression. Ingenuity Pathway Analysis (IPA) was used to conduct a bioinformatic analysis of proteins identified in LFQMS. Interestingly, after exposure to a high concentration (10 μg/mL; 0.4 μg/cm2) of MWCNT or SWCNT, only 8 and 13 proteins, respectively, exhibited changes in abundance. In contrast, the abundance of hundreds of proteins was altered in response to a low concentration (100 ng/mL; 4 ng/cm2) of either CNT. Of the 281 and 282 proteins that were significantly altered in response to MWCNT or SWCNT respectively, 231 proteins were the same. Bioinformatic analyses found that the proteins in common to both nanotubes occurred within the cellular functions of cell death and survival, cell-to-cell signaling and interaction, cellular assembly and organization, cellular growth and proliferation, infectious disease, molecular transport and protein synthesis. The majority of the protein changes represent a decrease in amount, suggesting a general stress response to protect cells. The STRING database was used to analyze the various functional protein networks. Interestingly, some proteins like cadherin 1 (CDH1), signal transducer and activator of transcription 1 (STAT1), junction plakoglobin (JUP), and apoptosis-associated speck-like protein containing a CARD (PYCARD), appear in several functional categories and tend to be in the center of the networks. This

  5. Identification of new serum markers of pathological states by bioinformatic tools for the analysis of serum proteomics expression profiles

    International Nuclear Information System (INIS)

    We have developed new bioinformatic tools and strategies, aimed at the identification and characterization of proteins as markers of pathological states, for the analysis of data derived from protein expression profiles obtained by mass spectrometry techniques, for the study of structural and functional properties of the proteins, and for the analysis of data from omics approaches.

  6. Identification of Differentially Expressed Genes in Kawasaki Disease Patients as Potential Biomarkers for IVIG Sensitivity by Bioinformatics Analysis.

    Science.gov (United States)

    He, Lan; Sheng, Youyu; Huang, Chunyun; Huang, Guoying

    2016-08-01

    Kawasaki disease (KD) is a leading cause of acquired heart disease predominantly affecting infants and young children. Intravenous immunoglobulin (IVIG) is the most favorable treatment for KD, but IVIG resistance still occurs. Although several clinical scoring systems have been developed to identify children at highest risk of IVIG resistance, there is a need to identify sufficiently sensitive biomarkers for IVIG treatment. Some differentially expressed genes (DEGs) could be promising potential biomarkers for diagnosing IVIG-related sensitivity. We employed a systematic and integrative bioinformatics framework to identify such genes. The performance of the candidate genes was evaluated by hierarchical clustering, ROC analysis and literature mining. By analyzing three datasets of KD patients, 34 DEGs of the three groups were found to be associated with IVIG-related sensitivity. A module of 12 genes could predict resistant group patients with high accuracy, and a module of ten genes could predict responsive group patients effectively with an accuracy of 96%. Three of them are most likely to serve as drug targets or diagnostic biomarkers in the future. Compared with unsupervised hierarchical clustering analysis, our modules could distinguish IVIG-resistant patients efficiently. Two groups of DEGs could predict IVIG-related sensitivity with high accuracy and are potential biomarkers for the clinical diagnosis and prediction of IVIG treatment response in KD patients, improving the prognosis of patients.
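
    The ROC analysis mentioned above is usually summarized by the area under the ROC curve (AUC). A minimal Python sketch follows, assuming each patient has a single numeric score derived from a gene module and a binary label (1 for IVIG-resistant, 0 for responsive). The function `roc_auc` and the example values are hypothetical and do not reproduce the study's data or scoring method; the AUC is computed through its equivalence with the Mann-Whitney statistic.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as one half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Made-up module scores for four patients (1 = resistant, 0 = responsive).
example_scores = [0.9, 0.4, 0.8, 0.3]
example_labels = [1, 1, 0, 0]
print(roc_auc(example_scores, example_labels))
```

    The pairwise form is quadratic in the number of samples, which is acceptable for cohort sizes typical of expression datasets; a rank-based formulation would be preferable for large inputs.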

  7. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    Science.gov (United States)

    Ju, Feng; Zhang, Tong

    2015-11-01

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation.

  8. Bioinformatic analysis of human nuclear receptor nr5a2 (hb1f) genomic sequence

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    We previously cloned the cDNA of the human nuclear receptor nr5a2 (hb1f) gene and obtained its whole genomic sequence. In this work we carried out an in-depth bioinformatic analysis of the genomic sequence of the nr5a2 (hb1f) gene. Sequence comparison and prediction algorithms suggested that there might be additional coding regions in the 210 kb genomic sequence besides the known exons, especially in the two largest introns. Comparison of the structures of nr5a loci in different species revealed distinguishable conservation and apparent gene duplication during evolution. The remarkable conservation among promoters of zebrafish, mouse and human nr5a2 genes suggested that they would be regulated by the same transcription factors.

  9. Bioinformatics analysis of breast cancer bone metastasis related gene-CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze breast cancer bone metastasis related gene-CXCR4. Methods: This research screened breast cancer bone metastasis related genes by high-throughput gene chip. Results: The expression of 396 genes was found to differ, including 165 up-regulated and 231 down-regulated genes. The expression of chemokine receptor CXCR4 was obviously up-regulated in the tissue with breast cancer bone metastasis. Compared with the tissue without bone metastasis, the difference was significant, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a certain basis for the occurrence and diagnosis of breast cancer bone metastasis, target gene therapy and evaluation of prognosis.

  10. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    Science.gov (United States)

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of Panax notoginseng saponins (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between the blank group and the model group, and between the model group and the XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and analyzed in depth. Results: Two series including 22 samples were obtained. 870 DEGs were identified between the blank group and the model group, and 1189 DEGs were identified between the model group and the XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353
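
    The step of screening overlapping genes from two DEG sets can be sketched in Python as a set intersection. The sketch assumes each comparison is available as a mapping from gene symbol to log fold change, and it additionally labels whether the treatment comparison moves a gene in the opposite direction to the model comparison; that labeling is an illustrative assumption, not something the abstract states. The function name and the placeholder gene names and values are hypothetical.

```python
def overlapping_degs(model_vs_blank, xst_vs_model):
    """Return genes present in both DEG sets, labeled by direction of change.

    Both arguments map gene symbol -> log fold change (hypothetical input format).
    """
    shared = sorted(set(model_vs_blank) & set(xst_vs_model))
    result = {}
    for gene in shared:
        # Opposite signs mean the second comparison counteracts the first.
        opposite = model_vs_blank[gene] * xst_vs_model[gene] < 0
        result[gene] = "reversed" if opposite else "same_direction"
    return result

# Placeholder data only; these are not genes or values from the study.
model_vs_blank = {"GENE_A": 1.8, "GENE_B": -2.1, "GENE_C": 1.2}
xst_vs_model = {"GENE_A": -1.5, "GENE_C": 0.9, "GENE_D": 2.0}
print(overlapping_degs(model_vs_blank, xst_vs_model))
```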

  11. Identification of novel highly expressed genes in pancreatic ductal adenocarcinomas through a bioinformatics analysis of expressed sequence tags.

    Science.gov (United States)

    Cao, Dengfeng; Hustinx, Steven R; Sui, Guoping; Bala, P; Sato, Norihiro; Martin, Sean; Maitra, Anirban; Murphy, Kathleen M; Cameron, John L; Yeo, Charles J; Kern, Scott E; Goggins, Michael; Pandey, Akhilesh; Hruban, Ralph H

    2004-11-01

    In most microarray experiments, a significant fraction of the differentially expressed mRNAs identified correspond to expressed sequence tags (ESTs) and are generally discarded from further analyses. We used careful bioinformatics analyses to characterize those ESTs that were found to be highly overexpressed in a series of pancreatic adenocarcinomas. cDNA was prepared from 60 non-neoplastic samples (normal pancreas [n = 20], normal colon [n = 10], or normal duodenal mucosa [n = 30]) and from 64 pancreatic cancers (resected cancers [n = 50] or cancer cell lines [n = 14]) and hybridized to the complete Affymetrix Human Genome U133 GeneChip® set (arrays U133A and B) for simultaneous analysis of 45,000 fragments corresponding to 33,000 known genes and 6,000 ESTs. The GeneExpress® software system Fold Change Analysis Tool was used and 60 ESTs were identified that were expressed at levels at least 3-fold greater in the pancreatic cancers as compared to normal tissues. Searches against the human genomic sequence and comparative genomic analysis of human and mouse genomes were carried out using basic local alignment search tools (BLAST), BLASTN, and BLASTX, for identifying protein coding genes corresponding to the ESTs. Subsequently, in order to pick the most relevant candidate genes for a more detailed analysis, we looked for domains/motifs in the open reading frames using SMART and Pfam programs. We were able to definitively map 43 of the 60 ESTs to known or novel genes, and 15 of the ESTs could be localized in close proximity to a gene in the human genome although we were unable to establish that the EST was indeed derived from those genes. The differential expression of a subset of genes was confirmed at the protein level by immunohistochemical labeling of tissue microarrays (inhibin beta A [INHBA] and CD29) and/or at the transcript level by RT-PCR (INHBA, AKAP12, ELK3, FOXQ1, EIF5A2, and EFNA5). We conclude that bioinformatics tools can be used to characterize

  12. Flow cytometry bioinformatics.

    Directory of Open Access Journals (Sweden)

    Kieran O'Neill

    Full Text Available Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods.
Open standards, data

  13. Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer.

    Directory of Open Access Journals (Sweden)

    İrem Çelen

    Full Text Available Given the wealth of bioinformatics resources and the growing complexity of biological information, it is valuable to integrate data from disparate sources to gain insight into the role of genes/proteins in health and disease. We have developed a bioinformatics framework that combines literature mining with information from biomedical ontologies and curated databases to create knowledge "maps" of genes/proteins of interest. We applied this approach to the study of beta-catenin, a cell adhesion molecule and transcriptional regulator implicated in cancer. The knowledge map includes post-translational modifications (PTMs), protein-protein interactions, disease-associated mutations, and transcription factors co-activated by beta-catenin and their targets and captures the major processes in which beta-catenin is known to participate. Using the map, we generated testable hypotheses about beta-catenin biology in normal and cancer cells. By focusing on proteins participating in multiple relation types, we identified proteins that may participate in feedback loops regulating beta-catenin transcriptional activity. By combining multiple network relations with PTM proteoform-specific functional information, we proposed a mechanism to explain the observation that the cyclin dependent kinase CDK5 positively regulates beta-catenin co-activator activity. Finally, by overlaying cancer-associated mutation data with sequence features, we observed mutation patterns in several beta-catenin PTM sites and PTM enzyme binding sites that varied by tissue type, suggesting multiple mechanisms by which beta-catenin mutations can contribute to cancer. The approach described, which captures rich information for molecular species from genes and proteins to PTM proteoforms, is extensible to other proteins and their involvement in disease.

  14. Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer.

    Science.gov (United States)

    Çelen, İrem; Ross, Karen E; Arighi, Cecilia N; Wu, Cathy H

    2015-01-01

    Given the wealth of bioinformatics resources and the growing complexity of biological information, it is valuable to integrate data from disparate sources to gain insight into the role of genes/proteins in health and disease. We have developed a bioinformatics framework that combines literature mining with information from biomedical ontologies and curated databases to create knowledge "maps" of genes/proteins of interest. We applied this approach to the study of beta-catenin, a cell adhesion molecule and transcriptional regulator implicated in cancer. The knowledge map includes post-translational modifications (PTMs), protein-protein interactions, disease-associated mutations, and transcription factors co-activated by beta-catenin and their targets and captures the major processes in which beta-catenin is known to participate. Using the map, we generated testable hypotheses about beta-catenin biology in normal and cancer cells. By focusing on proteins participating in multiple relation types, we identified proteins that may participate in feedback loops regulating beta-catenin transcriptional activity. By combining multiple network relations with PTM proteoform-specific functional information, we proposed a mechanism to explain the observation that the cyclin dependent kinase CDK5 positively regulates beta-catenin co-activator activity. Finally, by overlaying cancer-associated mutation data with sequence features, we observed mutation patterns in several beta-catenin PTM sites and PTM enzyme binding sites that varied by tissue type, suggesting multiple mechanisms by which beta-catenin mutations can contribute to cancer. The approach described, which captures rich information for molecular species from genes and proteins to PTM proteoforms, is extensible to other proteins and their involvement in disease. PMID:26509276
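
    The step of focusing on proteins that participate in multiple relation types can be illustrated with a minimal Python sketch. This is not the authors' framework: the function name, the relation-type labels, and the protein names (PROT_A and so on) are hypothetical placeholders, and the records are toy data rather than results from the study.

```python
from collections import defaultdict


def proteins_in_multiple_relations(relations, min_types=2):
    """Map each protein to its relation types, keeping those seen in >= min_types types.

    `relations` is an iterable of (relation_type, protein) pairs describing how
    a protein is linked to the protein of interest.
    """
    seen = defaultdict(set)
    for relation_type, protein in relations:
        seen[protein].add(relation_type)
    return {
        protein: sorted(types)
        for protein, types in seen.items()
        if len(types) >= min_types
    }


# Hypothetical toy records (not data from the study).
toy_relations = [
    ("ppi", "PROT_A"),
    ("ptm_enzyme", "PROT_A"),
    ("transcription_target", "PROT_A"),
    ("ppi", "PROT_B"),
    ("ptm_enzyme", "PROT_C"),
    ("ppi", "PROT_C"),
]

result = proteins_in_multiple_relations(toy_relations)
```

    In this toy input, PROT_A and PROT_C are retained because each appears under at least two relation types, while PROT_B is dropped. Whether such a protein actually takes part in a feedback loop would still need the directional information that the abstract attributes to the knowledge map.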

  15. Nano-LC-ESI MS/MS analysis of proteins in dried sea dragon Solenognathus hardwickii and bioinformatic analysis of its protein expression profiling.

    Science.gov (United States)

    Zhang, Dong-Mei; Feng, Li-Xing; Li, Lu; Liu, Miao; Jiang, Bao-Hong; Yang, Min; Li, Guo-Qiang; Wu, Wan-Ying; Guo, De-An; Liu, Xuan

    2016-09-01

    The sea dragon Solenognathus hardwickii has long been used in traditional Chinese medicine for the treatment of various diseases, such as male impotence. To gain comprehensive insight into the protein components of the sea dragon, a shotgun proteomic analysis of its protein expression profile was conducted in the present study. Proteins were extracted from dried sea dragon using a trichloroacetic acid/acetone precipitation method and then separated by SDS-PAGE. The protein bands were cut from the gel and digested with trypsin to generate a peptide mixture. The peptide fragments were then analyzed using nano liquid chromatography tandem mass spectrometry (nano-LC-ESI MS/MS). In total, 810 proteins and 1,577 peptides were identified in the dried sea dragon. The identified proteins exhibited molecular weights ranging from 1,900 to 3,516,900 Da and pI values from 3.8 to 12.18. Bioinformatic analysis was conducted using the DAVID Bioinformatics Resources 6.7 Gene Ontology (GO) analysis tool to explore possible functions of the identified proteins. Ascribed functions of the proteins mainly included intracellular non-membrane-bound organelle, non-membrane-bounded organelle, cytoskeleton, structural molecule activity, calcium ion binding, etc. Furthermore, possible signal networks of the identified proteins were predicted using the STRING (Search Tool for the Retrieval of Interacting Genes) database. Ribosomal protein synthesis was found to play an important role in the signal network. The results of this study are, to the best of our knowledge, the first to provide a reference proteome profile for the sea dragon, and will aid in understanding the expression and functions of the identified proteins.

  16. Bioinformatics analysis suggests base modifications of tRNAs and miRNAs in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Jin Hailing

    2009-04-01

    Full Text Available Abstract Background Modifications of RNA bases have been found in some mRNAs and non-coding RNAs including rRNAs, tRNAs, and snRNAs, where modified bases are important for RNA function. Little is known about RNA base modifications in Arabidopsis thaliana. Results In the current work, we carried out a bioinformatics analysis of RNA base modifications in tRNAs and miRNAs using large numbers of cDNA sequences of small RNAs (sRNAs) generated with the 454 technology and the massively parallel signature sequencing (MPSS) method. We looked for sRNAs that map to the genome sequence with one-base mismatch (OMM), which indicate candidate modified nucleotides. We obtained 1,187 sites with possible RNA base modifications supported by both 454 and MPSS sequences. Seven hundred and three of these sites were within tRNA loci. Nucleotide substitutions were frequently located in the T arm (substitutions from A to U or G), upstream of the D arm (from G to C, U, or A), and downstream of the D arm (from G to U). The positions of major substitution sites corresponded with the following known RNA base modifications in tRNAs: N1-methyladenosine (m1A), N2-methylguanosine (m2G), and N2-N2-methylguanosine (m22G). Conclusion These results indicate that our bioinformatics method successfully detected modified nucleotides in tRNAs. Using this method, we also found 147 substitution sites in miRNA loci. As with tRNAs, substitutions from A to U or G and from G to C, U, or A were common, suggesting that base modifications might be similar in tRNAs and miRNAs. We suggest that miRNAs contain modified bases and such modifications might be important for miRNA maturation and/or function.
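
    The idea of flagging reads that map with exactly one mismatch can be sketched in a few lines of Python. This is only a naive brute-force illustration under stated assumptions: the function name and the toy sequences are hypothetical, and the study's actual mapping of 454 and MPSS reads to the Arabidopsis genome is not described in the abstract and is not reproduced here.

```python
def one_mismatch_sites(genome, read):
    """Return (genome_position, genome_base, read_base) for every ungapped
    placement of `read` on `genome` that differs at exactly one position.

    Positions are 0-based. Exact matches and placements with two or more
    mismatches are ignored.
    """
    hits = []
    read_len = len(read)
    for start in range(len(genome) - read_len + 1):
        mismatches = [
            (start + i, genome[start + i], read[i])
            for i in range(read_len)
            if genome[start + i] != read[i]
        ]
        if len(mismatches) == 1:
            hits.append(mismatches[0])
    return hits


# Hypothetical toy sequences, chosen only to exercise the function.
toy_genome = "ACGTACGTTGCA"
toy_read = "ACGA"
sites = one_mismatch_sites(toy_genome, toy_read)
```

    Here the read aligns at offsets 0 and 4 with a single T-to-A difference each time, so two candidate sites are reported. A real analysis would additionally have to separate candidate modifications from sequencing errors and polymorphisms, for example by requiring support from independent platforms as the abstract describes.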

  17. Small envelope protein E of SARS: cloning, expression, purification, CD determination, and bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    SHEN Xu; XUE Jian-Hua; YU Chang-Ying; LUO Hai-Bin; QIN Lei; YU Xiao-Jing; CHEN Jing; CHEN Li-Li; XIONG Bin; YUE Li-Duo; CAI Jian-Hua; SHEN Jian-Hua; LUO Xiao-Min; CHEN Kai-Xian; SHI Tie-Liu; LI Yi-Xue; HU Geng-Xi; JIANG Hua-Liang

    2003-01-01

    AIM: To obtain a pure sample of the SARS small envelope E protein (SARS E protein), study its properties, and analyze its possible functions. METHODS: The plasmid of SARS E protein was constructed by the polymerase chain reaction (PCR), and the protein was expressed in an E. coli strain. The secondary structure features of the protein were determined by circular dichroism (CD). The possible functions of this protein were annotated by bioinformatics methods, and its possible three-dimensional model was constructed by molecular modeling. RESULTS: A pure sample of SARS E protein was obtained. The secondary structure features derived from CD determination are similar to those from secondary structure prediction. Bioinformatics analysis indicated that the key residues of SARS E protein are highly conserved compared to the E proteins of other coronaviruses. In particular, the primary amino acid sequence of SARS E protein is much more similar to that of murine hepatitis virus (MHV) and other mammalian coronaviruses. The transmembrane (TM) segment of the SARS E protein is more conserved than other regions of the protein. CONCLUSION: The successful expression of the SARS E protein is a good starting point for investigating the structure and functions of this protein and of the SARS coronavirus itself. The SARS E protein may fold in aqueous solution in a similar way as it does in a membrane-water mixed environment. It is possible that β-sheet I of the SARS E protein interacts with the membrane surface via hydrogen bonding; this β-sheet may uncoil to a random structure in aqueous solution.

  18. Identification of probable genomic packaging signal sequence from SARS-CoV genome by bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    QIN Lei; XIONG Bin; LUO Cheng; GUO Zong-Ming; HAO Pei; SU Jiong; NAN Peng; FENG Ying; SHI Yi-Xiang; YU Xiao-Jing; LUO Xiao-Min; CHEN Kai-Xian; SHEN Xu; SHEN Jian-Hua; ZOU Jian-Ping; ZHAO Guo-Ping; SHI Tie-Liu; HE Wei-Zhong; ZHONG Yang; JIANG Hua-Liang; LI Yi-Xue

    2003-01-01

    AIM: To predict the probable genomic packaging signal of SARS-CoV by bioinformatics analysis. The derived packaging signal may be used to design antisense RNA and RNA interference (RNAi) drugs for treating SARS. METHODS: Based on studies of the genomic packaging signals of MHV and BCoV, especially the information about primary and secondary structures, the putative genomic packaging signal of SARS-CoV was analyzed using bioinformatic tools. Multiple alignment of the genomic sequences was performed among SARS-CoV, MHV, BCoV, PEDV and HCoV 229E. Secondary structures of RNA sequences were also predicted to identify the possible genomic packaging signals. Meanwhile, the N and M proteins of all five viruses were analyzed to study their evolutionary relationship with the genomic packaging signals. RESULTS: The putative genomic packaging signal of SARS-CoV is located at the 3′ end of ORF1b, near that of MHV and BCoV, in the most variable region of this gene. The RNA secondary structure of the SARS-CoV genomic packaging signal is very similar to that of MHV and BCoV. The same result was also obtained for the genomic packaging signals of PEDV and HCoV 229E. Furthermore, the genomic sequence multiple alignment indicated that the locations of the packaging signals of SARS-CoV, PEDV, and HCoV overlapped each other. It seems that the mutation rate of the packaging signal sequences is much higher than that of the N protein, while the M protein shows only subtle variations. CONCLUSIONS: The probable genomic packaging signal of SARS-CoV is analogous to that of MHV and BCoV, with the corresponding RNA secondary structure located in a similar region of ORF1b. The positions where genomic packaging signals exist have undergone rounds of mutation, which may consequently influence the primary structures of the N and M proteins.

  19. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  20. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine.

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-01-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM's diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients' target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ's cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the "multi-component, multi-target and multi-pathway" combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM's molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm. PMID:26879404

  2. Bioinformatic methods in protein characterization

    OpenAIRE

    Kallberg, Yvonne

    2002-01-01

    Bioinformatics is an emerging interdisciplinary research field in which mathematics, computer science and biology meet. In this thesis, bioinformatic methods for analysis of functional and structural properties among proteins will be presented. I have developed and applied bioinformatic methods on the enzyme superfamily of short-chain dehydrogenases/reductases (SDRs), coenzyme-binding enzymes of the Rossmann fold type, and amyloid-forming proteins and peptides. The basis...

  3. Prokaryotic Expression of Rice Ospgip1 Gene and Bioinformatic Analysis of Encoded Product

    Institute of Scientific and Technical Information of China (English)

    CHEN Xi-jun; LIU Xiao-wei; Zuo Si-min; MA Yu-yin; TONG Yun-hui; PAN Xue-biao; XU Jing-you

    2011-01-01

    Using the reference sequences of pgip genes in GenBank, a fragment of 930 bp covering the open reading frame (ORF) of rice Ospgip1 (Oryza sativa polygalacturonase-inhibiting protein 1) was amplified. The prokaryotic expression product of the gene inhibited the growth of Rhizoctonia solani, the causal agent of rice sheath blight, and reduced its polygalacturonase activity. Bioinformatic analysis showed that OsPGIP1 is a hydrophobic protein with a molecular weight of 32.8 kDa and an isoelectric point (pI) of 7.26. The protein is mainly located in the cell wall of rice, and its signal peptide cleavage site is located between the 17th and 18th amino acids. There are four cysteines in both the N- and C-termini of the deduced protein, which can form three disulfide bonds (between the 56th and 63rd, the 278th and 298th, and the 300th and 308th amino acids). The protein has a typical leucine-rich repeat (LRR) domain, and its secondary structure comprises α-helices, β-sheets and irregular coils. Compared with polygalacturonase-inhibiting proteins (PGIPs) from other plants, the 7th LRR is absent in OsPGIP1. The nine LRRs could form a cleft that might associate with proteins from pathogenic fungi, such as polygalacturonase.

  4. Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Ju-Suk Nam

    2014-03-01

    Full Text Available Wingless-type (Wnt) signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα). Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the family members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers.

  5. Bioinformatic analysis for structure and function of TCTP from Spirometra mansoni

    Institute of Scientific and Technical Information of China (English)

    Ya-Jun Lu; Gang Lu; Da-Zhong Shi; Li-Hua Li; Sai-Feng Zhong

    2013-01-01

    Objective: To predict the structure and function of translationally controlled tumor protein (TCTP) from Spirometra mansoni by bioinformatics technology, and to provide a theoretical basis for further study. Methods: The open reading frame (ORF) of an EST sequence from Spirometra mansoni was obtained with ORF finder and translated into an amino acid sequence with DNAclub. The structural domain was analyzed by Blast. The structure and function of the protein were predicted and analyzed using the online analysis tools and packages Protparam, InterProScan, protscale, SignalP-3.0, PSORTⅡ, BepiPred, TMHMM, VectorNTI Suite 9 and Phyre2. Results: The results showed that the EST sequence was Sm TCTP, with 173 amino acid residues and a theoretical molecular weight of 19,872.0 Da. The protein is evolutionarily closest to Clonorchis sinensis, Schistosoma mansoni, and Schistosoma japonicum. It had no signal peptide site or transmembrane domain. The secondary structure of TCTP contained two α-helices and eight β-strands. Conclusions: Sm TCTP is a protein with a variety of biological functions that may be used as a vaccine candidate molecule and drug target.

  6. Bioinformatics approaches for structural and functional analysis of proteins in secondary metabolism in Withania somnifera.

    Science.gov (United States)

    Sanchita; Singh, Swati; Sharma, Ashok

    2014-11-01

    Withania somnifera (Ashwagandha) is a rich storehouse of a large number of pharmacologically active secondary metabolites known as withanolides. These secondary metabolites are produced by the withanolide biosynthetic pathway. Very little information is available on the structural and functional aspects of the enzymes involved in the withanolide biosynthetic pathways of Withania somnifera. We therefore performed a bioinformatics analysis of the functional and structural properties of these important enzymes. The pathway enzymes taken for this study were 3-Hydroxy-3-methylglutaryl coenzyme A reductase, 1-Deoxy-D-xylulose-5-phosphate synthase, 1-Deoxy-D-xylulose-5-phosphate reductase, farnesyl pyrophosphate synthase, squalene synthase, squalene epoxidase, and cycloartenol synthase. The secondary structure was predicted to obtain basic structural information. Three-dimensional structures for these enzymes were predicted. Physico-chemical properties such as pI, AI, GRAVY and instability index were also studied. The current information will provide a platform for understanding the structural attributes responsible for the function of these proteins until experimental structures become available.
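
    Of the physico-chemical properties listed, GRAVY (grand average of hydropathy) is the simplest to compute by hand: it is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The Python sketch below illustrates that calculation only; the function name and the short test peptides are hypothetical, and the abstract does not state which tool the authors used to obtain their values.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy for a protein sequence.

    Raises ValueError for an empty sequence or for residues outside the
    20 standard one-letter codes.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Hypothetical short peptides used only to exercise the function.
hydrophobic_score = gravy("ILV")
mixed_score = gravy("AR")
```

    Positive values indicate an overall hydrophobic sequence and negative values an overall hydrophilic one, so the "ILV" peptide scores well above zero while "AR" falls below it.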

  7. Integration and bioinformatics analysis of DNA-methylated genes associated with drug resistance in ovarian cancer

    Science.gov (United States)

    YAN, BINGBING; YIN, FUQIANG; WANG, QI; ZHANG, WEI; LI, LI

    2016-01-01

    The main obstacle to the successful treatment of ovarian cancer is the development of drug resistance to combined chemotherapy. Among all the factors associated with drug resistance, DNA methylation apparently plays a critical role. In this study, we performed an integrative analysis of the 26 DNA-methylated genes associated with drug resistance in ovarian cancer, and the genes were further evaluated by comprehensive bioinformatics analysis including gene/protein interaction, biological process enrichment and annotation. The results from the protein interaction analyses revealed that at least 20 of these 26 methylated genes are present in the protein interaction network, indicating that they interact with each other, have a correlation in function, and may participate as a whole in the regulation of ovarian cancer drug resistance. There is a direct interaction between the phosphatase and tensin homolog (PTEN) gene and at least half of the other genes, indicating that PTEN may possess core regulatory functions among these genes. Biological process enrichment and annotation demonstrated that most of these methylated genes were significantly associated with apoptosis, which is possibly an essential way for these genes to be involved in the regulation of multidrug resistance in ovarian cancer. In addition, a comprehensive analysis of clinical factors revealed that the methylation level of genes that are associated with the regulation of drug resistance in ovarian cancer was significantly correlated with the prognosis of ovarian cancer. Overall, this study preliminarily explains the potential correlation between the genes with DNA methylation and drug resistance in ovarian cancer. This finding has significance for our understanding of the regulation of resistant ovarian cancer by methylated genes, the treatment of ovarian cancer, and improvement of the prognosis of ovarian cancer. PMID:27347118
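
    The observation that one gene interacts directly with at least half of the others amounts to a degree count on the interaction network. A minimal Python sketch of that count follows; it is an illustration rather than the authors' analysis, the function name is hypothetical, and apart from PTEN the gene labels and all edges are invented placeholders.

```python
from collections import defaultdict


def degree_by_gene(edges):
    """Count distinct interaction partners per gene in an undirected network.

    `edges` is an iterable of (gene_a, gene_b) pairs; self-interactions and
    repeated pairs are counted at most once.
    """
    neighbors = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a != gene_b:
            neighbors[gene_a].add(gene_b)
            neighbors[gene_b].add(gene_a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


# Hypothetical toy edges (not the interactions reported in the study).
toy_edges = [
    ("PTEN", "GENE_A"),
    ("PTEN", "GENE_B"),
    ("PTEN", "GENE_C"),
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
]

degrees = degree_by_gene(toy_edges)
hub = max(degrees, key=degrees.get)
```

    In this toy network PTEN has three partners and every other gene has at most two, so it is picked as the hub. A high degree alone suggests, but does not establish, a core regulatory role.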

  8. AbMiner: A bioinformatic resource on available monoclonal antibodies and corresponding gene identifiers for genomic, proteomic, and immunologic studies

    Directory of Open Access Journals (Sweden)

    Shankavaram Uma

    2006-04-01

    Full Text Available Abstract Background Monoclonal antibodies are used extensively throughout the biomedical sciences for detection of antigens, either in vitro or in vivo. We, for example, have used them for quantitation of proteins on "reverse-phase" protein lysate arrays. For those studies, we quality-controlled > 600 available monoclonal antibodies and also needed to develop precise information on the genes that encode their antigens. Translation among the various protein and gene identifier types proved non-trivial because of one-to-many and many-to-one relationships. To organize the antibody, protein, and gene information, we initially developed a relational database in Filemaker for our own use. When it became apparent that the information would be useful to many other researchers faced with the need to choose or characterize antibodies, we developed it further as AbMiner, a fully relational web-based database under MySQL, programmed in Java. Description AbMiner is a user-friendly, web-based relational database of information on > 600 commercially available antibodies that we validated by Western blot for protein microarray studies. It includes many types of information on the antibody, the immunogen, the vendor, the antigen, and the antigen's gene. Multiple gene and protein identifier types provide links to corresponding entries in a variety of other public databases, including resources for phosphorylation-specific antibodies. AbMiner also includes our quality-control data against a pool of 60 diverse cancer cell types (the NCI-60) and also protein expression levels for the NCI-60 cells measured using our high-density "reverse-phase" protein lysate microarrays for a selection of the listed antibodies. Some other available database resources give information on antibody specificity for one or a couple of cell types. In contrast, the data in AbMiner indicate specificity with respect to the antigens in a pool of 60 diverse cell types from nine different

  9. Bioinformatics data supporting revelatory diversity of cultivable thermophiles isolated and identified from two terrestrial hot springs, Unkeshwar, India

    Directory of Open Access Journals (Sweden)

    Bhagwan N. Rekadwad

    2016-06-01

    Full Text Available A total of 21 thermophilic bacteria were isolated and identified using 16S rRNA gene sequencing method. Sequences were submitted to NCBI website. Short DNA sequences JN392966–JN392972; KC120909–KC120919; KM998072–KM998074 and KP053645 strains were downloaded from NCBI BioSample database. ENDMEMO GC calculating tool was used for calculation of maximum, minimum and average GC percentage and graphical representation of GC content. Data generated indicate 20 short DNA sequences have maximum GC content ranged from 60% to 100% with an average GC content 52.5–59.8%. It is recorded that Bacillus sp. W7, Escherichia coli strain NW1 and Geobacillus thermoleovorans strain rekadwadsis strains showed GC content maximum up to 70%; Actinobacterium EF_NAK1-7 up to 85.7%, while Bacillus megaterium and E. coli strain NW2 showed GC content maximum to 100%. Digital data on thermophilic bacteria isolated from Unkeshwar hot springs would be useful for interpretation of presence of biodiversity in addition to phenotypic, physiological characteristics and data generated through 16S rRNA gene sequencing technology.

  10. Bioinformatics data supporting revelatory diversity of cultivable thermophiles isolated and identified from two terrestrial hot springs, Unkeshwar, India.

    Science.gov (United States)

    Rekadwad, Bhagwan N; Khobragade, Chandrahasya N

    2016-06-01

    A total of 21 thermophilic bacteria were isolated and identified using the 16S rRNA gene sequencing method. Sequences were submitted to the NCBI website. Short DNA sequences of strains JN392966-JN392972, KC120909-KC120919, KM998072-KM998074 and KP053645 were downloaded from the NCBI BioSample database. The ENDMEMO GC calculating tool was used to calculate the maximum, minimum and average GC percentage and to plot GC content. The data indicate that 20 short DNA sequences have a maximum GC content ranging from 60% to 100%, with an average GC content of 52.5-59.8%. Bacillus sp. W7, Escherichia coli strain NW1 and Geobacillus thermoleovorans strain rekadwadsis showed a maximum GC content of up to 70%; Actinobacterium EF_NAK1-7 up to 85.7%; and Bacillus megaterium and E. coli strain NW2 up to 100%. Digital data on thermophilic bacteria isolated from the Unkeshwar hot springs would be useful for interpreting biodiversity, in addition to phenotypic and physiological characteristics and data generated through 16S rRNA gene sequencing technology.

  11. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    Science.gov (United States)

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a major obstacle, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are generally lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi. PMID:27507959

  12. Analysis of metagenomics next generation sequence data for fungal ITS barcoding: Do you need advance bioinformatics experience?

    Directory of Open Access Journals (Sweden)

    Abdalla Osman Abdalla Ahmed

    2016-07-01

    Full Text Available During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a major obstacle, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are generally lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi.

  14. Bioinformatic analysis of pathogenic missense mutations of activin receptor like kinase 1 ectodomain.

    Directory of Open Access Journals (Sweden)

    Claudia Scotti

    Full Text Available Activin A receptor, type II-like kinase 1 (also called ALK1), is a serine-threonine kinase predominantly expressed on the surface of endothelial cells. Mutations in its encoding gene, ACVRL1 (12q11-14), cause type 2 Hereditary Haemorrhagic Telangiectasia (HHT2), an autosomal dominant multisystem vascular dysplasia. The study of the structural effects of mutations is crucial to understand their pathogenic mechanism. However, while an X-ray structure of the ALK1 intracellular domain has recently become available (PDB ID: 3MY0), structure determination of the ALK1 ectodomain (ALK1(EC)) has been elusive so far. We here describe the building of a homology model for ALK1(EC), followed by an extensive bioinformatic analysis, based on a set of 38 methods, of the effect of missense mutations at the sequence and structural level. The potential interaction mode of ALK1(EC) with its ligand BMP9 was then predicted by combining modelling and docking data. The calculated model of ALK1(EC) allowed mapping and a preliminary characterization of HHT2-associated mutations. Major structural changes and loss of stability of the protein were predicted for several mutations, while others were found to interfere mainly with binding to BMP9 or other interactors, like Endoglin (CD105), mutations of whose encoding gene, ENG (9q34), are known to cause type 1 HHT. This study gives a preliminary insight into the potential structure of ALK1(EC) and into the structural effects of HHT2-associated mutations, which can be useful to predict the potential effect of each single mutation, to devise new biological experiments and to interpret the biological significance of new mutations, private mutations, or non-synonymous polymorphisms.

  15. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.;

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  16. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  17. BIOINFORMATICS AND BIOSYNTHESIS ANALYSIS OF CELLULOSE SYNTHASE OPERON IN ZYMOMONAS MOBILIS ZM4

    Directory of Open Access Journals (Sweden)

    Sheik Abdul Kader Sheik Asraf, K. Narayanan Rajnish, and Paramasamy Gunasekaran

    2011-03-01

    confirmed by the Acetic-Nitric (Updegraff) cellulose assay. The bioinformatics and biosynthetic analyses confirm the biosynthesis of cellulose in Z. mobilis.

  18. Identification of Immunoreactive Leishmania infantum Protein Antigens to Asymptomatic Dog Sera through Combined Immunoproteomics and Bioinformatics Analysis

    Science.gov (United States)

    Samiotaki, Martina; Panayotou, George; Karagouni, Evdokia

    2016-01-01

    Leishmania infantum is the etiologic agent of zoonotic visceral leishmaniasis (VL) in countries in the Mediterranean basin, where dogs are the domestic reservoirs and represent important elements in the transmission of the disease. Since the major focal areas of human VL exhibit a high prevalence of seropositive dogs, the control of canine VL could reduce the infection rate in humans. Efforts toward this have focused on the improvement of diagnostic tools, as well as on vaccine development. The identification of parasite antigens including suitable major histocompatibility complex (MHC) class I- and/or II-restricted epitopes is very important since disease protection is characterized by strong and long-lasting CD8+ T and CD4+ Th1 cell-dominated immunity. In the present study, total protein extract from late-log phase L. infantum promastigotes was analyzed by two-dimensional western blots and probed with sera from asymptomatic and symptomatic dogs. A total of 42 protein spots were found to differentially react with IgG from asymptomatic dogs, while 17 of these identified by Coomassie stain were extracted and analyzed. Of these, 21 proteins were identified by mass spectrometry; they were mainly involved in metabolism and stress responses. An in silico analysis predicted that the chaperonin HSP60, dihydrolipoamide dehydrogenase, enolase, cyclophilin 2, cyclophilin 40, and one hypothetical protein contain promiscuous MHCI and/or MHCII epitopes. Our results suggest that the combination of immunoproteomics and bioinformatics analyses is a promising method for the identification of novel candidate antigens for vaccine development or with potential use in the development of sensitive diagnostic tests. PMID:26906226

  19. Identification and bioinformatics analysis of lactate dehydrogenase genes from Echinococcus granulosus

    Institute of Scientific and Technical Information of China (English)

    Gang Lu; Yajun Lu; Lihua Li; Lixian Wu; Zhigang Fan; Dazhong Shi; Hu Wang; Xiumin Han

    2010-01-01

    Objective: To identify the full-length cDNA sequence of lactate dehydrogenase (LDH) from adult Echinococcus granulosus (E. granulosus) and to predict the structure and function of its encoded protein using bioinformatics methods. Methods: With the help of NCBI, EMBI, Expasy and other online sites, the open reading frame (ORF), conserved domain, physical and chemical parameters, signal peptide, epitopes and topological structures of the protein sequence were predicted and a homology tertiary structure model was created; VectorNTI software was used for sequence alignment, phylogenetic tree construction and tertiary structure prediction. Results: The target sequence was 1,233 bp long, with a longest ORF of 996 bp encoding a 331-amino-acid protein with a typical L-LDH conserved domain. It was confirmed as the full-length cDNA of LDH from E. granulosus and named EgLDH (GenBank accession number: HM748917). The predicted molecular weight and isoelectric point of the deduced protein were 35,516.2 Da and 6.32, respectively. Compared with LDHs from Taenia solium, Taenia saginata asiatica, Spirometra erinaceieuropaei, Schistosoma japonicum, Clonorchis sinensis and human, it showed similarity of 86%, 85%, 55%, 58%, 58% and 53%, respectively. EgLDH contained 3 putative transmembrane regions and 4 major epitopes (54aa-59aa, 81aa-87aa, 97aa-102aa, 307aa-313aa); the latter were significantly different from the corresponding regions of human LDH. In addition, some NAD and substrate binding sites were located on epitopes 54aa-59aa and 97aa-102aa, respectively. Tertiary structure prediction showed that 3 key catalytic residues (105R, 165D and 192H) formed a catalytic center near the epitope 97aa-102aa, with most NAD and substrate binding sites located around the center. Conclusions: The full-length cDNA sequence of EgLDH was identified. It encoded a putative transmembrane protein which might be an ideal target molecule for vaccines and drugs.
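
    The relationship between ORF length and protein length quoted above (996 bp, 331 amino acids) can be checked with a short Python sketch. It is a simplified forward-strand ORF search written for illustration; the function names are hypothetical and it does not reproduce the online tools used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> str:
    """Return the longest ATG..stop ORF (stop codon included), forward strand only."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # look for the next ORF in this frame
    return best


def encoded_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3 - 1  # the stop codon encodes no residue
```

    For example, `encoded_length(996)` gives 331, since 996 bp is 332 codons and the last one is the stop codon.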

  20. SeqBuster, a bioinformatic tool for the processing and analysis of small RNAs datasets, reveals ubiquitous miRNA modifications in human embryonic cells.

    Science.gov (United States)

    Pantano, Lorena; Estivill, Xavier; Martí, Eulàlia

    2010-03-01

    High-throughput sequencing technologies enable direct approaches to catalog and analyze snapshots of the total small RNA content of living cells. Characterization of high-throughput sequencing data requires bioinformatic tools offering a wide perspective of the small RNA transcriptome. Here we present SeqBuster, a highly versatile and reliable web-based toolkit to process and analyze large-scale small RNA datasets. The high flexibility of this tool is illustrated by the multiple choices offered in the pre-analysis for mapping purposes and in the different analysis modules for data manipulation. To overcome the storage capacity limitations of the web-based tool, SeqBuster offers a stand-alone version that permits the annotation against any custom database. SeqBuster integrates multiple analyses modules in a unique platform and constitutes the first bioinformatic tool offering a deep characterization of miRNA variants (isomiRs). The application of SeqBuster to small-RNA datasets of human embryonic stem cells revealed that most miRNAs present different types of isomiRs, some of them being associated to stem cell differentiation. The exhaustive description of the isomiRs provided by SeqBuster could help to identify miRNA-variants that are relevant in physiological and pathological processes. SeqBuster is available at http://estivill_lab.crg.es/seqbuster. PMID:20008100

  1. New bioinformatic tools for analysis of nucleotide modifications in eukaryotic rRNA

    OpenAIRE

    Piekna-Przybylska, Dorota; Decatur, Wayne A.; Fournier, Maurille J.

    2007-01-01

    This report presents a valuable new bioinformatics package for research on rRNA nucleotide modifications in the ribosome, especially those created by small nucleolar RNA:protein complexes (snoRNPs). The interactive service, which is not available elsewhere, enables a user to visualize the positions of pseudouridines, 2′-O-methylations, and base methylations in three-dimensional space in the ribosome and also in linear and secondary structure formats of ribosomal RNA. Our tools provide additio...

  2. Bioinformatic Analysis for the Validation of Novel Biomarkers for Cancer Diagnosis and Drug Sensitivity

    OpenAIRE

    Lockwood, Laura Anne Rebecca

    2015-01-01

    Background: The genetic control of tumour progression presents the opportunity for bioinformatics and gene expression data to be used as a basis for tumour grading. The development of a genetic signature based on microarray data allows for the development of personalised chemotherapeutic regimes. Method: ONCOMINE was utilised to create a genetic signature for ovarian serous adenocarcinoma and to compare the expression of genes between normal ovarian and cancerous cells. Ingenuity Pathways...

  3. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    Science.gov (United States)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to localize transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible function of chosen motifs has been described.
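
    The basic idea of locating a binding-site motif in a promoter sequence can be sketched in a few lines of Python. This is only an illustrative consensus scan using the standard IUPAC nucleotide codes; it does not reproduce how PlantCARE, PlantPAN or SignalScan work, and the function names and the example consensus are assumptions.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(promoter: str, consensus: str) -> list[tuple[int, str, str]]:
    """Return (0-based forward-strand position, strand, matched site) tuples."""
    promoter = promoter.upper()
    body = "".join(IUPAC[c] for c in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")  # lookahead keeps overlapping hits
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(promoter)]
    n = len(promoter)
    for m in pattern.finditer(reverse_complement(promoter)):
        site = m.group(1)
        # Convert the reverse-strand offset back to forward-strand coordinates.
        hits.append((n - m.start() - len(site), "-", site))
    return sorted(hits)
```

    As a usage example, `find_motif("GGTATAAAGG", "TATAWA")` reports one forward-strand hit at position 2. Exact consensus matching is the simplest possible choice; the dedicated programs named in the record rely on their own motif collections and scoring.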

  4. Bioinformatics analysis of the factors controlling type I IFN gene expression in autoimmune disease and virus-induced immunity

    Directory of Open Access Journals (Sweden)

    Di Feng

    2013-09-01

    Full Text Available Patients with systemic lupus erythematosus (SLE) and Sjögren's syndrome (SS) display increased levels of type I IFN-induced genes. Plasmacytoid dendritic cells (PDCs) are natural interferon producing cells and considered to be a primary source of IFN-α in these two diseases. Differential expression patterns of type I IFN inducible transcripts can be found in different immune cell subsets and in patients with both active and inactive autoimmune disease. A type I IFN gene signature generally consists of three groups of IFN-induced genes: those regulated in response to virus-induced type I IFN, those regulated by the IFN-induced mitogen-activated protein kinase/extracellular-regulated kinase (MAPK/ERK) pathway, and those regulated by the IFN-induced phosphoinositide-3 kinase (PI-3K) pathway. These three groups of type I IFN-regulated genes control important cellular processes such as apoptosis, survival, adhesion, and chemotaxis that, when dysregulated, contribute to autoimmunity. With the recent generation of large datasets in the public domain from next-generation sequencing and DNA microarray experiments, one can perform detailed analyses of cell type-specific gene signatures as well as identify distinct transcription factors that differentially regulate these gene signatures. We have performed bioinformatics analysis of data in the public domain and experimental data from our lab to gain insight into the regulation of type I IFN gene expression. We have found that the genetic landscape of the IFNA and IFNB genes is occupied by transcription factors, such as the insulators CTCF and cohesin, that negatively regulate transcription, as well as IRF5 and IRF7, that positively and distinctly regulate IFNA subtypes. A detailed understanding of the factors controlling type I IFN gene transcription will significantly aid in the identification and development of new therapeutic strategies targeting the IFN pathway in autoimmune disease.

  5. Cloning and Bioinformatics Analysis of ZmERECTA-LIKE1 and Construction of Plant Expression Vector

    Institute of Scientific and Technical Information of China (English)

    Yihong JI; Jinbao PAN; Min LU; Jun HAN; Zhangjie NAN; Qingpeng SUN

    2016-01-01

    [Objective] This study was conducted to clone and analyze the ERECTA-LIKE1 gene in Zea mays by PCR and bioinformatics methods and to construct the plant expression vector pCambia3301-zmERECTA-LIKE1. [Method] The zmERECTA-LIKE1 (zmERL1) gene was obtained using RT-PCR, and its physical-chemical properties were analyzed by bioinformatics methods, including domains, transmembrane regions, potential N-glycosylation sites and phosphorylation sites. [Result] Bioinformatics results showed that the zmERL1 gene was 2,169 bp long and encoded a protein of 722 amino acids with 11 potential N-glycosylation sites and 42 kinase-specific phosphorylation sites. According to CDD2.23 and TMHMM Server v. 2.0, the protein contained leucine-rich repeats, a PKC domain and a transmembrane region. The theoretical pI and molecular weight of the zmERL1-encoded protein were 6.20 and 79,184.8 Da, respectively, as computed with the Compute pI/Mw tool. Furthermore, we constructed the plant expression vector pCambia3301-zmERECTA-LIKE1 by subcloning the zmERL1 gene into pCambia3301 in place of GUS. [Conclusion] The results provide a theoretical basis for the application of the zmERL1 gene in future studies.
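
    Potential N-glycosylation sites of the kind counted above are commonly screened with the sequon pattern N-X-S/T, where X is any residue except proline. The abstract does not say which predictor was used, so the Python sketch below is only an assumed, minimal sequon scan with a hypothetical function name; dedicated predictors also weigh the surrounding sequence context.

```python
import re

# N, then any residue except P, then S or T; the lookahead allows overlaps.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]
```

    For the toy peptide `MNGTANPSNNST`, the scan returns positions 2, 9 and 10; the asparagine at position 6 is skipped because it is followed by proline.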

  6. A Comprehensive Bioinformatics Analysis of the Nudix Superfamily in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    D. Gunawardana

    2009-01-01

    Full Text Available Nudix enzymes are a superfamily with a conserved common reaction mechanism that provides the capacity for the hydrolysis of a broad spectrum of metabolites. We used hidden Markov models based on Nudix sequences from the PFAM and PROSITE databases to identify Nudix hydrolases encoded by the Arabidopsis genome. 25 Nudix hydrolases were identified and classified into 11 individual families by pairwise sequence alignments. Intron phases were strikingly conserved in each family. Phylogenetic analysis showed that all multimember families formed monophyletic clusters. Conserved familial sequence motifs were identified with the MEME motif analysis algorithm. One motif (motif 4) was found in three diverse families. All proteins containing motif 4 demonstrated a degree of preference for substrates containing an ADP moiety. We conclude that HMM-based genome scanning and MEME motif analysis, respectively, can significantly improve the identification and assignment of function of new members of this mechanistically diverse protein superfamily.

  7. Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans

    Science.gov (United States)

    Li, Yan-Hui; Zhang, Gai-Gai

    2016-01-01

    DAF-16, the C. elegans FOXO transcription factor, is an important determinant in aging and longevity. In this work, we manually curated FOXODB http://lyh.pkmu.cn/foxodb/, a database of FOXO direct targets. It now covers 208 genes. Bioinformatics analysis on 109 DAF-16 direct targets in C. elegans found interesting results. (i) DAF-16 and transcription factor PQM-1 co-regulate some targets. (ii) Seventeen targets directly regulate lifespan. (iii) Four targets are involved in lifespan extension induced by dietary restriction. And (iv) DAF-16 direct targets might play global roles in lifespan regulation. PMID:27027346

  8. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    DEFF Research Database (Denmark)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-01-01

    -friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface...... which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options...... such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality graphical...

  9. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  10. Feature selection in bioinformatics

    Science.gov (United States)

    Wang, Lipo

    2012-06-01

    In bioinformatics, there are often a large number of input features. For example, there are millions of single nucleotide polymorphisms (SNPs) that are genetic variations which determine the difference between any two unrelated individuals. In microarrays, thousands of genes can be profiled in each test. It is important to find out which input features (e.g., SNPs or genes) are useful in classification of a certain group of people or diagnosis of a given disease. In this paper, we investigate some powerful feature selection techniques and apply them to problems in bioinformatics. We are able to identify a very small number of input features sufficient for tasks at hand and we demonstrate this with some real-world data.
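
    The abstract does not name the feature selection techniques it investigates. As a generic illustration of the idea only, the Python sketch below ranks features by the absolute Welch t-statistic between two classes and keeps the top k; this simple filter method, the function names and the data layout are all assumptions, not the paper's approach.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t-statistic for two groups (each needs at least two values)."""
    diff = mean(a) - mean(b)
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0:
        # No within-group spread: identical groups score 0, otherwise infinite.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / denom


def top_features(X: list[list[float]], y: list[int], k: int) -> list[int]:
    """Indices of the k features that best separate class 0 from class 1."""
    scores = []
    for j in range(len(X[0])):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(g0, g1)), j))
    # Highest score first; ties broken by feature index for reproducibility.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]
```

    Here each row of `X` is one sample (for example, one patient) and each column one feature (for example, one gene or SNP). A univariate filter like this is cheap enough for thousands of features but ignores interactions between them, which is one reason more elaborate selection methods exist.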

  11. CDH1/E-cadherin and solid tumors. An updated gene-disease association analysis using bioinformatics tools.

    Science.gov (United States)

    Abascal, María Florencia; Besso, María José; Rosso, Marina; Mencucci, María Victoria; Aparicio, Evangelina; Szapiro, Gala; Furlong, Laura Inés; Vazquez-Levin, Mónica Hebe

    2016-02-01

    Cancer is a group of diseases that causes millions of deaths worldwide. Among cancers, Solid Tumors (ST) stand-out due to their high incidence and mortality rates. Disruption of cell-cell adhesion is highly relevant during tumor progression. Epithelial-cadherin (protein: E-cadherin, gene: CDH1) is a key molecule in cell-cell adhesion and an abnormal expression or/and function(s) contributes to tumor progression and is altered in ST. A systematic study was carried out to gather and summarize current knowledge on CDH1/E-cadherin and ST using bioinformatics resources. The DisGeNET database was exploited to survey CDH1-associated diseases. Reported mutations in specific ST were obtained by interrogating COSMIC and IntOGen tools. CDH1 Single Nucleotide Polymorphisms (SNP) were retrieved from the dbSNP database. DisGeNET analysis identified 609 genes annotated to ST, among which CDH1 was listed. Using CDH1 as query term, 26 disease concepts were found, 21 of which were neoplasms-related terms. Using DisGeNET ALL Databases, 172 disease concepts were identified. Of those, 80 ST disease-related terms were subjected to manual curation and 75/80 (93.75%) associations were validated. On selected ST, 489 CDH1 somatic mutations were listed in COSMIC and IntOGen databases. Breast neoplasms had the highest CDH1-mutation rate. CDH1 was positioned among the 20 genes with highest mutation frequency and was confirmed as driver gene in breast cancer. Over 14,000 SNP for CDH1 were found in the dbSNP database. This report used DisGeNET to gather/compile current knowledge on gene-disease association for CDH1/E-cadherin and ST; data curation expanded the number of terms that relate them. An updated list of CDH1 somatic mutations was obtained with COSMIC and IntOGen databases and of SNP from dbSNP. This information can be used to further understand the role of CDH1/E-cadherin in health and disease.

  12. In the Spotlight: Bioinformatics

    OpenAIRE

    Wang, May Dongmei

    2012-01-01

    During 2012, next generation sequencing (NGS) has attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on t...

  13. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  14. Hypothetical granulin-like molecule from Fasciola hepatica identified by bioinformatics analysis

    OpenAIRE

    Machicado, Claudia; Marcos, Luis A.; Zimic, Mirko

    2016-01-01

    Fasciola hepatica is considered an emergent human pathogen, causing liver fibrosis or cirrhosis, conditions that are known to be direct causes of cancer. Some parasites have been categorized by WHO as carcinogenic agents such as Opisthorchis viverrini, a relative of F. hepatica. Although these two parasites are from the same class (Trematoda), the role of F. hepatica in carcinogenesis is unclear. We hypothesized that F. hepatica might share some features with O. viverrini and to be responsibl...

  15. Bioinformatics Approaches for Human Gut Microbiome Research

    Directory of Open Access Journals (Sweden)

    Zhijun Zheng

    2016-07-01

    Full Text Available The human microbiome has received much attention because many studies have reported that the human gut microbiome is associated with several diseases. The very large datasets that are produced by these kinds of studies mean that bioinformatics approaches are crucial for their analysis. Here, we systematically reviewed bioinformatics tools that are commonly used in microbiome research, including a typical pipeline and software for sequence alignment, abundance profiling, enterotype determination, taxonomic diversity, identifying differentially abundant species/genes, gene cataloging, and functional analyses. We also summarized the algorithms and methods used to define metagenomic species and co-abundance gene groups to expand our understanding of unclassified and poorly understood gut microbes that are undocumented in the current genome databases. Additionally, we examined the methods used to identify metagenomic biomarkers based on the gut microbiome, which might help to expand the knowledge and approaches for disease detection and monitoring.

  16. In the Spotlight: Bioinformatics

    Science.gov (United States)

    Wang, May Dongmei

    2016-01-01

    During 2012, next generation sequencing (NGS) has attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-the-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is a high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparison with more established technologies such as microarrays. NGS technology has been predicted to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  17. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics.

    Science.gov (United States)

    Veneman, Wouter J; de Sonneville, Jan; van der Kolk, Kees-Jan; Ordas, Anita; Al-Ars, Zaid; Meijer, Annemarie H; Spaink, Herman P

    2015-03-01

    We present an RNA deep sequencing (RNAseq) analysis comparing the transcriptome responses of zebrafish larvae to infection with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our GeneTiles software can improve RNAseq analysis approaches by more confidently identifying a large set of markers upon infection with these bacteria. Software programs such as Bowtie2 and Samtools are currently indispensable for the analysis of RNAseq data. However, these programs, which are designed for a LINUX environment, require some dedicated programming skills and have no options for visualisation of the resulting mapped sequence reads. Especially with large data sets, this makes the analysis time consuming and difficult for non-expert users. We have applied the GeneTiles software to the analysis of previously published and newly obtained RNAseq datasets of our zebrafish infection model, and we have shown the applicability of this approach also to published RNAseq datasets of other organisms by comparing our data with a published mammalian infection study. In addition, we have implemented the DEXSeq module in the GeneTiles software to identify genes, such as glucagon A, that are differentially spliced under infection conditions. In the analysis of our RNAseq data, this has made it possible to increase the size of the data sets that can be efficiently compared without using problem-dedicated programs, leading to a quick identification of marker sets. Therefore, this approach will also be highly useful for transcriptome analyses of other organisms for which well-characterised genomes are available. PMID:25503064

  18. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  19. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  20. Clustering Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Muhammad Ali Masood

    2015-01-01

    Full Text Available Dealing with data means grouping information into a set of categories, either to learn new artifacts or to understand new domains. For this purpose, researchers have always looked for hidden patterns in data that can be defined and compared with other known notions based on the similarity or dissimilarity of their attributes according to well-defined rules. Data mining, with its tools of data classification and data clustering, is one of the most powerful techniques for handling data in a way that helps researchers identify the required information. To address this challenge, experts have used clustering techniques as a means of exploring hidden structure and patterns in the underlying data. With its improved stability, robustness and accuracy of unsupervised data classification in many fields, including pattern recognition, machine learning, information retrieval, image analysis and bioinformatics, clustering has proven itself a reliable tool. To identify the clusters in datasets, algorithms are used to partition a data set into several groups based on the similarity within a group. There is no single clustering algorithm; various algorithms are used depending on the domain of the data that constitutes a cluster and the level of efficiency required. Clustering techniques are categorized according to different approaches. This paper is a survey of a few of the many clustering techniques in data mining. Five of the most common clustering techniques are discussed: K-medoids, K-means, Fuzzy C-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Self-Organizing Map (SOM) clustering.
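
    As an illustration of the partitioning idea described in this abstract, the following is a minimal K-means sketch (Lloyd's algorithm). It is not taken from the surveyed paper; the function name `kmeans`, the toy points and the choice of initial centroids are all assumptions made for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical toy data: two well-separated groups in the plane.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids)
```

    The result depends on the initial centroids, which is one reason the choice of algorithm and settings is usually matched to the data domain.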

  1. Identification of key pathways and genes in colorectal cancer using bioinformatics analysis.

    Science.gov (United States)

    Liang, Bin; Li, Chunning; Zhao, Jianying

    2016-10-01

    Colorectal cancer (CRC) is the most common malignant tumor of digestive system. The aim of this study was to identify gene signatures during CRC and uncover their potential mechanisms. The gene expression profiles of GSE21815 were downloaded from GEO database. The GSE21815 dataset contained 141 samples, including 132 CRC and 9 normal colon epitheliums. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses were performed, and protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed by Cytoscape software. In total, 3500 DEGs were identified in CRC, including 1370 up-regulated genes and 2130 down-regulated genes. GO analysis results showed that up-regulated DEGs were significantly enriched in biological processes (BP), including cell cycle, cell division, and cell proliferation; the down-regulated DEGs were significantly enriched in biological processes, including immune response, intracellular signaling cascade and defense response. KEGG pathway analysis showed the up-regulated DEGs were enriched in cell cycle and DNA replication, while the down-regulated DEGs were enriched in drug metabolism, metabolism of xenobiotics by cytochrome P450, and retinol metabolism pathways. The top 10 hub genes, GNG2, AGT, SAA1, ADCY5, LPAR1, NMU, IL8, CXCL12, GNAI1, and CCR2 were identified from the PPI network, and sub-networks revealed these genes were involved in significant pathways, including G protein-coupled receptors signaling pathway, gastrin-CREB signaling pathway via PKC and MAPK, and extracellular matrix organization. In conclusion, the present study indicated that the identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of CRC, and might be used as molecular targets and diagnostic biomarkers for the treatment of CRC. PMID:27581154

  2. Bioinformatics analysis of the early inflammatory response in a rat thermal injury model

    Directory of Open Access Journals (Sweden)

    Berthiaume Francois

    2007-01-01

    Full Text Available Background: Thermal injury is among the most severe forms of trauma and its effects are both local and systemic. Response to thermal injury includes cellular protection mechanisms, inflammation, hypermetabolism, prolonged catabolism, organ dysfunction and immuno-suppression. It has been hypothesized that gene expression patterns in the liver will change with severe burns, thus reflecting the role the liver plays in the response to burn injury. Characterizing the molecular fingerprint (i.e., expression profile) of the inflammatory response resulting from burns may help elucidate the activated mechanisms and suggest new therapeutic intervention. In this paper we propose a novel integrated framework for analyzing time-series transcriptional data, with emphasis on the burn-induced response within the context of the rat animal model. Our analysis robustly identifies critical expression motifs, indicative of the dynamic evolution of the inflammatory response, and we further propose a putative reconstruction of the associated transcription factor activities. Results: Implementation of our algorithm on data obtained from an animal (rat) burn injury study identified 281 genes corresponding to 4 unique profiles. Enrichment evaluation upon both gene ontologies and transcription factors verifies the inflammation-specific character of the selections and the rationalization of the burn-induced inflammatory response. Conducting the transcription network reconstruction and analysis, we have identified transcription factors, including AHR, Octamer Binding Proteins, Kruppel-like Factors, and cell cycle regulators, as being highly important to an organism's response to burn injury. These transcription factors are notable due to their roles in pathways that play a part in the gross physiological response to burn, such as changes in the immune response and inflammation. Conclusion: Our results indicate that our novel selection/classification algorithm has been

  3. Bioinformatics analysis of differentially expressed pathways related to the metastatic characteristics of osteosarcoma.

    Science.gov (United States)

    Sun, Wei; Ma, Xiaojun; Shen, Jiakang; Yin, Fei; Wang, Chongren; Cai, Zhengdong

    2016-08-01

    In this study, gene expression data of osteosarcoma (OSA) were analyzed to identify metastasis-related biological pathways. Four gene expression data sets (GSE21257, GSE9508, GSE49003 and GSE66673) were downloaded from Gene Expression Omnibus (GEO). An analysis of differentially expressed genes (DEGs) was performed using the Significance Analysis of Microarray (SAM) method. Gene expression levels were converted into pathway scores by the Functional Analysis of Individual Microarray Expression (FAIME) algorithm, and the differentially expressed pathways (DEPs) were then identified by a t-test. The ability of the DEPs to distinguish and predict metastatic and non-metastatic OSA was further confirmed using principal component analysis (PCA) and 3 gene expression data sets (GSE9508, GSE49003 and GSE66673) based on the support vector machines (SVM) model. A total of 616 downregulated and 681 upregulated genes were identified in the data set GSE21257. The DEGs could not be used to distinguish metastatic OSA from non-metastatic OSA, as shown by PCA. Thus, an analysis of DEPs was further performed, resulting in 14 DEPs, such as NRAS signaling, Toll-like receptor (TLR) signaling, matrix metalloproteinase (MMP) regulation of cytokines and tumor necrosis factor receptor-associated factor (TRAF)-mediated interferon regulatory factor 7 (IRF7) activation. Cluster analysis indicated that these pathways could be used to distinguish metastatic OSA from non-metastatic OSA. The prediction accuracy was 91, 66.7 and 87.5% for the data sets GSE9508, GSE49003 and GSE66673, respectively. The results of PCA further validated that the DEPs could be used to distinguish metastatic OSA from non-metastatic OSA. On the whole, several DEPs were identified in metastatic OSA compared with non-metastatic OSA. Further studies on these pathways and relevant genes may help to enhance our understanding of the molecular mechanisms underlying metastasis

  4. Prenatal alcohol exposure alters gene expression in the rat brain: Experimental design and bioinformatic analysis of microarray data.

    Science.gov (United States)

    Lussier, Alexandre A; Stepien, Katarzyna A; Weinberg, Joanne; Kobor, Michael S

    2015-09-01

    We previously identified gene expression changes in the prefrontal cortex and hippocampus of rats prenatally exposed to alcohol under both steady-state and challenge conditions (Lussier et al., 2015, Alcohol.: Clin. Exp. Res., 39, 251-261). In this study, adult female rats from three prenatal treatment groups (ad libitum-fed control, pair-fed, and ethanol-fed) were injected with physiological saline solution or complete Freund's adjuvant (CFA) to induce arthritis (adjuvant-induced arthritis, AA). The prefrontal cortex and hippocampus were collected 16 days (peak of arthritis) or 39 days (during recovery) following injection, and whole genome gene expression was assayed using Illumina's RatRef-12 expression microarray. Here, we provide additional metadata, detailed explanations of data pre-processing steps and quality control, as well as a basic framework for the bioinformatic analyses performed. The datasets from this study are publicly available on the GEO repository (accession number GSE63561). PMID:26217797

  5. Novel C16orf57 mutations in patients with Poikiloderma with Neutropenia: bioinformatic analysis of the protein and predicted effects of all reported mutations

    Directory of Open Access Journals (Sweden)

    Colombo Elisa A

    2012-01-01

    Full Text Available Background: Poikiloderma with Neutropenia (PN) is a rare autosomal recessive genodermatosis caused by C16orf57 mutations. To date 17 mutations have been identified in 31 PN patients. Results: We characterize six PN patients, expanding the clinical phenotype of the syndrome and the mutational repertoire of the gene. We detect the two novel C16orf57 mutations, c.232C>T and c.265+2T>G, as well as the already reported c.179delC, c.531delA and c.693+1G>T mutations. cDNA analysis evidences the presence of aberrant transcripts, and bioinformatic prediction of C16orf57 protein structure gauges the mutations' effects on the folded protein chain. Computational analysis of the C16orf57 protein shows two conserved H-X-S/T-X tetrapeptide motifs marking the active site of a two-fold pseudosymmetric structure recalling the 2H phosphoesterase superfamily. Based on this model, C16orf57 is likely a 2H-active site enzyme functioning in RNA processing, as a presumptive RNA ligase. According to bioinformatic prediction, all known C16orf57 mutations, including the novel mutations herein described, impair the protein structure by either removing one or both tetrapeptide motifs or by destroying the symmetry of the native folding. Finally, we analyse the geographical distribution of the recurrent mutations, which depicts clusters featuring a founder effect. Conclusions: In cohorts of patients clinically affected by genodermatoses with overlapping symptoms, the molecular screening of the C16orf57 gene seems the proper way to reach the correct diagnosis of PN, enabling syndrome-specific oncosurveillance. The bioinformatic prediction of the C16orf57 protein structure denotes a very basic enzymatic function consistent with a housekeeping function. Detection of aberrant transcripts, also in cells from PN patients carrying early truncated mutations, suggests they might be translatable. Tissue-specific sensitivity to the lack of functionally correct protein accounts for the

  6. The Cytotoxicity Mechanism of 6-Shogaol-Treated HeLa Human Cervical Cancer Cells Revealed by Label-Free Shotgun Proteomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Qun Liu

    2012-01-01

    Full Text Available Cervical cancer is one of the most common cancers among women in the world. 6-Shogaol is a natural compound isolated from the rhizome of ginger (Zingiber officinale). In this paper, we demonstrated that 6-shogaol induced apoptosis and G2/M phase arrest in human cervical cancer HeLa cells. Endoplasmic reticulum stress and the mitochondrial pathway were involved in 6-shogaol-mediated apoptosis. Proteomic analysis based on a label-free strategy using liquid chromatography chip quadrupole time-of-flight mass spectrometry was subsequently applied to identify, in a non-target-biased manner, the molecular changes in cellular proteins in response to 6-shogaol treatment. A total of 287 proteins were differentially expressed in response to 24 h treatment with 15 μM 6-shogaol in HeLa cells. Significantly changed proteins were subjected to functional pathway analysis using multiple analysis software packages. Ingenuity pathway analysis (IPA) suggested that 14-3-3 signaling is a predominant canonical pathway involved in networks which may be significantly associated with the process of apoptosis and G2/M cell cycle arrest induced by 6-shogaol. In conclusion, this work developed an unbiased protein analysis strategy by shotgun proteomics and bioinformatics analysis. The data provide a comprehensive analysis of the 6-shogaol-treated HeLa cell proteome and reveal protein alterations that are associated with its anticancer mechanism.

  7. Bioinformatics analysis of biomarkers and transcriptional factor motifs in Down syndrome

    Directory of Open Access Journals (Sweden)

    X.D. Kong

    2014-10-01

    Full Text Available In this study, biomarkers and transcriptional factor motifs were identified in order to investigate the etiology and phenotypic severity of Down syndrome. GSE 1281, GSE 1611, and GSE 5390 were downloaded from the Gene Expression Omnibus (GEO). A robust multiarray analysis (RMA) algorithm was applied to detect differentially expressed genes (DEGs). In order to screen for biological pathways and to interrogate the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, the Database for Annotation, Visualization, and Integrated Discovery (DAVID) was used to carry out a gene ontology (GO) function enrichment for DEGs. Finally, a transcriptional regulatory network was constructed, and a hypergeometric distribution test was applied to select for significantly enriched transcriptional factor motifs. CBR1, DYRK1A, HMGN1, ITSN1, RCAN1, SON, TMEM50B, and TTC3 were each up-regulated two-fold in Down syndrome samples compared to normal samples; of these, SON and TTC3 were newly reported. CBR1, DYRK1A, HMGN1, ITSN1, RCAN1, SON, TMEM50B, and TTC3 were located on human chromosome 21 (mouse chromosome 16). The DEGs were significantly enriched in macromolecular complex subunit organization and focal adhesion pathways. Eleven significantly enriched transcription factor motifs (PAX5, EGR1, XBP1, SREBP1, OLF1, MZF1, NFY, NFKAPPAB, MYCMAX, NFE2, and RP58) were identified. The DEGs and transcription factor motifs identified in our study provide biomarkers for the understanding of Down syndrome pathogenesis and progression.
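
    The hypergeometric distribution test mentioned in this abstract can be sketched as a one-sided enrichment p-value. This is only an illustration, not the authors' implementation: the function name `hypergeom_enrichment_p` and the counts in the example (20 genes in total, 5 carrying a motif, 4 DEGs, 2 of them carrying the motif) are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, successes, draws, observed):
    """Return P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total number of genes considered
    successes:  genes in the population that carry the motif
    draws:      number of differentially expressed genes (DEGs)
    observed:   DEGs that carry the motif
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Hypothetical counts, chosen only to keep the arithmetic easy to check by hand.
p_value = hypergeom_enrichment_p(population=20, successes=5, draws=4, observed=2)
print(round(p_value, 4))
```

    A small p-value would indicate that the motif appears among the DEGs more often than expected by chance; in practice a multiple-testing correction across motifs would normally follow.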

  8. Bioinformatics analysis of human prohibitin 1

    Institute of Scientific and Technical Information of China (English)

    陈晨; 赵小峰

    2015-01-01

    Objective: To perform a bioinformatics analysis of human prohibitin 1 (PHB1), predicting its structure and function, in order to lay the foundation for research on and application of its function. Methods: Bioinformatics tools were used to predict the chromosome location, transmembrane region, spatial structure, physical and chemical properties, and functional regions of PHB1. Results: The bioinformatic analysis revealed that PHB1 is composed of 272 amino acids, among which alanine is the most abundant; the theoretical isoelectric point is 5.57, and the molecular formula is C1331H2154N370O400S2, with a relative molecular mass of 29 804.1. PHB1 is a non-transmembrane hydrophobic protein composed mainly of alpha-helices. Conclusion: Human PHB1 is a member of the cell membrane protein superfamily; it performs the corresponding biological functions and also participates in the occurrence and development of many human diseases.
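
    The relative molecular mass reported in this abstract can be checked against the stated molecular formula with a short calculation. The sketch below is an illustration only; the helper `formula_mass` is hypothetical, and the atomic masses are rounded standard average values, so a small difference from the reported 29 804.1 is expected.

```python
import re

# Rounded standard average atomic masses (g/mol); an assumption of this sketch.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def formula_mass(formula):
    """Sum atomic masses for a simple formula such as 'C2H6O' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * int(count or 1)
    return total

# Molecular formula of PHB1 as given in the abstract.
mass = formula_mass("C1331H2154N370O400S2")
print(round(mass, 1))
```

    The computed value agrees with the reported relative molecular mass to within a fraction of a mass unit.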

  9. Global secretome analysis identifies novel mediators of bone metastasis

    Institute of Scientific and Technical Information of China (English)

    Mario Andres Blanco; Gary LeRoy; Zia Khan; Maša Alečković; Barry M Zee; Benjamin A Garcia; Yibin Kang

    2012-01-01

    Bone is one of the most common sites of distant metastasis of solid tumors. Secreted proteins are known to influence pathological interactions between metastatic cancer cells and the bone stroma. To comprehensively profile secreted proteins associated with bone metastasis, we used quantitative and non-quantitative mass spectrometry to globally analyze the secretomes of nine cell lines of varying bone metastatic ability from multiple species and cancer types. By comparing the secretomes of parental cells and their bone metastatic derivatives, we identified the secreted proteins that were uniquely associated with bone metastasis in these cell lines. We then incorporated bioinformatic analyses of large clinical metastasis datasets to obtain a list of candidate novel bone metastasis proteins of several functional classes that were strongly associated with both clinical and experimental bone metastasis. Functional validation of selected proteins indicated that in vivo bone metastasis can be promoted by high expression of (1) the salivary cystatins CST1, CST2, and CST4; (2) the plasminogen activators PLAT and PLAU; or (3) the collagen functionality proteins PLOD2 and COL6A1. Overall, our study has uncovered several new secreted mediators of bone metastasis and therefore demonstrated that secretome analysis is a powerful method for identification of novel biomarkers and candidate therapeutic targets.

  10. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Directory of Open Access Journals (Sweden)

    Lucas Antón Pastur-Romay

    2016-08-01

    Full Text Available Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.

  11. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Science.gov (United States)

    Kim, Jinkyu; Kim, Gunn; An, Sungbae; Kwon, Young-Kyun; Yoon, Sungroh

    2013-01-01

    The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.
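
    The transfer-entropy step described in this abstract can be illustrated with a plug-in estimate for discrete series, followed by a weighted sum. This is a sketch under stated assumptions, not the authors' method: history length 1, symbolic (already discretised) series, and the names `transfer_entropy` and `weighted_te` as well as the weights are hypothetical.

```python
import random
from collections import Counter
from math import log2

def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))  # (y_next, y, x)
    pairs_yx = Counter(zip(target[:-1], source[:-1]))             # (y, x)
    pairs_yy = Counter(zip(target[1:], target[:-1]))              # (y_next, y)
    singles = Counter(target[:-1])                                # y
    n = len(target) - 1
    te = 0.0
    for (y_next, y, x), count in triples.items():
        p_joint = count / n
        p_with_source = count / pairs_yx[(y, x)]        # p(y_next | y, x)
        p_history_only = pairs_yy[(y_next, y)] / singles[y]  # p(y_next | y)
        te += p_joint * log2(p_with_source / p_history_only)
    return te

def weighted_te(te_values, weights):
    """Aggregate several TE values with a weighted sum (weights are assumed)."""
    return sum(w * te for w, te in zip(weights, te_values))

# Synthetic example: y copies x with a one-step delay, so x drives y.
random.seed(0)
x = [random.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
print(te_xy, te_yx)
```

    In this synthetic case the estimate from x to y is close to one bit while the reverse direction is close to zero, which is the asymmetry that makes transfer entropy useful for directed networks. Real economic series would first need to be discretised and tested for significance, as the abstract notes.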

  12. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Directory of Open Access Journals (Sweden)

    Jinkyu Kim

    Full Text Available The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.

  13. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications.

    Science.gov (United States)

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure-Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron-Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  14. Cloning and bioinformatic analysis of HSPC016 gene in dermal papilla cells

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Objective: To clone the full-length cDNA sequence of the HSPC016 gene, an aggregative growth related gene in dermal papilla cells (DPC), analyze its characteristics and predict its biological function. Methods: Rapid amplification of cDNA ends (RACE) technology was used to amplify the 5' and 3' sequences of HSPC016. The amplified fragments were TA-cloned, sequenced and spliced together to obtain the full-length cDNA. Its chromosome localization, domain and possible function were analyzed by bioinformatic methods. Results: Two isoforms, 400 bp and 493 bp, were obtained. The gene was mapped to chromosome 3q21.31 and was evolutionarily conserved. HSPC016, a 64-aa protein, belongs to the PD053992 protein family, and its functional domain was homologous to the T2FA gene. Conclusion: HSPC016 may be related to transcriptional regulation, and its protein product may act as a subunit of a transcriptional complex and play a role in DPC growth and differentiation by facilitating or suppressing the transcription of other genes within the nucleus.

  15. [Cloning and bioinformatics analysis of SLA-DR genes in Hunan Shaziling pigs].

    Science.gov (United States)

    Tang, Yi-Ya; Xing, Xiao-Wei; Xue, Li-Qun; Huang, Sheng-Qiang; Wang, Wei

    2007-12-01

    This study aimed to clone the class II DRA and DRB genes of swine leukocyte antigen (SLA) in Hunan Shaziling pigs, to analyze their characteristics and polymorphism, and to provide basic immunological parameters for xenotransplantation from pigs to humans. SLA-DRA and SLA-DRB genes in two Shaziling pigs lacking porcine endogenous retrovirus (PERV) env-c were amplified by RT-PCR, cloned into PUCm-T vectors, sequenced and analyzed with BLAST at NCBI and related software in ExPASy. The obtained SLA-DRA and SLA-DRB genes of Shaziling pigs were 1,177 and 909 nucleotides in length, with GenBank accession numbers EF143987 and EF143988. Bioinformatics analyses showed that both contain an open reading frame (ORF) and encode 252 and 266 amino acids, respectively. Comparing the ORF and protein sequences of the Shaziling SLA-DRA and SLA-DRB genes with their human counterparts, the nucleotide sequence homologies were 83% and 83%, and the amino acid sequence homologies were 83% and 79%, respectively. Further comparison with SLA sequences published in GenBank indicated that the SLA-DRB gene found in Shaziling pigs is polymorphic, while the homology of the SLA-DRA gene is up to 100%.

  16. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Science.gov (United States)

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by advances in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. This work presents an overview of the main architectures of DNNs and their usefulness in Pharmacology and Bioinformatics. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need for neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  17. A bioinformatic strategy for the detection, classification and analysis of bacterial autotransporters.

    Directory of Open Access Journals (Sweden)

    Nermin Celik

    Full Text Available Autotransporters are secreted proteins that are assembled into the outer membrane of bacterial cells. The passenger domains of autotransporters are crucial for bacterial pathogenesis, with some remaining attached to the bacterial surface while others are released by proteolysis. An enigma remains as to whether autotransporters should be considered a class of secretion system, or simply a class of substrate with peculiar requirements for their secretion. We sought to establish a sensitive search protocol that could identify and characterize diverse autotransporters from bacterial genome sequence data. The new sequence analysis pipeline identified more than 1500 autotransporter sequences from diverse bacteria, including numerous species of Chlamydiales and Fusobacteria as well as all classes of Proteobacteria. Interrogation of the proteins revealed that there are numerous classes of passenger domains beyond the known proteases, adhesins and esterases. In addition, the barrel domain, a characteristic feature of autotransporters, was found to be composed of seven conserved sequence segments that can be arranged in multiple ways in the tertiary structure of the assembled autotransporter. One of these conserved motifs overlays the targeting information required for autotransporters to reach the outer membrane. Another conserved and diagnostic motif maps to the linker region between the passenger domain and barrel domain, indicating that it is an important feature in the assembly of autotransporters.

  18. Bioinformatics analysis for structure and function of CPR of Plasmodium falciparum

    Institute of Scientific and Technical Information of China (English)

    Zhigang Fan; Lingmin Zhang; Guogang Yan; Qiang Wu; Xiufeng Gan; Saifeng Zhong; Guifen Lin

    2011-01-01

    Objective: To analyse the structure and function of NADPH-cytochrome P450 reductase (CYPOR or CPR) from Plasmodium falciparum (Pf), and to predict its drug targets and vaccine targets. Methods: The structure, function, drug targets and vaccine targets of CPR from Plasmodium falciparum were analyzed and predicted by bioinformatics methods. Results: PfCPR, which was an older CPR, was closely related to the CPR of other Plasmodium species, but it was distant from that of its hosts, such as Homo sapiens and Anopheles. PfCPR was located in the cellular nucleus of Plasmodium falciparum. 335aa-352aa and 591aa-608aa were inserted into the interior side of the nuclear membrane, while 151aa-265aa was located in the nucleolus organizer regions. PfCPR had 40 function sites and 44 protein-protein binding sites in its amino acid sequence. The tertiary structure of 1aa-700aa was forceps-shaped with wings. Fifteen segments of PfCPR had no homology with Homo sapiens CPR, and most were exposed on the surface of the protein. These segments had 25 protein-protein binding sites, while 13 other segments all possessed function sites. Conclusions: The evolution or genesis of Plasmodium falciparum is earlier than that of Homo sapiens. PfCPR is a possible resistance site for antimalarial drugs and may be involved in immune evasion, which is associated with sporozoite parasitism of hepatocytes. PfCPR is unsuitable as a vaccine target, but it has at least 13 ideal drug targets.

  19. An Introduction to Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    SHENG Qi-zheng; De Moor Bart

    2004-01-01

    As a newborn interdisciplinary field, bioinformatics is receiving increasing attention from biologists, computer scientists, statisticians, mathematicians and engineers. This paper briefly introduces the birth, importance, and extensive applications of bioinformatics in the different fields of biological research. A major challenge in bioinformatics - the unraveling of gene regulation - is discussed in detail.

  20. Bioinformatic analysis of cis-regulatory interactions between progesterone and estrogen receptors in breast cancer

    Directory of Open Access Journals (Sweden)

    Matloob Khushi

    2014-11-01

    Full Text Available Chromatin factors interact with each other in a cell- and sequence-specific manner in order to regulate transcription, and a wealth of publicly available datasets exists describing the genomic locations of these interactions. Our recently published BiSA (Binding Sites Analyser) database contains transcription factor binding locations and epigenetic modifications collected from published studies and provides tools to analyse stored and imported data. Using BiSA we investigated the overlapping cis-regulatory role of estrogen receptor alpha (ERα) and progesterone receptor (PR) in the T-47D breast cancer cell line. We found that ERα binding sites overlap with a subset of PR binding sites. To investigate further, we re-analysed raw data to remove any biases introduced by the use of distinct tools in the original publications. We identified 22,152 PR and 18,560 ERα binding sites (<5% false discovery rate) with 4,358 overlapping regions among the two datasets. BiSA statistical analysis revealed a non-significant overall overlap correlation between the two factors, suggesting that ERα and PR are not partner factors and do not require each other for binding to occur. However, Monte Carlo simulation by Binary Interval Search (BITS), and Relevant Distance, Absolute Distance, Jaccard and Projection tests by GenometriCorr, revealed a statistically significant spatial correlation of binding regions on chromosomes between the two factors. Motif analysis revealed that the shared binding regions were enriched with binding motifs for ERα, PR and a number of other transcription and pioneer factors. Some of these factors are known to co-locate with ERα and PR binding. Therefore, the close spatial proximity of ERα binding sites to PR binding sites suggests that ERα and PR, in general, function independently at the molecular level, but that their activities converge on a specific subset of transcriptional targets.
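
    The overlap between two sets of binding sites, as discussed in this abstract, can be illustrated with a small sketch. The code below is not the BiSA, BITS or GenometriCorr implementation; it is a minimal illustration, assuming binding sites on a single chromosome are given as half-open (start, end) integer intervals, of how a Jaccard statistic (shared bases divided by bases covered by either set) might be computed. All function names are hypothetical.

```python
from typing import List, Tuple

# Assumption: one chromosome, half-open intervals [start, end).
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> int:
    """Number of bases covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def intersection_length(a: List[Interval], b: List[Interval]) -> int:
    """Bases covered by both interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = covered = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            covered += hi - lo
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return covered


def jaccard(a: List[Interval], b: List[Interval]) -> float:
    """Jaccard index: shared bases / bases covered by either set."""
    inter = intersection_length(a, b)
    union = total_length(merge(a)) + total_length(merge(b)) - inter
    return inter / union if union else 0.0
```

    A significance test, such as the Monte Carlo simulation mentioned in the abstract, would compare the observed statistic with values obtained from randomly repositioned intervals; that step is not shown here.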

  1. Bioinformatic analysis of expressed sequence tags from sporophyte of Porphyra yezoensis (Bangiaceae, Rhodophyta)

    Institute of Scientific and Technical Information of China (English)

    XU Minjun; MAO Yunxiang; ZHANG Xuecheng; ZHOU Xiaojun; SUI Zhenghong; ZHOU Hailin; LI Jinhong

    2006-01-01

    A total of 719 expressed sequence tags (ESTs) clustered into 329 non-redundant EST groups were obtained from the sporophyte cDNA library of the red alga Porphyra yezoensis. Gene Ontology (GO) analysis was employed to characterize the 60 most strictly annotated unique genes out of the 329 EST groups, and some domains such as COX1, Sod_Fe-C, GST-N, SHMT, and RNase_PH, related to enzymes and proteins functioning in cells, were identified by HMMPFAM search. As in its leafy gametophyte, similar codon usage with strong bias was found in the P. yezoensis filamentous sporophyte, despite some differences for certain amino acids. The average GC content of the 329 unique genes is 53.0%. In contrast, the third nucleotide of codons exhibits a higher GC content (72%) than the first (58%) and the second (42%) nucleotides. A similarity search against the Porphyra EST database shows a novel EST ratio of 60.2%, suggesting that further investigation is needed to elucidate the characteristics of the Porphyra functional genome.
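
    The position-specific GC content reported in this abstract (first, second and third codon nucleotides) can be illustrated with a short sketch. This is not the authors' pipeline; it is a minimal example, assuming an in-frame coding sequence supplied as a plain string, and the function name is hypothetical.

```python
from typing import Tuple


def gc_by_codon_position(cds: str) -> Tuple[float, float, float]:
    """Return GC percentage at codon positions 1, 2 and 3.

    Assumption: `cds` is an in-frame coding sequence; any trailing
    partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence contains no complete codon")
    counts = [0, 0, 0]
    for idx in range(usable):
        if cds[idx] in "GC":
            counts[idx % 3] += 1
    gc1, gc2, gc3 = (100.0 * c / n_codons for c in counts)
    return gc1, gc2, gc3
```

    For the toy sequence ATGGCCGCG (codons ATG, GCC, GCG), two of the three first positions, two of the three second positions and all three third positions are G or C.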

  2. E2F, HSF2, and miR-26 in thyroid carcinoma: bioinformatic analysis of RNA-sequencing data.

    Science.gov (United States)

    Lu, J C; Zhang, Y P

    2016-01-01

    In this study, we examined the molecular mechanism of thyroid carcinoma (THCA) using bioinformatics. RNA-sequencing data of THCA (N = 498) and normal thyroid tissue (N = 59) were downloaded from The Cancer Genome Atlas. Next, gene expression levels were calculated using the TCC package and differentially expressed genes (DEGs) were identified using the edgeR package. A co-expression network was constructed using the EBcoexpress package and visualized by Cytoscape, and functional and pathway enrichment of DEGs in the co-expression network was analyzed with DAVID and KOBAS 2.0. Moreover, modules in the co-expression network were identified and annotated using MCODE and BiNGO plugins. Small-molecule drugs were analyzed using the cMAP database, and miRNAs and transcription factors regulating DEGs were identified by WebGestalt. A total of 254 up-regulated and 59 down-regulated DEGs were identified between THCA samples and controls. DEGs enriched in biological process terms were related to cell adhesion, death, and growth and negatively correlated with various small-molecule drugs. The co-expression network of the DEGs consisted of hub genes (ITGA3, TIMP1, KRT19, and SERPINA1) and one module (JUN, FOSB, and EGR1). Furthermore, 5 miRNAs and 5 transcription factors were identified, including E2F, HSF2, and miR-26. miR-26 may participate in THCA by targeting CITED1 and PLA2R1; E2F may participate in THCA by regulating ITGA3, TIMP1, KRT19, EGR1, and JUN; HSF2 may be involved in THCA development by regulating SERPINA1 and FOSB; and small-molecule drugs may have anti-THCA effects. Our results provide novel directions for mechanistic studies and drug design of THCA. PMID:26985959

  3. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http......://www.secondarymetabolites.org) is introduced to provide a one-stop catalog and links to these bioinformatics resources. In addition, an outlook is presented how the existing tools and those to be developed will influence synthetic biology approaches in the natural products field....

  4. The Alcohol Dehydrogenase Gene Family in Melon (Cucumis melo L.): Bioinformatic Analysis and Expression Patterns

    Directory of Open Access Journals (Sweden)

    Yazhong Jin

    2016-05-01

    Full Text Available Alcohol dehydrogenases (ADH), encoded by a multigene family in plants, play a critical role in plant growth, development, adaptation, fruit ripening and aroma production. Thirteen ADH genes were identified in the melon genome, including 12 ADHs and one formaldehyde dehydrogenase (FDH), designated CmADH1-12 and CmFDH1, of which CmADH1 and CmADH2 had previously been isolated in cantaloupe. The ADH genes shared low identity with each other at the protein level and had different intron-exon structures at the nucleotide level. No typical signal peptides were found in any CmADH, and CmADH proteins may be located in the cytoplasm. The phylogenetic tree revealed that the 13 ADH genes were divided into three groups, namely the long-, medium- and short-chain ADH subfamilies; CmADH1 and CmADH3-11, which belong to the medium-chain ADH subfamily, fell into six medium-chain ADH subgroups. CmADH12 may belong to the long-chain ADH subfamily, while CmFDH1 may be a Class III ADH and serve as an ancestral ADH in melon. Expression profiling revealed that CmADH1, CmADH2, CmADH10 and CmFDH1 were moderately or strongly expressed in different vegetative tissues and in fruit at medium and late developmental stages, while CmADH8 and CmADH12 were highly expressed in fruit after 20 days. CmADH3 showed preferential expression in young tissues. CmADH4 was only slightly expressed in root. Promoter analysis revealed several motifs in CmADH genes involved in gene expression modulated by various hormones, and the response patterns of CmADH genes to ABA, IAA and ethylene were different. These CmADHs were divided into ethylene-sensitive and -insensitive groups, and the functions of CmADHs were discussed.

  5. Cancer bioinformatics: detection of chromatin states, SNP-containing motifs, and functional enrichment modules

    Institute of Scientific and Technical Information of China (English)

    Xiaobo Zhou

    2013-01-01

    In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  6. Cancer bioinformatics: detection of chromatin states, SNP-containing motifs, and functional enrichment modules

    Directory of Open Access Journals (Sweden)

    Xiaobo Zhou

    2013-04-01

    Full Text Available In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  7. Analysis of Ultra-Deep Pyrosequencing and Cloning Based Sequencing of the Basic Core Promoter/Precore/Core Region of Hepatitis B Virus Using Newly Developed Bioinformatics Tools

    Science.gov (United States)

    Yousif, Mukhlid; Bell, Trevor G.; Mudawi, Hatim; Glebe, Dieter; Kramvis, Anna

    2014-01-01

    Aims: The aims of this study were to develop bioinformatics tools to explore ultra-deep pyrosequencing (UDPS) data, to test these tools, to use them to determine the optimum error threshold, and to compare results from UDPS and cloning based sequencing (CBS). Methods: Four serum samples, infected with either genotype D or E, from HBeAg-positive and HBeAg-negative patients were randomly selected. UDPS and CBS were used to sequence the basic core promoter/precore region of HBV. Two online bioinformatics tools, the “Deep Threshold Tool” and the “Rosetta Tool” (http://hvdr.bioinf.wits.ac.za/tools/), were built to test and analyze the generated data. Results: A total of 10952 reads were generated by UDPS on the 454 GS Junior platform. In the four samples, substitutions, detected at 0.5% threshold or above, were identified at 39 unique positions, 25 of which were non-synonymous mutations. Sample #2 (HBeAg-negative, genotype D) had substitutions in 26 positions, followed by sample #1 (HBeAg-negative, genotype E) in 12 positions, sample #3 (HBeAg-positive, genotype D) in 7 positions and sample #4 (HBeAg-positive, genotype E) in only four positions. The ratio of nucleotide substitutions between isolates from HBeAg-negative and HBeAg-positive patients was 3.5:1. Compared to genotype E isolates, genotype D isolates showed greater variation in the X, basic core promoter/precore and core regions. Only 18 of the 39 positions identified by UDPS were detected by CBS, which detected 14 of the 25 non-synonymous mutations detected by UDPS. Conclusion: UDPS data should be approached with caution. Appropriate curation of read data is required prior to analysis, in order to clean the data and eliminate artefacts. CBS detected fewer than 50% of the substitutions detected by UDPS. Furthermore, it is important that the appropriate consensus (reference) sequence is used in order to identify variants correctly. PMID:24740330
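
    The idea of reporting substitutions only at or above a frequency threshold (0.5% in this abstract) can be sketched as follows. This is not the authors' "Deep Threshold Tool"; it is a minimal illustration that assumes the reads have already been aligned to the reference and padded to its length, with any character other than A, C, G or T (for example a gap "-") ignored. The function name and data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def substitutions_above_threshold(
    reference: str,
    reads: List[str],
    threshold: float = 0.005,  # 0.5%, the threshold quoted in the abstract
) -> Dict[int, Dict[str, float]]:
    """Map 0-based position -> {substituted base: frequency}.

    Assumption: every read is pre-aligned and has the same length as
    `reference`; non-ACGT characters do not count toward depth.
    """
    hits: Dict[int, Dict[str, float]] = {}
    for pos, ref_base in enumerate(reference):
        column = Counter(r[pos] for r in reads if r[pos] in "ACGT")
        depth = sum(column.values())
        if depth == 0:
            continue
        for base, count in column.items():
            freq = count / depth
            if base != ref_base and freq >= threshold:
                hits.setdefault(pos, {})[base] = freq
    return hits
```

    With 1000 aligned reads, a substitution seen in 9 reads (0.9%) would be reported, while one seen in a single read (0.1%) would be treated as below the threshold.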

  8. Analysis of ultra-deep pyrosequencing and cloning based sequencing of the basic core promoter/precore/core region of hepatitis B virus using newly developed bioinformatics tools.

    Directory of Open Access Journals (Sweden)

    Mukhlid Yousif

    Full Text Available AIMS: The aims of this study were to develop bioinformatics tools to explore ultra-deep pyrosequencing (UDPS) data, to test these tools, to use them to determine the optimum error threshold, and to compare results from UDPS and cloning based sequencing (CBS). METHODS: Four serum samples, infected with either genotype D or E, from HBeAg-positive and HBeAg-negative patients were randomly selected. UDPS and CBS were used to sequence the basic core promoter/precore region of HBV. Two online bioinformatics tools, the "Deep Threshold Tool" and the "Rosetta Tool" (http://hvdr.bioinf.wits.ac.za/tools/), were built to test and analyze the generated data. RESULTS: A total of 10952 reads were generated by UDPS on the 454 GS Junior platform. In the four samples, substitutions, detected at 0.5% threshold or above, were identified at 39 unique positions, 25 of which were non-synonymous mutations. Sample #2 (HBeAg-negative, genotype D) had substitutions in 26 positions, followed by sample #1 (HBeAg-negative, genotype E) in 12 positions, sample #3 (HBeAg-positive, genotype D) in 7 positions and sample #4 (HBeAg-positive, genotype E) in only four positions. The ratio of nucleotide substitutions between isolates from HBeAg-negative and HBeAg-positive patients was 3.5:1. Compared to genotype E isolates, genotype D isolates showed greater variation in the X, basic core promoter/precore and core regions. Only 18 of the 39 positions identified by UDPS were detected by CBS, which detected 14 of the 25 non-synonymous mutations detected by UDPS. CONCLUSION: UDPS data should be approached with caution. Appropriate curation of read data is required prior to analysis, in order to clean the data and eliminate artefacts. CBS detected fewer than 50% of the substitutions detected by UDPS. Furthermore, it is important that the appropriate consensus (reference) sequence is used in order to identify variants correctly.

  9. Identification of candidate genes and mutations in QTL regions for chicken growth using bioinformatic analysis of NGS and SNP-chip data

    Directory of Open Access Journals (Sweden)

    Muhammad Ahsan

    2013-11-01

    Full Text Available Mapping of chromosomal regions harboring genetic polymorphisms that regulate complex traits is usually followed by a search for the causative mutations underlying the observed effects. This is often a challenging task even after fine mapping, as millions of base pairs including many genes will typically need to be investigated. Thus, to trace the causative mutation(s), there is a great need for efficient bioinformatic strategies. Here, we searched for genes and mutations regulating growth in the Virginia chicken lines, an experimental population comprising two lines that have been divergently selected for body weight at 56 days for more than 50 generations. Several QTL regions have been mapped in an F2 intercross between the lines, and the regions have subsequently been replicated and fine mapped using an Advanced Intercross Line. We have further analyzed the QTL regions where the largest genetic divergence between the High-Weight selected (HWS) and Low-Weight selected (LWS) lines was observed. Such regions, covering about 37% of the actual QTL regions, were identified by comparing the allele frequencies of the HWS and LWS lines using both individual 60K SNP chip genotyping of birds and analysis of read proportions from genome resequencing of DNA pools. Based on a combination of criteria including significance of the QTL, allele frequency difference of identified mutations between the selected lines, gene information on relevance for growth, and the predicted functional effects of identified mutations, we propose here a subset of candidate mutations of highest priority for further evaluation in functional studies. The candidate mutations were identified within the GCG, IGFBP2, GRB14, CRIM1, FGF16, VEGFR-2, ALG11, EDN1, SNX6 and BIRC7 genes. We believe that the proposed method of combining different types of genomic information increases the probability that the genes underlying the observed QTL effects are represented among the candidate mutations.
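
    The comparison of allele frequencies between the two selected lines can be sketched in a few lines. This is not the authors' method; it is a minimal illustration, assuming per-bird SNP-chip genotypes coded as the number of alternate alleles (0, 1 or 2, with None for a missing call). The function names and the 0.8 cutoff are hypothetical and do not come from the abstract.

```python
from typing import Dict, List, Optional

Genotypes = List[Optional[int]]  # 0, 1 or 2 alternate alleles; None = no call


def allele_frequency(genotypes: Genotypes) -> float:
    """Alternate-allele frequency among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(called) / (2 * len(called))


def divergent_snps(
    hws: Dict[str, Genotypes],
    lws: Dict[str, Genotypes],
    min_diff: float = 0.8,  # hypothetical cutoff, for illustration only
) -> Dict[str, float]:
    """SNPs whose allele frequency differs by at least `min_diff`
    between the high-weight (HWS) and low-weight (LWS) lines."""
    result: Dict[str, float] = {}
    for snp in hws.keys() & lws.keys():
        diff = abs(allele_frequency(hws[snp]) - allele_frequency(lws[snp]))
        if diff >= min_diff:
            result[snp] = diff
    return result
```

    In the study, allele frequency difference was only one of several criteria; QTL significance, gene relevance for growth and predicted functional effect were combined with it, and those steps are not modeled here.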

  10. Prenatal alcohol exposure alters gene expression in the rat brain: Experimental design and bioinformatic analysis of microarray data

    Directory of Open Access Journals (Sweden)

    Alexandre A. Lussier

    2015-09-01

    Full Text Available We previously identified gene expression changes in the prefrontal cortex and hippocampus of rats prenatally exposed to alcohol under both steady-state and challenge conditions (Lussier et al., 2015, Alcohol.: Clin. Exp. Res., 39, 251–261). In this study, adult female rats from three prenatal treatment groups (ad libitum-fed control, pair-fed, and ethanol-fed) were injected with physiological saline solution or complete Freund's adjuvant (CFA) to induce arthritis (adjuvant-induced arthritis, AA). The prefrontal cortex and hippocampus were collected 16 days (peak of arthritis) or 39 days (during recovery) following injection, and whole genome gene expression was assayed using Illumina's RatRef-12 expression microarray. Here, we provide additional metadata, detailed explanations of data pre-processing steps and quality control, as well as a basic framework for the bioinformatic analyses performed. The datasets from this study are publicly available on the GEO repository (accession number GSE63561).

  11. Genomic and Bioinformatics Analysis of HAdV-4, a Human Adenovirus Causing Acute Respiratory Disease: Implications for Gene Therapy and Vaccine Vector Development

    OpenAIRE

    Purkayastha, Anjan; Ditty, Susan E.; Su, Jing; McGraw, John; Hadfield, Ted L.; Tibbetts, Clark; Seto, Donald

    2005-01-01

    Human adenovirus serotype 4 (HAdV-4) is a reemerging viral pathogenic agent implicated in epidemic outbreaks of acute respiratory disease (ARD). This report presents a genomic and bioinformatics analysis of the prototype 35,990-nucleotide genome (GenBank accession no. AY594253). Intriguingly, the genome analysis suggests a closer phylogenetic relationship with the chimpanzee adenoviruses (simian adenoviruses) rather than with other human adenoviruses, suggesting a recent origin of HAdV-4, and...

  12. Factor analysis identifies subgroups of constipation

    Institute of Scientific and Technical Information of China (English)

    Philip G Dinning; Mike Jones; Linda Hunt; Sergio E Fuentealba; Jamshid Kalanter; Denis W King; David Z Lubowski; Nicholas J Talley; Ian J Cook

    2011-01-01

    AIM: To determine whether distinct symptom groupings exist in a constipated population and whether such grouping might correlate with quantifiable pathophysiological measures of colonic dysfunction. METHODS: One hundred and ninety-one patients presenting to a Gastroenterology clinic with constipation and 32 constipated patients responding to a newspaper advertisement completed a 53-item, wide-ranging self-report questionnaire. One hundred of these patients had colonic transit measured scintigraphically. Factor analysis determined whether constipation-related symptoms grouped into distinct aspects of symptomatology. Cluster analysis was used to determine whether individual patients naturally group into distinct subtypes. RESULTS: Cluster analysis yielded a 4 cluster solution with the presence or absence of pain and laxative unresponsiveness providing the main descriptors. Amongst all clusters there was a considerable proportion of patients with demonstrable delayed colon transit, irritable bowel syndrome positive criteria and regular stool frequency. The majority of patients with these characteristics also reported regular laxative use. CONCLUSION: Factor analysis identified four constipation subgroups, based on severity and laxative unresponsiveness, in a constipated population. However, clear stratification into clinically identifiable groups remains imprecise.

  13. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics

    NARCIS (Netherlands)

    Veneman, W.J.; De Sonneville, J.; Van der Kolk, K.J.; Ordas, A.; Al-Ars, Z.; Meijer, A.H.; Spaink, M.P.

    2014-01-01

    We present a RNA deep sequencing (RNAseq) analysis of a comparison of the transcriptome responses to infection of zebrafish larvae with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our developed GeneTiles software can improve RNAseq analysis approaches by more confident

  14. Why plant volatile analysis needs bioinformatics - Detecting signal from noise in increasingly complex profiles

    NARCIS (Netherlands)

    Van Dam, N.M.; Poppy, G.P.

    2008-01-01

    Plant volatile analysis may be the oldest form of what now is called plant “metabolomic” analysis. A wide array of volatile organic compounds (VOCs), such as alkanes, alcohols, isoprenoids, and esters, can be collected simultaneously from the plant headspace, either within the laboratory or in the f

  15. Visualising "Junk" DNA through Bioinformatics

    Science.gov (United States)

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  16. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  17. Bioinformatics Analysis for the Antirheumatic Effects of Huang-Lian-Jie-Du-Tang from a Network Perspective

    Directory of Open Access Journals (Sweden)

    Haiyang Fang

    2013-01-01

    Full Text Available Huang-Lian-Jie-Du-Tang (HLJDT) is a classic TCM formula to clear “heat” and “poison” that exhibits antirheumatic activity. Here we investigated the therapeutic mechanisms of HLJDT at the protein network level using a bioinformatics approach. It was found that HLJDT shares 5 target proteins with 3 types of anti-RA drugs, and several pathways in the immune system and bone formation are significantly regulated by HLJDT’s components, suggesting a therapeutic effect of HLJDT on RA. By defining an antirheumatic effect score to quantitatively measure the therapeutic effect, we found that the score of each individual HLJDT component is very low, while HLJDT as a whole achieves a much higher effect score, suggesting a synergistic effect of HLJDT achieved by its multiple components acting on multiple targets. Finally, topological analysis of the RA-associated PPI network was conducted to illustrate the key roles of HLJDT’s target proteins in this network. Integrating our findings with TCM theory suggests that HLJDT targets hub nodes and the main pathway in the Hot ZENG network, and thus it could be applied as an adjuvant treatment for Hot-ZENG-related RA. This study may facilitate our understanding of the antirheumatic effect of HLJDT, and it may suggest a new approach for the study of TCM pharmacology.

  18. Ready to use bioinformatics analysis as a tool to predict immobilisation strategies for protein direct electron transfer (DET).

    Science.gov (United States)

    Cazelles, R; Lalaoui, N; Hartmann, T; Leimkühler, S; Wollenberger, U; Antonietti, M; Cosnier, S

    2016-11-15

    Direct electron transfer (DET) to proteins is of considerable interest for the development of biosensors and bioelectrocatalysts. While protein structure is mainly used as a method of attaching the protein to the electrode surface, we employed bioinformatics analysis to predict the suitable orientation of the enzymes to promote DET. Structure similarity and secondary structure prediction were combined underlying localized amino-acids able to direct one of the enzyme's electron relays toward the electrode surface by creating a suitable bioelectrocatalytic nanostructure. The electro-polymerization of pyrene pyrrole onto a fluorine-doped tin oxide (FTO) electrode allowed the targeted orientation of the formate dehydrogenase enzyme from Rhodobacter capsulatus (RcFDH) by means of hydrophobic interactions. Its electron relays were directed to the FTO surface, thus promoting DET. The reduction of nicotinamide adenine dinucleotide (NAD(+)) generating a maximum current density of 1 μA cm(-2) with 10 mM NAD(+) leads to a turnover number of 0.09 electron/s/mol RcFDH. This work represents a practical approach to evaluate electrode surface modification strategies in order to create valuable bioelectrocatalysts.

  19. A bioinformatics analysis of Lamin-A regulatory network: a perspective on epigenetic involvement in Hutchinson-Gilford progeria syndrome.

    Science.gov (United States)

    Arancio, Walter

    2012-04-01

    Hutchinson-Gilford progeria syndrome (HGPS) is a rare human genetic disease that leads to premature aging. HGPS is caused by a mutation in the Lamin-A (LMNA) gene that leads, in affected young individuals, to the accumulation of the progerin protein, usually present only in aging differentiated cells. Bioinformatics analyses of the network of interactions of the LMNA gene and transcripts are presented. The LMNA gene network has been analyzed using the BioGRID database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA (http://genemania.org/). The network of interactions of LMNA transcripts has been further analyzed following the competing endogenous RNA (ceRNA) hypothesis (RNA cross-talk via microRNAs [miRNAs]) and using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest particular relevance of epigenetic modifiers (via acetylase complexes and specifically HTATIP histone acetylase) and adenosine triphosphate (ATP)-dependent chromatin remodelers (via pBAF, BAF, and SWI/SNF complexes). PMID:22533413

  20. Ready to use bioinformatics analysis as a tool to predict immobilisation strategies for protein direct electron transfer (DET).

    Science.gov (United States)

    Cazelles, R; Lalaoui, N; Hartmann, T; Leimkühler, S; Wollenberger, U; Antonietti, M; Cosnier, S

    2016-11-15

    Direct electron transfer (DET) to proteins is of considerable interest for the development of biosensors and bioelectrocatalysts. While protein structure is mainly used as a method of attaching the protein to the electrode surface, we employed bioinformatics analysis to predict the suitable orientation of the enzymes to promote DET. Structure similarity and secondary structure prediction were combined to highlight localized amino acids able to direct one of the enzyme's electron relays toward the electrode surface by creating a suitable bioelectrocatalytic nanostructure. The electro-polymerization of pyrene pyrrole onto a fluorine-doped tin oxide (FTO) electrode allowed the targeted orientation of the formate dehydrogenase enzyme from Rhodobacter capsulatus (RcFDH) by means of hydrophobic interactions. Its electron relays were directed to the FTO surface, thus promoting DET. The reduction of nicotinamide adenine dinucleotide (NAD(+)), generating a maximum current density of 1 μA cm(-2) with 10 mM NAD(+), leads to a turnover number of 0.09 electron/s/mol RcFDH. This work represents a practical approach to evaluate electrode surface modification strategies in order to create valuable bioelectrocatalysts. PMID:27156017
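
    The two figures quoted in this abstract (current density and turnover number) can be related through Faraday's constant. The sketch below is only an illustration: it assumes the turnover is defined as current density divided by (Faraday constant × electroactive enzyme coverage) and reads the reported unit as electrons per second per enzyme. The coverage it prints is back-calculated under that assumption and is not a value reported in the abstract; the function and variable names are hypothetical.

```python
FARADAY = 96485.33212  # coulombs per mole of electrons

def electrons_per_enzyme_per_second(current_density, coverage):
    """Turnover as j / (F * coverage).

    current_density is in A cm^-2; coverage is mol of electroactive enzyme per cm^2.
    """
    return current_density / (FARADAY * coverage)

def implied_coverage(current_density, turnover):
    """Coverage (mol cm^-2) that would reconcile a current density with a turnover."""
    return current_density / (FARADAY * turnover)

j = 1e-6   # 1 uA cm^-2, quoted in the abstract
k = 0.09   # electrons per second per enzyme, quoted in the abstract

# Back-calculated under the stated assumption; NOT a value from the abstract.
gamma = implied_coverage(j, k)
print(f"{gamma:.3e} mol cm^-2")
```

    If the authors measured the coverage independently, the same relation could be used in the forward direction with electrons_per_enzyme_per_second.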

  1. Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis

    OpenAIRE

    Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M. Margaret; Butte, Atul J; Manley, Geoffrey T.

    2010-01-01

    Introduction Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enabl...
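
    The abstract above is truncated, so the linkage rule, distance metric, and variables used in the study are not visible here. As a rough illustration of hierarchical (agglomerative) cluster analysis in general, the following sketch groups a few invented physiological observations using average linkage and Euclidean distance; all data values and names are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Invented observations: (heart rate, mean arterial pressure, tissue oxygen).
observations = [
    (72, 90, 40), (75, 88, 42), (70, 92, 41),
    (120, 60, 15), (125, 58, 14), (118, 62, 17),
]
states = agglomerate(observations, n_clusters=2)
print(states)  # [[0, 1, 2], [3, 4, 5]]
```

    With real monitoring data the variables would normally be standardized first, since raw units otherwise dominate the distance.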

  2. Deep Learning in Bioinformatics

    OpenAIRE

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-01-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current res...

  3. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis

    Science.gov (United States)

    Alkhalili, Rawana N.; Bernfur, Katja; Dishisha, Tarek; Mamo, Gashaw; Schelin, Jenny; Canbäck, Björn; Emanuelsson, Cecilia; Hatti-Kaul, Rajni

    2016-01-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in 15–20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of antimicrobial enzymes, amidase and dd-carboxypeptidase. PMID:27548162

  4. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Rawana N. Alkhalili

    2016-08-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in 15–20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of antimicrobial enzymes, amidase and dd-carboxypeptidase.

  5. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis.

    Science.gov (United States)

    Alkhalili, Rawana N; Bernfur, Katja; Dishisha, Tarek; Mamo, Gashaw; Schelin, Jenny; Canbäck, Björn; Emanuelsson, Cecilia; Hatti-Kaul, Rajni

    2016-01-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in 15-20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of antimicrobial enzymes, amidase, and dd-carboxypeptidase. PMID:27548162

  6. Identification and bioinformatics analysis of microRNAs from the sporophyte and gametophyte of Pyropia haitanensis

    Science.gov (United States)

    Huang, Aiyou; Wang, Guangce

    2016-05-01

    Pyropia haitanensis (T. J. Chang et B. F. Zheng) N. Kikuchi et M. Miyata (Porphyra haitanensis) is an economically important species that is cultured widely in China. P. haitanensis is cultured on a larger scale than Pyropia yezoensis, making up an important part of the total production of cultivated Pyropia in China. However, the majority of molecular mechanisms underlying the physiological processes of P. haitanensis remain unknown. P. haitanensis could utilize inorganic carbon and the sporophytes of P. haitanensis might possess a PCK-type C4-like carbon-fixation pathway. To identify microRNAs and their probable roles in sporophyte and gametophyte development, we constructed and sequenced small RNA libraries from sporophytes and gametophytes of P. haitanensis. Five microRNAs were identified that shared no sequence homology with known microRNAs. Our results indicated that P. haitanensis might possess a complex sRNA processing system in which the novel microRNAs act as important regulators of the development of different generations of P. haitanensis.

  7. Cloning, identification, and bioinformatics analysis of a putative aquaporin TsAQP from Trichinella spiralis.

    Science.gov (United States)

    Cui, J M; Zhang, N Z; Li, W H; Yan, H B; Fu, B Q

    2015-01-01

    Vaccination as a preventative strategy against Trichinella spiralis infection is an ongoing effort, although no ideal vaccine candidates have been identified until now. Identification of more effective antigens that have a role in essential life stages of the parasite and that may be effective vaccine candidates is therefore of importance. In the present study, we identified a novel aquaporin gene (TsAQP) from T. spiralis, and the potential antigenicity of TsAQP was evaluated by epitope prediction. A total of 11 post-translational modification sites were predicted in the protein and fell into 4 categories: N-glycosylation; casein kinase II phosphorylation; protein kinase C phosphorylation; and N-myristoylation sites. TsAQP is a membrane intrinsic protein with high hydrophobicity; the main hydrophobic domains comprised up to 38.5% of the protein and were distributed at amino acid positions 21-43, 54-71, 83-91, 107-121, 163-174, 187-200, and 242-261. The protein consisted mainly of helices (39.58%) and loops (50%). The advanced structure of TsAQP was predicted using homology modeling, which showed that the protein was formed from 6 membrane-spanning domains connected by 5 loops. Based on these analyses, 6 potential B-cell epitopes and 4 potential T-cell epitopes were further predicted. These results suggest that TsAQP could be a promising antigen candidate for vaccination against T. spiralis. PMID:26505421
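
    The hydrophobic-domain figures in this abstract can be cross-checked with simple arithmetic. The sketch below sums the reported segment lengths and, assuming the 38.5% refers to hydrophobic residues divided by full protein length (an interpretation, not something the abstract states), estimates the sequence length that would be consistent with it. Names are hypothetical.

```python
# Hydrophobic segments reported in the abstract (1-based, inclusive positions).
segments = [(21, 43), (54, 71), (83, 91), (107, 121),
            (163, 174), (187, 200), (242, 261)]

def covered_residues(spans):
    """Total residues covered by non-overlapping inclusive spans."""
    return sum(end - start + 1 for start, end in spans)

hydrophobic = covered_residues(segments)
reported_fraction = 0.385  # "up to 38.5%" in the abstract

# Assumption: the percentage is hydrophobic residues / total residues.
implied_length = round(hydrophobic / reported_fraction)
print(hydrophobic, implied_length)  # 111 288
```

    The implied length is only a consistency check on the quoted numbers; the actual length of TsAQP should be taken from the sequence record.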

  8. The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines

    Directory of Open Access Journals (Sweden)

    Tatjana P Kristensen

    2014-03-01

    The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural

  9. Bioinformatics analysis of the molecular mechanism of diffuse intrinsic pontine glioma

    Science.gov (United States)

    Deng, Lei; Xiong, Pengju; Luo, Yunhui; Bu, Xiao; Qian, Suokai; Zhong, Wuzhao

    2016-01-01

    The present study aimed to elucidate key molecular mechanisms in the progression of diffuse intrinsic pontine glioma (DIPG). The gene expression profile GSE50021, which consisted of 35 pediatric DIPG samples and 10 normal brain samples, was downloaded from the Gene Expression Omnibus database. The differentially-expressed genes (DEGs) in the pediatric DIPG samples were identified. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome pathways of DEGs were enriched and analyzed. The protein-protein interaction (PPI) network of the DEGs was constructed and functional modules of the PPI network were disclosed using ClusterONE. A total of 679 DEGs (454 up- and 225 downregulated) were identified in the pediatric DIPG samples. DEGs were significantly enriched in various GO terms, and KEGG and Reactome pathways. The PPI network of upregulated (153 nodes and 298 connections) and downregulated (71 nodes and 124 connections) DEGs, and two crucial modules, were obtained. Downregulated genes in module 2, such as cholecystokinin (CCK), gastrin (GAST), adenylate cyclase 2 (brain) (ADCY2) and 5-hydroxytryptamine (serotonin) receptor 7 (HTR7), were significantly enriched in the calcium signaling pathway, the neuroactive ligand-receptor interaction pathway and in GO terms, such as the G-protein coupled receptor (GPCR) signaling pathway, while upregulated genes in module 1 were not enriched in any pathways or GO terms. CCK and GAST associated with the GPCR signaling pathway, HTR7 enriched in the neuroactive ligand-receptor interaction, and ADCY2 and HTR7 involved in the calcium signaling pathway may be key mechanisms playing crucial roles in the development and progression of DIPG.
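
    The abstract reports that DEGs were "significantly enriched" in GO terms and pathways without naming the test. A common choice for such over-representation analysis is the hypergeometric upper-tail test, sketched below as an assumption rather than the authors' method. Only the DEG counts (679, 454, 225) come from the abstract; the background size, pathway size, and overlap are invented for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Counts quoted in the abstract.
up, down, total = 454, 225, 679
assert up + down == total

# Hypothetical values, NOT from the study.
N = 20000  # background genes measured on the array
K = 180    # genes annotated to one pathway
k = 9      # downregulated DEGs that fall in that pathway

p = hypergeom_upper_tail(k, K, down, N)
print(f"p = {p:.3g}")
```

    In practice the p-values for all tested terms would also be corrected for multiple testing before calling a pathway enriched.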

  10. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    Science.gov (United States)

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety.

  11. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    Science.gov (United States)

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety. PMID:26551289

  12. BIOINFORMATICS AND BIOSYNTHESIS ANALYSIS OF CELLULOSE SYNTHASE OPERON IN ZYMOMONAS MOBILIS ZM4

    OpenAIRE

    Sheik Abdul Kader Sheik Asraf, K. Narayanan Rajnish, and Paramasamy Gunasekaran

    2011-01-01

    Biosynthesis of cellulose has been reported in many species of bacteria. The genes encoding cellulose biosynthetic enzymes of Z. mobilis have not been studied so far. Preliminary sequence analysis of the Z. mobilis ZM4 genome revealed the presence of a cellulose synthase operon comprised of Open Reading Frames (ORFs) ZMO01083 (bcsA), ZMO1084 (bcsB) and ZMO1085 (bcsC). The first gene of the operon bcsA encodes the cellulose synthase catalytic subunit BcsA. The second gene of the operon bcsB en...

  13. Bioinformatics analysis of organizational and expressional characterizations of the IFNs, IRFs and CRFBs in grass carp Ctenopharyngodon idella.

    Science.gov (United States)

    Liao, Zhiwei; Wan, Quanyuan; Su, Jianguo

    2016-08-01

    Interferons (IFNs) play crucial roles in the immune defense against viral infection and bacterial invasion. In the present study, we systematically identified and characterized the IFNs, their regulatory factors (Interferon Regulatory Factors, IRFs) and receptors (Cytokine Receptor Family B, CRFBs) in grass carp (Ctenopharyngodon idella). Grass carp IFNs can be classified into type I IFN (IFN-I) and type II IFN (IFN-II), as in other teleosts. IFN-I consists of two groups with two (group I) or four (group II) cysteines in the mature peptide and can be further divided into three subgroups (IFN-a, -c and -d), containing four members in grass carp: IFN1, IFN2, IFN3 and IFN4. IFN-II contains two members: IFNγ2, which is similar to mammalian IFNγ, and a cyprinid-specific IFNγ1 (IFNγ-rel) molecule. mRNA expression analyses of IFNs showed that IFN1 and IFN-II were persistently expressed in many tissues, while other IFN members were transiently expressed in specific tissues and at specific time points. In the immune response, IFN transcription is primarily regulated by multiple IRFs after grass carp reovirus (GCRV) challenge. The IRF family has thirteen members in grass carp, which can be divided into four subfamilies (IRF-1, -3, -4 and -5 subfamily), each of which plays different roles in innate and adaptive immunity via various signaling pathways that interact with IFNs (mainly IFN-I). IFNs have to bind receptors (CRFBs) to perform their functions. The CRFBs, as IFN receptors, comprise six members in grass carp. The structure and expression characterizations of IFNs, IRFs and CRFBs were analyzed using bioinformatics tools. These results may provide basic data for further functional research on the IFN system and a deeper understanding of fish immune mechanisms against virus infection. PMID:27012995

  14. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. © 2012 Dai et al.; licensee BioMed Central Ltd.

  15. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  16. Meta-Analysis of Placental Transcriptome Data Identifies a Novel Molecular Pathway Related to Preeclampsia.

    Science.gov (United States)

    van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B

    2015-01-01

    Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia. PMID:26171964
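
    The abstract states that effect sizes were combined with an inverse-variance random-effects model but does not name the between-study variance estimator. The sketch below uses the DerSimonian-Laird estimator, which is a common default and an assumption here, to pool one gene's effect across studies. The example effect sizes and variances are invented.

```python
def random_effects(effects, variances):
    """DerSimonian-Laird pooled estimate, its variance, and tau^2."""
    if len(effects) < 2:
        raise ValueError("need at least two studies")
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)    # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, 1.0 / sum(w_star), tau2

# Hypothetical per-study effect sizes (e.g. standardized mean differences)
# and their variances for a single gene; NOT data from the meta-analysis.
effects = [0.8, 1.1, 0.4, 0.9]
variances = [0.04, 0.09, 0.05, 0.02]
pooled, pooled_var, tau2 = random_effects(effects, variances)
print(round(pooled, 3), round(pooled_var, 4), round(tau2, 4))
```

    When the studies agree closely, tau^2 is truncated to zero and the result coincides with the fixed-effect inverse-variance estimate.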

  17. Bioinformatic analysis reveals high diversity of bacterial genes for laccase-like enzymes.

    Directory of Open Access Journals (Sweden)

    Luka Ausec

    Fungal laccases have been used in various fields ranging from processes in wood and paper industries to environmental applications. Although a few bacterial laccases have been characterized in recent years, prokaryotes have largely been neglected as a source of novel enzymes, in part due to the lack of knowledge about the diversity and distribution of laccases within Bacteria. In this work genes for laccase-like enzymes were searched for in over 2,200 complete and draft bacterial genomes and four metagenomic datasets, using the custom profile Hidden Markov Models for two- and three-domain laccases. More than 1,200 putative genes for laccase-like enzymes were retrieved from chromosomes and plasmids of diverse bacteria. In 76% of the genes, signal peptides were predicted, indicating that these bacterial laccases may be exported from the cytoplasm, which contrasts with the current belief. Moreover, several examples of putatively horizontally transferred bacterial laccase genes were described. Many metagenomic sequences encoding fragments of laccase-like enzymes could not be phylogenetically assigned, indicating considerable novelty. Laccase-like genes were also found in anaerobic bacteria, autotrophs and alkaliphiles, thus opening new hypotheses regarding their ecological functions. Bacteria identified as carrying laccase genes represent potential sources for future biotechnological applications.

  18. Bioinformatic Analysis Reveals High Diversity of Bacterial Genes for Laccase-Like Enzymes

    Science.gov (United States)

    Ausec, Luka; Zakrzewski, Martha; Goesmann, Alexander; Schlüter, Andreas; Mandic-Mulec, Ines

    2011-01-01

    Fungal laccases have been used in various fields ranging from processes in wood and paper industries to environmental applications. Although a few bacterial laccases have been characterized in recent years, prokaryotes have largely been neglected as a source of novel enzymes, in part due to the lack of knowledge about the diversity and distribution of laccases within Bacteria. In this work genes for laccase-like enzymes were searched for in over 2,200 complete and draft bacterial genomes and four metagenomic datasets, using the custom profile Hidden Markov Models for two- and three-domain laccases. More than 1,200 putative genes for laccase-like enzymes were retrieved from chromosomes and plasmids of diverse bacteria. In 76% of the genes, signal peptides were predicted, indicating that these bacterial laccases may be exported from the cytoplasm, which contrasts with the current belief. Moreover, several examples of putatively horizontally transferred bacterial laccase genes were described. Many metagenomic sequences encoding fragments of laccase-like enzymes could not be phylogenetically assigned, indicating considerable novelty. Laccase-like genes were also found in anaerobic bacteria, autotrophs and alkaliphiles, thus opening new hypotheses regarding their ecological functions. Bacteria identified as carrying laccase genes represent potential sources for future biotechnological applications. PMID:22022440

  19. Bioinformatic analysis of the neprilysin (M13) family of peptidases reveals complex evolutionary and functional relationships

    Directory of Open Access Journals (Sweden)

    Pinney John W

    2008-01-01

    Background: The neprilysin (M13) family of endopeptidases are zinc-metalloenzymes, the majority of which are type II integral membrane proteins. The best characterised of this family is neprilysin, which has important roles in inactivating signalling peptides involved in modulating neuronal activity, blood pressure and the immune system. Other family members include the endothelin converting enzymes (ECE-1 and ECE-2), which are responsible for the final step in the synthesis of potent vasoconstrictor endothelins. The ECEs, as well as neprilysin, are considered valuable therapeutic targets for treating cardiovascular disease. Other members of the M13 family have not been functionally characterised, but are also likely to have biological roles regulating peptide signalling. The recent sequencing of animal genomes has greatly increased the number of M13 family members in protein databases, information which can be used to reveal evolutionary relationships and to gain insight into conserved biological roles. Results: The phylogenetic analysis successfully resolved vertebrate M13 peptidases into seven classes, one of which appears to be specific to mammals, and insect genes into five functional classes and a series of expansions, which may include inactive peptidases. Nematode genes primarily resolved into groups containing no other taxa, bar the two nematode genes associated with Drosophila DmeNEP1 and DmeNEP4. This analysis reconstructed only one relationship between chordate and invertebrate clusters, that of the ECE sub-group and the DmeNEP3 related genes. Analysis of amino acid utilisation in the active site of M13 peptidases reveals a basis for their biochemical properties. A relatively invariant S1' subsite gives the majority of M13 peptidases their strong preference for hydrophobic residues in P1' position. The greater variation in the S2' subsite may be instrumental in determining the specificity of M13 peptidases for their substrates

  20. A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1.

    Science.gov (United States)

    Reisman, Steven; Hatzopoulos, Thomas; Läufer, Konstantin; Thiruvathukal, George K; Putonti, Catherine

    2016-01-01

    As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences, revealing that spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated despite increased globalization. The approach developed here can easily be customized for any species of interest. PMID:26819543

  1. Bioinformatics analysis of breast cancer bone metastasis related gene CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze the breast cancer bone metastasis related gene CXCR4. Methods: This research screened breast cancer bone metastasis related genes using a high-throughput gene chip. Results: The expression of 396 genes was found to differ, including 165 up-regulated and 231 down-regulated genes. The expression of the chemokine receptor CXCR4 was clearly upregulated in tissue with breast cancer bone metastasis. Compared with tissue without bone metastasis, the difference was significant, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a basis for understanding the occurrence and diagnosis of breast cancer bone metastasis, targeted gene therapy and evaluation of prognosis.

  2. Flux Analysis of the Trypanosoma brucei Glycolysis Based on a Multiobjective-Criteria Bioinformatic Approach

    Directory of Open Access Journals (Sweden)

    Amine Ghozlane

    2012-01-01

    Trypanosoma brucei is a protozoan parasite of major interest in discovering new genes for drug targets. This parasite alternates its life cycle between the mammal host(s) (bloodstream form) and the insect vector (procyclic form), with two divergent glucose metabolisms amenable to in vitro culture. While the metabolic network of the bloodstream forms has been well characterized, the flux distribution between the different branches of the glucose metabolic network in the procyclic form has not been addressed so far. We present a computational analysis (called Metaboflux) that exploits the metabolic topology of the procyclic form, and allows the incorporation of multipurpose experimental data to increase the biological relevance of the model. The alternatives resulting from the structural complexity of networks are formulated as an optimization problem solved by a metaheuristic where experimental data are modeled in a multiobjective function. Our results show that the current metabolic model is in agreement with experimental data and confirms the observed high metabolic flexibility of glucose metabolism. In addition, Metaboflux offers a rational explanation for the high flexibility in the ratio between final products from glucose metabolism, that is, flux redistribution through the malic enzyme steps.

  3. Isolation and bioinformatics analysis of differentially methylated genomic fragments in human gastric cancer

    Institute of Scientific and Technical Information of China (English)

    Ai-Jun Liao; Qi Su; Xun Wang; Bin Zeng; Wei Shi

    2008-01-01

    AIM: To isolate and analyze the DNA sequences which are methylated differentially between gastric cancer and normal gastric mucosa. METHODS: The differentially methylated DNA sequences between gastric cancer and normal gastric mucosa were isolated by methylation-sensitive representational difference analysis (MS-RDA). Similarities between the separated fragments and the human genomic DNA were analyzed with Basic Local Alignment Search Tool (BLAST). RESULTS: Three differentially methylated DNA sequences were obtained, two of which have been accepted by GenBank. The accession numbers are AY887106 and AY887107. AY887107 was highly similar to the 11th exon of LOC440683 (98%), 3' end of LOC440887 (99%), and promoter and exon regions of DRD5 (94%). AY887106 was consistent (98%) with a CpG island in ribosomal RNA isolated from colorectal cancer by Minoru Toyota in 1999. CONCLUSION: The methylation degree is different between gastric cancer and normal gastric mucosa. The differentially methylated DNA sequences can be isolated effectively by MS-RDA.

  4. Bioinformatic evaluation of L-arginine catabolic pathways in 24 cyanobacteria and transcriptional analysis of genes encoding enzymes of L-arginine catabolism in the cyanobacterium Synechocystis sp. PCC 6803

    Directory of Open Access Journals (Sweden)

    Pistorius Elfriede K

    2007-11-01

    Background: So far very limited knowledge exists on L-arginine catabolism in cyanobacteria, although six major L-arginine-degrading pathways have been described for prokaryotes. Thus, we have performed a bioinformatic analysis of possible L-arginine-degrading pathways in cyanobacteria. Further, we chose Synechocystis sp. PCC 6803 for a more detailed bioinformatic analysis and for validation of the bioinformatic predictions on L-arginine catabolism with a transcript analysis. Results: We have evaluated 24 cyanobacterial genomes of freshwater or marine strains for the presence of putative L-arginine-degrading enzymes. We identified an L-arginine decarboxylase pathway in all 24 strains. In addition, cyanobacteria have one or two further pathways representing either an arginase pathway or L-arginine deiminase pathway or an L-arginine oxidase/dehydrogenase pathway. An L-arginine amidinotransferase pathway as a major L-arginine-degrading pathway is not likely but cannot be entirely excluded. A rather unusual finding was that the cyanobacterial L-arginine deiminases are substantially larger than the enzymes in non-photosynthetic bacteria and that they are membrane-bound. A more detailed bioinformatic analysis of Synechocystis sp. PCC 6803 revealed that three different L-arginine-degrading pathways may in principle be functional in this cyanobacterium. These are (i) an L-arginine decarboxylase pathway, (ii) an L-arginine deiminase pathway, and (iii) an L-arginine oxidase/dehydrogenase pathway. A transcript analysis of cells grown either with nitrate or L-arginine as sole N-source and with an illumination of 50 μmol photons m⁻² s⁻¹ showed that the transcripts for the first enzyme(s) of all three pathways were present, but that the transcript levels for the L-arginine deiminase and the L-arginine oxidase/dehydrogenase were substantially higher than that of the three isoenzymes of L-arginine decarboxylase. Conclusion: The evaluation of 24

  5. Bioinformatic analysis of microRNA networks following the activation of the constitutive androstane receptor (CAR) in mouse liver.

    Science.gov (United States)

    Hao, Ruixin; Su, Shengzhong; Wan, Yinan; Shen, Frank; Niu, Ben; Coslo, Denise M; Albert, Istvan; Han, Xing; Omiecinski, Curtis J

    2016-09-01

    The constitutive androstane receptor (CAR; NR1I3) is a member of the nuclear receptor superfamily that functions as a xenosensor, serving to regulate xenobiotic detoxification, lipid homeostasis and energy metabolism. CAR activation is also a key contributor to the development of chemical hepatocarcinogenesis in mice. The underlying pathways affected by CAR in these processes are complex and not fully elucidated. MicroRNAs (miRNAs) have emerged as critical modulators of gene expression and appear to impact many cellular pathways, including those involved in chemical detoxification and liver tumor development. In this study, we used deep sequencing approaches with an Illumina HiSeq platform to differentially profile microRNA expression patterns in livers from wild type C57BL/6J mice following CAR activation with the mouse CAR-specific ligand activator, 1,4-bis-[2-(3,5,-dichloropyridyloxy)] benzene (TCPOBOP). Bioinformatic analyses and pathway evaluations were performed leading to the identification of 51 miRNAs whose expression levels were significantly altered by TCPOBOP treatment, including mmu-miR-802-5p and miR-485-3p. Ingenuity Pathway Analysis of the differentially expressed microRNAs revealed altered effector pathways, including those involved in liver cell growth and proliferation. A functional network among CAR targeted genes and the affected microRNAs was constructed to illustrate how CAR modulation of microRNA expression may potentially mediate its biological role in mouse hepatocyte proliferation. This article is part of a Special Issue entitled: Xenobiotic nuclear receptors: New Tricks for An Old Dog, edited by Dr. Wen Xie. PMID:27080131

  7. Integrating subpathway analysis to identify candidate agents for hepatocellular carcinoma.

    Science.gov (United States)

    Wang, Jiye; Li, Mi; Wang, Yun; Liu, Xiaoping

    2016-01-01

    Hepatocellular carcinoma (HCC) is the second most common cause of cancer-associated death worldwide, characterized by a high invasiveness and resistance to normal anticancer treatments. The need to develop new therapeutic agents for HCC is urgent. Here, we developed a bioinformatics method to identify potential novel drugs for HCC by integrating HCC-related and drug-affected subpathways. By using the RNA-seq data from the TCGA (The Cancer Genome Atlas) database, we first identified 1,763 differentially expressed genes between HCC and normal samples. Next, we identified 104 significant HCC-related subpathways. We also identified the subpathways associated with small molecular drugs in the CMap database. Finally, by integrating HCC-related and drug-affected subpathways, we identified 40 novel small molecular drugs capable of targeting these HCC-involved subpathways. In addition to previously reported agents (ie, calmidazolium), our method also identified potentially novel agents for targeting HCC. We experimentally verified that one of these novel agents, prenylamine, induced HCC cell apoptosis using 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, an acridine orange/ethidium bromide stain, and electron microscopy. In addition, we found that prenylamine not only affected several classic apoptosis-related proteins, including Bax, Bcl-2, and cytochrome c, but also increased caspase-3 activity. These candidate small molecular drugs identified by us may provide insights into novel therapeutic approaches for HCC. PMID:27022281

  8. Effect of phosphatidylcholine on the level expression of plc genes of Aspergillus fumigatus by real time PCR method and investigation of these genes using bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Ali Dehghan-Noodeh

    2014-04-01

    Phospholipases are a group of enzymes that break down phosphatidylcholine (phospholipid) molecules, producing secondary products. These products have diverse roles in the cell, such as signal transduction and digestion in humans. In this research the effect of phosphatidylcholine on the expression of plc genes of A. fumigatus was studied. The plc genes of this fungus were also interrogated using bioinformatics studies. Real-time PCR was performed to study the expression of plc genes and these genes were interrogated using bioinformatics studies. There was more significant expression for all three plc genes when A. fumigatus was grown in the presence of phosphatidylcholine in the medium. The sequence of plc genes of A. fumigatus was also interrogated using bioinformatics analysis and their relationship with the other microorganisms was investigated. Real-time PCR revealed that afplc1, afplc2 and afplc3 were up-regulated in the presence of phosphatidylcholine. In this study we suggest either the plc's of A. fumigatus were present in an ancestral genome and have become lost in some lineages, or that they have been acquired from other organisms by horizontal gene transfer. We also found that plc's of this fungus appeared to be more closely related to the plant plc's than the bacterial plc's.

  9. Bioinformatics for Genome Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported essentially equal numbers of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately the inferred tree, they were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the dendrogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders, and as a 16th option, allowed CLUSTALW to generate its own guide tree. Having supplied all 15 possible rooted guide trees, they expected that at least one of them should be as good as CLUSTAL's own guide tree, but most of the time they differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which affect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became increasingly bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold as the authors felt that their other proposed approaches would ultimately be better.
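
    The figure of 15 alignment orders quoted above matches the number of rooted binary trees on 4 labelled taxa, (2n-3)!! = 5 x 3 x 1 = 15. The following is a minimal sketch, not the report's code, that enumerates those topologies as Newick-style strings without branch lengths. The function name and the taxon labels A-D are assumptions made for illustration only.

```python
from itertools import combinations


def rooted_trees(taxa):
    """Enumerate every rooted binary tree topology over `taxa`.

    Trees are returned as Newick-style strings without branch lengths.
    The first taxon is always kept in the left subtree, so each
    unordered split of the taxa is generated exactly once.
    """
    taxa = tuple(taxa)
    if len(taxa) == 1:
        return [taxa[0]]
    first, rest = taxa[0], taxa[1:]
    trees = []
    # k = how many of the remaining taxa join `first` on the left;
    # k stops short of len(rest) so the right side is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(t for t in rest if t not in extra)
            for left_tree in rooted_trees(left):
                for right_tree in rooted_trees(right):
                    trees.append(f"({left_tree},{right_tree})")
    return trees


def double_factorial_count(n):
    """Closed-form count of rooted binary trees on n labelled taxa: (2n-3)!!"""
    count = 1
    for odd in range(3, 2 * n - 2, 2):
        count *= odd
    return count


if __name__ == "__main__":
    # Hypothetical taxon labels; the report does not name the four taxa here.
    for tree in rooted_trees("ABCD"):
        print(tree + ";")
```

    Each printed line is one candidate guide-tree topology. Because no branch lengths are attached, the sketch reflects the same limitation the abstract attributes to user-supplied trees.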

  10. Characterizing the porcine transcriptional regulatory response to infection by Salmonella: identifying putative new NFkB direct targets through comparative bioinformatics.

    Science.gov (United States)

    We have collected data on host response to infection from RNA prepared from mesenteric lymph node of swine infected with either Salmonella enterica serovar Typhimurium (ST) or S. Choleraesuis (SC) using the porcine Affymetrix GeneChip. We identified 848 (ST) and 1,853 (SC) genes with statistical evi...

  11. Mapping the Transcriptome-Wide Landscape of RBP Binding Sites Using gPAR-CLIP-seq: Bioinformatic Analysis.

    Science.gov (United States)

    Freeberg, Mallory A; Kim, John K

    2016-01-01

    Protein-RNA interactions are integral components of posttranscriptional gene regulatory processes including mRNA processing and assembly of cellular architectures. Dysregulation of RNA-binding protein (RBP) expression or disruptions in RBP-RNA interactions underlie a variety of human pathologies and genetic diseases including cancer and neurodegenerative diseases (reviewed in (Cooper et al., Cell 136(4):777-793, 2009; Darnell, Cancer Res Treat 42(3):125-129, 2010; Lukong et al., Trends Genet 24 (8):416-425, 2008)). Recent studies have uncovered only a small proportion of the extensive RBP-RNA interactome in any organism (Baltz et al., Mol Cell 46(5):674-690, 2012; Castello et al., Cell 149(6):1393-1406, 2012; Freeberg et al., Genome Biol 14(2):R13, 2013; Hogan et al., PLoS Biol 6(10):e255, 2008; Mitchell et al., Nat Struct Mol Biol 20(1):127-133, 2013; Tsvetanova et al. PLoS One 5(9): pii: e12671, 2010; Schueler et al., Genome Biol 15(1):R15, 2014; Silverman et al., Genome Biol 15(1):R3, 2014). To expand our understanding of how RBP-RNA interactions govern RNA-related processes, we developed gPAR-CLIP-seq (global photoactivatable-ribonucleoside-enhanced cross-linking and precipitation followed by deep sequencing) for capturing and sequencing all regions of the Saccharomyces cerevisiae transcriptome bound by RBPs (Freeberg et al., Genome Biol 14(2):R13, 2013). This chapter describes a pipeline for bioinformatic analysis of gPAR-CLIP-seq data. The first half of this pipeline can be implemented by running locally installed programs or by running the programs using the Galaxy platform (Blankenberg et al., Curr Protoc Mol Biol. Chapter 19:Unit 19 10 11-21, 2010; Giardine et al., Genome Res 15 (10):1451-1455, 2005; Goecks et al., Genome Biol 11(8):R86, 2010). The second half of this pipeline can be implemented by user-generated code in any language using the pseudocode provided as a template. PMID:26483018

  12. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    Directory of Open Access Journals (Sweden)

    Wang Chen

    2012-03-01

    Background: MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr., a wild grapevine species. Results: A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. Conclusions: Deep sequencing of short RNAs from Amur grape flowers and berries identified 72

  13. Chemistry in Bioinformatics

    OpenAIRE

    Mitchell John; Murray-Rust Peter; Rzepa Henry

    2005-01-01

    Chemical information is now seen as critical for most areas of life sciences. But unlike Bioinformatics, where data is openly available and freely re-usable, most chemical information is closed and cannot be re-distributed without permission. This has led to a failure to adopt modern informatics and software techniques and therefore a paucity of chemistry in bioinformatics. New technology, however, offers the hope of making chemical data (compounds and properties) free during the auth...

  14. Integrative Bioinformatics for Genomics and Proteomics

    OpenAIRE

    Wu, C.H.

    2011-01-01

    Systems integration is becoming the driving force for 21st century biology. Researchers are systematically tackling gene functions and complex regulatory processes by studying organisms at different levels of organization, from genomes and transcriptomes to proteomes and interactomes. To fully realize the value of such high-throughput data requires advanced bioinformatics for integration, mining, comparative analysis, and functional interpretation. We are developing a bioinformatics research ...

  15. SNPTrack™: an integrated bioinformatics system for genetic association studies.

    Science.gov (United States)

    Xu, Joshua; Kelly, Reagan; Zhou, Guangxu; Turner, Steven A; Ding, Don; Harris, Stephen C; Hong, Huixiao; Fang, Hong; Tong, Weida

    2012-01-01

    A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm. PMID:23245293

  16. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  17. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    Directory of Open Access Journals (Sweden)

    Tilton Susan C

    2012-11-01

    Background: MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results: BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 µM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human ortholog. Functional analysis of differentially regulated (p ... Conclusions: BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis generation tool for systems biology. The miRNA workflow in BRM allows for efficient processing of multiple miRNA and mRNA datasets in a single

  18. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    OpenAIRE

    Rigbolt, K. T. G.; Vanselow, J. T.; Blagoev, B.

    2011-01-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinf...

  19. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  20. [Bioinformatics-based Design of Peptide Vaccine Candidates Targeting Spike Protein of MERS-CoV and Immunity analysis in Mice].

    Science.gov (United States)

    Lan, Jiaming; Lu, Shuai; Deng, Yao; Wen, Bo; Chen, Hong; Wang, Wen; Tan, Wenjie

    2016-01-01

    Middle East respiratory syndrome coronavirus (MERS-CoV) was identified as a novel human coronavirus and poses a great threat to public health worldwide, which calls urgently for the development of an effective and safe vaccine. In this study, peptide epitopes targeting the spike antigen were predicted based on bioinformatics methods. Nine polypeptides with high scores were synthesized and linked to keyhole limpet hemocyanin (KLH). Female BALB/c mice were immunized with individual polypeptide-KLH conjugates, total IgG was detected by ELISA, and cell-mediated immunity (CMI) was analyzed using an ELISpot assay. The results showed that an individual peptide of YVDVGPDSVKSACIEVDIQQTFFDKTWPRPIDVSKADGI could induce the highest level of total IgG as well as CMI (high frequency of IFN-γ secretion) against MERS-CoV antigen in mice. Our study identified a promising peptide vaccine candidate against MERS-CoV and provided experimental support for bioinformatics-based design of peptide vaccines.

  1. Identifiable Data Files - Medicare Provider Analysis and ...

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Medicare Provider Analysis and Review (MEDPAR) File contains data from claims for services provided to beneficiaries admitted to Medicare certified inpatient...

  2. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins

    Directory of Open Access Journals (Sweden)

    Teresa Milano

    2016-01-01

    The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5′-phosphate (PLP) dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups.

  3. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins

    Science.gov (United States)

    Milano, Teresa

    2016-01-01

    The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5′-phosphate (PLP) dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups. PMID:27446613
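
    The idea of a specificity-determining position can be illustrated with a minimal sketch. It assumes a simple working definition that the abstract does not spell out: an alignment column is flagged when it is well conserved inside each subgroup but the dominant residue differs between subgroups. The function name, the 0.8 conservation cutoff and the toy sequences are all hypothetical, and the published analysis may have used a different scoring method.

```python
from collections import Counter


def specificity_determining_positions(groups, min_conservation=0.8):
    """Return alignment columns conserved within each group but differing between groups.

    groups maps a subgroup name to a list of pre-aligned, equal-length sequences.
    The conservation cutoff is an illustrative assumption, not a published value.
    """
    length = len(next(iter(groups.values()))[0])
    positions = []
    for col in range(length):
        dominant = {}
        for name, seqs in groups.items():
            residue, count = Counter(s[col] for s in seqs).most_common(1)[0]
            if residue == "-" or count / len(seqs) < min_conservation:
                break  # gapped or poorly conserved inside this subgroup
            dominant[name] = residue
        else:
            # Every subgroup is conserved here; keep the column if they disagree.
            if len(set(dominant.values())) > 1:
                positions.append(col)
    return positions


# Toy alignment (hypothetical sequences, not taken from the study).
toy_groups = {
    "subgroup_A": ["MKRLT", "MKRLS", "MKRLT"],
    "subgroup_B": ["MKDLT", "MKDLA", "MKDLT"],
}
sdp_columns = specificity_determining_positions(toy_groups)
```

    In the toy alignment only the third column (index 2) qualifies: it is R in every subgroup_A sequence and D in every subgroup_B sequence, while the last column is too variable within each subgroup to count.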

  4. Identifying MMORPG Bots: A Traffic Analysis Approach

    Directory of Open Access Journals (Sweden)

    Wen-Chin Chen

    2008-11-01

    Massively multiplayer online role playing games (MMORPGs) have become extremely popular among network gamers. Despite their success, one of MMORPG's greatest challenges is the increasing use of game bots, that is, autoplaying game clients. The use of game bots is considered unsportsmanlike and is therefore forbidden. To keep games in order, game police, played by actual human players, often patrol game zones and question suspicious players. This practice, however, is labor-intensive and ineffective. To address this problem, we analyze the traffic generated by human players versus game bots and propose general solutions to identify game bots. Taking Ragnarok Online as our subject, we study the traffic generated by human players and game bots. We find that their traffic is distinguishable by 1) the regularity in the release time of client commands, 2) the trend and magnitude of traffic burstiness in multiple time scales, and 3) the sensitivity to different network conditions. Based on these findings, we propose four strategies and two ensemble schemes to identify bots. Finally, we discuss the robustness of the proposed methods against countermeasures of bot developers, and consider a number of possible ways to manage the increasingly serious bot problem.

  5. Identifying MMORPG Bots: A Traffic Analysis Approach

    Science.gov (United States)

    Chen, Kuan-Ta; Jiang, Jhih-Wei; Huang, Polly; Chu, Hao-Hua; Lei, Chin-Laung; Chen, Wen-Chin

    2008-12-01

    Massively multiplayer online role playing games (MMORPGs) have become extremely popular among network gamers. Despite their success, one of MMORPG's greatest challenges is the increasing use of game bots, that is, autoplaying game clients. The use of game bots is considered unsportsmanlike and is therefore forbidden. To keep games in order, game police, played by actual human players, often patrol game zones and question suspicious players. This practice, however, is labor-intensive and ineffective. To address this problem, we analyze the traffic generated by human players versus game bots and propose general solutions to identify game bots. Taking Ragnarok Online as our subject, we study the traffic generated by human players and game bots. We find that their traffic is distinguishable by 1) the regularity in the release time of client commands, 2) the trend and magnitude of traffic burstiness in multiple time scales, and 3) the sensitivity to different network conditions. Based on these findings, we propose four strategies and two ensemble schemes to identify bots. Finally, we discuss the robustness of the proposed methods against countermeasures of bot developers, and consider a number of possible ways to manage the increasingly serious bot problem.
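
    The first distinguishing feature, regularity in the release time of client commands, can be sketched as follows. This is an illustration only: it assumes that regularity is measured by the coefficient of variation of the gaps between consecutive command timestamps, and the 0.1 threshold, function names and timestamps are hypothetical. The abstract does not describe the four strategies in enough detail to reproduce them.

```python
from statistics import mean, pstdev


def interarrival_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive command times."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        raise ValueError("need at least three timestamps with a nonzero mean gap")
    return pstdev(gaps) / mean(gaps)


def looks_like_bot(timestamps, cv_threshold=0.1):
    """Flag a client whose command timing is suspiciously regular.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    return interarrival_cv(timestamps) < cv_threshold


# Hypothetical command release times in seconds.
bot_like = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
human_like = [0.0, 0.4, 2.1, 2.3, 5.0, 5.2]
```

    A perfectly periodic client has a coefficient of variation of zero, whereas irregular human input gives a much larger value. A real detector would need many more samples and would have to account for network jitter, which is why the abstract also lists burstiness and sensitivity to network conditions.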

  6. Combination of meta-analysis and graph clustering to identify prognostic markers of ESCC

    Directory of Open Access Journals (Sweden)

    Hongyun Gao

    2012-01-01

    Esophageal squamous cell carcinoma (ESCC) is one of the most malignant gastrointestinal cancers and occurs at a high frequency in China and other Asian countries. Recently, several molecular markers were identified for predicting ESCC. Notwithstanding, additional prognostic markers, with a clear understanding of their underlying roles, are still required. Through bioinformatics, a graph-clustering method by DPClus was used to detect co-expressed modules. The aim was to identify a set of discriminating genes that could be used for predicting ESCC through graph-clustering and GO-term analysis. The results showed that CXCL12, CYP2C9, TGM3, MAL, S100A9, EMP-1 and SPRR3 were highly associated with ESCC development. In our study, all their predicted roles were in line with previous reports, supporting the assumption that a combination of meta-analysis, graph-clustering and GO-term analysis is effective both for identifying differentially expressed genes and for reflecting on their functions in ESCC.

  7. Combination of meta-analysis and graph clustering to identify prognostic markers of ESCC.

    Science.gov (United States)

    Gao, Hongyun; Wang, Lishan; Cui, Shitao; Wang, Mingsong

    2012-04-01

    Esophageal squamous cell carcinoma (ESCC) is one of the most malignant gastrointestinal cancers and occurs at a high frequency in China and other Asian countries. Recently, several molecular markers were identified for predicting ESCC. Notwithstanding, additional prognostic markers, with a clear understanding of their underlying roles, are still required. Through bioinformatics, a graph-clustering method by DPClus was used to detect co-expressed modules. The aim was to identify a set of discriminating genes that could be used for predicting ESCC through graph-clustering and GO-term analysis. The results showed that CXCL12, CYP2C9, TGM3, MAL, S100A9, EMP-1 and SPRR3 were highly associated with ESCC development. In our study, all their predicted roles were in line with previous reports, supporting the assumption that a combination of meta-analysis, graph-clustering and GO-term analysis is effective both for identifying differentially expressed genes and for reflecting on their functions in ESCC.
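
    Detecting co-expressed modules by graph clustering can be illustrated with a deliberately simplified sketch. It does not reimplement DPClus: it swaps in a much cruder technique, linking genes whose expression profiles have a Pearson correlation above a cutoff and reporting the connected components of that graph as modules. The 0.9 cutoff, the function names and the toy profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(expr, threshold=0.9):
    """Group genes into connected components of a correlation graph.

    This stands in for a real graph-clustering algorithm such as DPClus.
    """
    neighbours = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)
    modules, seen = [], set()
    for gene in expr:
        if gene in seen:
            continue
        stack, module = [gene], set()
        while stack:  # depth-first walk over the correlation graph
            current = stack.pop()
            if current in module:
                continue
            module.add(current)
            stack.extend(neighbours[current] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical expression profiles across four samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
modules = coexpression_modules(toy_expr)
```

    Here geneA and geneB rise together and end up in one module, while geneC stays on its own. Density-based methods like DPClus can additionally split loosely connected components and allow overlapping clusters, which this sketch does not attempt.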

  8. Comparative transcriptional pathway bioinformatic analysis of dietary restriction, Sir2, p53 and resveratrol life span extension in Drosophila

    OpenAIRE

    Antosh, Michael; Whitaker, Rachel; Kroll, Adam; Hosier, Suzanne; Chang, Chengyi; Bauer, Johannes; Cooper, Leon; Neretti, Nicola; HELFAND, STEPHEN L.

    2011-01-01

    A multiple comparison approach using whole genome transcriptional arrays was used to identify genes and pathways involved in calorie restriction/dietary restriction (DR) life span extension in Drosophila. A gene-centric analysis comparing the changes in common between DR and two DR-related molecular genetic life span extending manipulations, Sir2 and p53, led to a molecular confirmation of Sir2 and p53's similarity with DR and the identification of a small set of commonly regul...

  9. Bioinformatics methods for identifying candidate disease genes

    NARCIS (Netherlands)

    Driel, M.A. van; Brunner, H.G.

    2006-01-01

    With the explosion in genomic and functional genomics information, methods for disease gene identification are rapidly evolving. Databases are now essential to the process of selecting candidate disease genes. Combining positional information with disease characteristics and functional information i

  10. Bioinformatics and School Biology

    Science.gov (United States)

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  11. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs.

  12. Clinical significance of overexpression of metastasis-associated gene MTA1 in cervical cancer and bioinformatic analysis of genes coordinately expressed with MTA1

    Directory of Open Access Journals (Sweden)

    Shu-ying FAN

    2016-06-01

    Objective  To analyze the clinical significance of MTA1 overexpression in cervical cancer and to screen bioinformatically for potential treatment targets in the gene network correlated with MTA1 overexpression. Methods  The SPSS software package was used to analyze the correlation of MTA1 with clinical metastasis and pathological grade of cervical cancer based on the TCGA-CESC data set. The edgeR software was used to screen the gene set whose expression was correlated with MTA1 in cervical cancer at a global transcriptional level. The DAVID platform was used to identify the enriched biological functions of the gene set significantly correlated with MTA1 expression. The transcriptional regulation network of the gene set was constructed with the STRING online platform and Cytoscape software to identify the key regulators. Results  Analysis of the TCGA-CESC database showed a significant positive correlation of MTA1 expression with clinical metastasis of cervical cancer (P<0.01). There was a gene set whose expression was closely correlated with MTA1 level. Functional enrichment of the gene set indicated that cancer pathways, stem cell pathways, cell migration, cell differentiation, etc. were closely linked to MTA1-correlated malignant behaviors of cancers. Bioinformatic screening showed that Agt, Acta1, Fpr2, Pmch and RGS18, which are correlated with MTA1 expression in cervical cancer, were the key regulators in the differentially expressed gene sets, and these genes mapped to the GPCR pathway. Conclusions  MTA1 overexpression is significantly correlated with clinical metastasis of cervical cancer and is paralleled by the activation of gene regulation involved in stem cell, cytokine receptor signaling, cell migration and differentiation pathways. The genes correlated with MTA1 expression are potential treatment targets in cervical cancer and should be evaluated experimentally in the future. DOI: 10.11855/j.issn.0577-7402.2016.05.03

  13. Bioinformatic analysis of the non-structural protein 1 of type 2 dengue virus

    Institute of Scientific and Technical Information of China (English)

    齐一鸣; 黄俊琪

    2011-01-01

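    Objective: To analyze the structural and functional characteristics of the non-structural protein NS1 of dengue virus type 2 and to predict its dominant antigenic epitopes. Methods: Bioinformatics websites such as NCBI and CBS and software packages such as DNAStar and Vector NTI were used to analyze the physicochemical properties, structural and functional characteristics, possible spatial structure and antigenic epitopes of dengue virus type 2 NS1. Results: The NS1 gene encodes 352 amino acids, including 12 conserved cysteines. Its lipid content is relatively high and its physicochemical properties are unstable. It has no secretory signal peptide or transmembrane structure, but it has multiple glycosylation, phosphorylation and amidation sites. The spatial structure is a compact globule with the N- and C-termini exposed on the surface, where linear B-cell epitope regions are relatively dense. The middle segment is buried inside the molecule but contains some B-cell epitope sequences that are highly homologous to platelets, vascular endothelium or fibrinogen and may play an important role in the pathology of dengue hemorrhagic fever. Conclusion: NS1 is a highly promising diagnostic antigen, and the prediction of its epitopes will provide a basis for developing epitope-based peptide vaccines against dengue virus.%Objective To predict the structural and functional characteristics of the non-structural protein 1 (NS1) of dengue virus 2, as well as its predominant antigenic epitopes, by bioinformatics analysis in order to guide experimental research on its biological function and application. Methods The analysis tools provided by the NCBI and CBS bioinformatics web sites were combined with bioinformatics software packages, such as DNAStar and Vector NTI, to identify the characteristics of NS1. Results The NS1 gene encodes 352 amino acids, which include 12 conserved cysteines. It carries no signal peptide at the N terminus and no transmembrane regions, but it has unstable physicochemical characteristics. The protein comprises a single compact globular domain with both the N-terminal and C-terminal fragments exposed on the surface, where linear B cell epitopes are possibly concentrated. Although the middle segment is sterically buried inside the molecule, BLAST analysis found that some of its epitopes are highly homologous to platelets and fibrinogen. Deduced conformational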

  14. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  15. Bioinformatics analysis of potential essential genes that response to the high intraocular pressure on astrocyte due to glaucoma

    Institute of Scientific and Technical Information of China (English)

    Yang Yang; Jing-Zhu Duan; Yu Di; Dong-Mei Gui; Dian-Wen Gao

    2015-01-01

    AIM: To study the gene expression response and predict the cellular network affected by pressure in glaucomatous optic nerve injury. METHODS: We used glaucoma-related microarray data from a public database [Gene Expression Omnibus (GEO)] to explore the potential gene expression changes, as well as the corresponding alterations in biological processes, caused by increased pressure on astrocytes during glaucoma development. RESULTS: A total of six genes were identified as related to increased pressure. Through annotation and network analysis, we found that these genes might be involved in cell morphological remodeling, angiogenesis, and mismatch repair. CONCLUSION: Increased pressure on astrocytes in glaucoma might cause gene expression alterations, which might in turn change some cellular responses.

  16. Identification of microRNAs in the Toxigenic Dinoflagellate Alexandrium catenella by High-Throughput Illumina Sequencing and Bioinformatic Analysis.

    Directory of Open Access Journals (Sweden)

    Huili Geng

    Micro-ribonucleic acids (miRNAs) are a large group of endogenous, tiny, non-coding RNAs consisting of 19-25 nucleotides that regulate gene expression at either the transcriptional or post-transcriptional level by mediating gene silencing in eukaryotes. They are considered to be important regulators that affect growth, development, and response to various stresses in plants. Alexandrium catenella is an important marine toxic phytoplankton species that can cause harmful algal blooms (HABs). To date, identification and function analysis of miRNAs in A. catenella remain largely unexamined. In this study, high-throughput sequencing was performed on A. catenella to identify and quantitatively profile the repertoire of small RNAs from two different growth phases. A total of 38,092,056 and 32,969,156 raw reads were obtained from the two small RNA libraries, respectively. In total, 88 mature miRNAs belonging to 32 miRNA families were identified. Significant differences were found in the member number, expression level of various families, and expression abundance of each member within a family. A total of 15 potentially novel miRNAs were identified. Comparative profiling showed that 12 known miRNAs exhibited differential expression between the lag phase and the logarithmic phase. Real-time quantitative RT-PCR (qPCR) was performed to confirm the expression of two differentially expressed miRNAs: one up-regulated novel miRNA (aca-miR-3p-456915) and one down-regulated conserved miRNA (tae-miR159a). The expression trend of the qPCR assay was generally consistent with the deep sequencing result. Target predictions of the 12 differentially expressed miRNAs resulted in 1813 target genes. Gene ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG) annotations revealed that some miRNAs were associated with growth and developmental processes of the alga. These results provide insights into the roles that miRNAs play in
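
    Comparing miRNA abundance between two libraries of different depth requires normalization first. The sketch below shows one common convention, reads per million (RPM) followed by a log2 fold change with a pseudocount. Only the two library sizes come from the abstract; normalizing by raw rather than clean or mapped reads, the pseudocount of 1, the library labels and any per-miRNA counts are assumptions made for illustration, and the study's actual pipeline is not described here.

```python
from math import log2

# Raw read totals reported in the abstract; the labels are placeholders because
# the abstract does not state which library corresponds to which growth phase.
LIBRARY_SIZES = {"library_1": 38_092_056, "library_2": 32_969_156}


def reads_per_million(count, library_size):
    """Scale a raw read count to reads per million sequenced reads."""
    return count * 1_000_000 / library_size


def log2_fold_change(count_1, count_2, pseudocount=1.0):
    """log2 ratio of library_2 to library_1 abundance after RPM normalization.

    The pseudocount avoids division by zero for miRNAs absent from one library.
    """
    rpm_1 = reads_per_million(count_1, LIBRARY_SIZES["library_1"])
    rpm_2 = reads_per_million(count_2, LIBRARY_SIZES["library_2"])
    return log2((rpm_2 + pseudocount) / (rpm_1 + pseudocount))
```

    For example, `log2_fold_change(100, 400)` is positive, meaning the hypothetical miRNA is more abundant in the second library even after accounting for that library's smaller size.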

  17. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  18. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  19. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  20. Protein functional links in Trypanosoma brucei, identified by gene fusion analysis

    Directory of Open Access Journals (Sweden)

    Trimpalis Philip

    2011-07-01

    Abstract Background Domain or gene fusion analysis is a bioinformatics method for detecting gene fusions in one organism by comparing its genome to that of other organisms. The occurrence of gene fusions suggests that the two original genes that participated in the fusion are functionally linked, i.e. their gene products interact either as part of a multi-subunit protein complex, or in a metabolic pathway. Gene fusion analysis has been used to identify protein functional links in prokaryotes as well as in eukaryotic model organisms, such as yeast and Drosophila. Results In this study we have extended this approach to include a number of recently sequenced protists, four of which are pathogenic, to identify fusion linked proteins in Trypanosoma brucei, the causative agent of African sleeping sickness. We have also examined the evolution of the gene fusion events identified, to determine whether they can be attributed to fusion or fission, by looking at the conservation of the fused genes and of the individual component genes across the major eukaryotic and prokaryotic lineages. We find relatively limited occurrence of gene fusions/fissions within the protist lineages examined. Our results point to two trypanosome-specific gene fissions, which have recently been experimentally confirmed, one fusion involving proteins involved in the same metabolic pathway, as well as two novel putative functional links between fusion-linked protein pairs. Conclusions This is the first study of protein functional links in T. brucei identified by gene fusion analysis. We have used strict thresholds and only discuss results which are highly likely to be genuine and which either have already been or can be experimentally verified. We discuss the possible impact of the identification of these novel putative protein-protein interactions on the development of new trypanosome therapeutic drugs.
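
    The core test behind gene fusion analysis can be sketched in a few lines. The sketch assumes that similarity hits have already been computed (for example by a sequence search) and are supplied as tuples giving the region of a protein from another genome that each query protein matches. Two different query proteins that match largely non-overlapping regions of the same protein are reported as a candidate fusion link. The function name, the 10-residue overlap tolerance and all identifiers are hypothetical, and the strict thresholds used in the study are not reproduced.

```python
from collections import defaultdict
from itertools import combinations


def find_fusion_links(hits, max_overlap=10):
    """Find query protein pairs that match separate regions of one composite protein.

    hits: iterable of (query_protein, composite_protein, start, end), with
    1-based inclusive coordinates on the composite protein.
    """
    by_composite = defaultdict(list)
    for query, composite, start, end in hits:
        by_composite[composite].append((query, start, end))

    links = set()
    for composite, regions in by_composite.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 == q2:
                continue  # two hits from the same protein are not a fusion
            overlap = min(e1, e2) - max(s1, s2) + 1  # negative when disjoint
            if overlap <= max_overlap:
                links.add((tuple(sorted((q1, q2))), composite))
    return sorted(links)


# Hypothetical hits; the identifiers are placeholders, not real accessions.
toy_hits = [
    ("Tb_protein_1", "other_species_X", 1, 200),
    ("Tb_protein_2", "other_species_X", 210, 450),
    ("Tb_protein_3", "other_species_Y", 1, 300),
    ("Tb_protein_4", "other_species_Y", 50, 280),
]
fusion_links = find_fusion_links(toy_hits)
```

    Only the first pair is reported, because Tb_protein_3 and Tb_protein_4 match the same region of their target and are more likely paralogs than fusion components. A fuller analysis would also confirm that the two component proteins are not homologous to each other and, as the abstract describes, examine conservation across lineages to separate fusion from fission.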

  1. Bioinformatics analysis and prediction for structure and function of nitric oxide synthase and similar proteins from Plasmodium berghei

    Institute of Scientific and Technical Information of China (English)

    Zhigang Fan; Gang Lv; Lingmin Zhang; Xiufeng Gan; Qiang Wu; Saifeng Zhong; Guogang Yan; Guifen Lin

    2011-01-01

    Objective: To search for and analyze nitric oxide synthase (NOS) and similar proteins from Plasmodium berghei (Pb). Methods: The structure and function of nitric oxide synthase and similar proteins from Plasmodium berghei were analyzed and predicted by bioinformatics. Results: PbNOS was not found, but nicotinamide adenine dinucleotide 2’-phosphate reduced tetrasodium (NADPH)-cytochrome p450 reductase (CPR) was identified. PbCPR was located in the nucleus of Plasmodium berghei, while the 134aa-229aa domain was localized to the nucleolar organizer. The amino acid sequence of PbCPR had the closest genetic relationship with that of Plasmodium vivax, showing 73% homology. The tertiary structure of PbCPR displayed a forceps shape with wings, but no wings existed in the tertiary structure of the CPR of its host, Mus musculus (Mm). The 137aa-200aa, 201aa-218aa, 220aa-230aa, 232aa-248, 269aa-323aa, 478aa-501aa and 592aa-606aa domains of PbCPR showed no homology with those of MmCPR, and all of these domains were exposed on the surface of the protein. Conclusions: NOS cannot be found in Plasmodium berghei or other Plasmodium species. PbCPR may be a possible site of antimalarial drug resistance and a target for antimalarial drugs and vaccines. It may also be involved in one of the mechanisms of immune evasion. The findings of this study on Plasmodium berghei may apply more readily to Plasmodium vivax. The 137aa-200aa, 201aa-218aa, 220aa-230aa, 232aa-248, 269aa-323aa, 478aa-501aa and 592aa-606aa domains of PbCPR are the more promising targets for antimalarial drugs and vaccines.

  2. Genomics Politics through Space and Time: The Case of Bioinformatics in Brazil.

    Science.gov (United States)

    Bicudo, Edison

    2016-01-01

    The emergence of scientific disciplines, as well as the policies aimed at steering them, has geographical implications. This becomes visible in areas such as genomics and related fields. In this paper, the relation between scientific evolution, political decisions and geographical configuration is studied, with a focus on the recent formation of bioinformatics in Brazil. The study involves an analysis of data collected on the website of CNPq, a funding agency attached to the Ministry of Science and Technology. Furthermore, I conducted fieldwork in four cities, interviewing 15 bioinformaticians. In the history of Brazilian bioinformatics, three periods can be identified. In the first period (1900-1996), bioinformatics was actually absent, but biology research groups were formed which would subsequently explore bioinformatics. The second period (1997-2006) was marked by the emergence of the discipline and geographical concentration of major research groups in the southern part of Brazil. A third period can be pointed to (2007-2014), in which political choices have turned geographical diffusion and institutional equality into a national target. As a consequence of the recent shifts, genomics and bioinformatics researchers have been involved in a debate, some defending the existence of a few specialized research and sequencing platforms, whereas others welcome the constitution of a scientific scenario based on decentralized platforms. I defend an intermediate solution, whereby some places would be selected to be genomics hubs. This would fit the regional diversity of this vast country, in addition to tackling the scientific weaknesses of the northern area.

  3. Bioinformatic analysis of the distribution of inorganic carbon transporters and prospective targets for bioengineering to increase Ci uptake by cyanobacteria.

    Science.gov (United States)

    Gaudana, Sandeep B; Zarzycki, Jan; Moparthi, Vamsi K; Kerfeld, Cheryl A

    2015-10-01

    Cyanobacteria have evolved a carbon-concentrating mechanism (CCM) which has enabled them to inhabit diverse environments encompassing a range of inorganic carbon (Ci: HCO3− and CO2) concentrations. Several uptake systems facilitate inorganic carbon accumulation in the cell, which can in turn be fixed by ribulose 1,5-bisphosphate carboxylase/oxygenase. Here we survey the distribution of genes encoding known Ci uptake systems in cyanobacterial genomes and, using a pfam- and gene context-based approach, identify in the marine (alpha) cyanobacteria a heretofore unrecognized number of putative counterparts to the well-known Ci transporters of beta cyanobacteria. In addition, our analysis shows that there is a huge repertoire of transport systems in cyanobacteria of unknown function, many with homology to characterized Ci transporters. These can be viewed as prospective targets for conversion into ancillary Ci transporters through bioengineering. Increasing intracellular Ci concentration coupled with efforts to increase carbon fixation will be beneficial for the downstream conversion of fixed carbon into value-added products including biofuels. In addition to CCM transporter homologs, we also survey the occurrence of rhodopsin homologs in cyanobacteria, including bacteriorhodopsin, a class of retinal-binding, light-activated proton pumps. Because they are light driven and because of the apparent ease of altering their ion selectivity, we use this as an example of re-purposing an endogenous transporter for the augmentation of Ci uptake by cyanobacteria and potentially chloroplasts.

  4. Bioinformatics and functional analysis of an Entamoeba histolytica mannosyltransferase necessary for parasite complement resistance and hepatical infection.

    Directory of Open Access Journals (Sweden)

    Christian Weber

    The glycosylphosphatidylinositol (GPI) moiety is one of the ways by which many cell surface proteins, such as Gal/GalNAc lectin and proteophosphoglycans (PPGs), attach to the surface of Entamoeba histolytica, the agent of human amoebiasis. It is believed that these GPI-anchored molecules are involved in parasite adhesion to cells, mucus and the extracellular matrix. We identified an E. histolytica homolog of PIG-M, which is a mannosyltransferase required for synthesis of GPI. The sequence and structural analysis led to the conclusion that EhPIG-M1 is composed of one signal peptide and 11 transmembrane domains with two large intraluminal loops, one of which contains the DXD motif, involved in the enzymatic catalysis and conserved in most glycosyltransferases. Expressing a fragment of the EhPIG-M1 encoding gene in antisense orientation generated parasite lines diminished in EhPIG-M1 levels; these lines displayed reduced GPI production, were highly sensitive to complement and were dramatically inhibited for amoebic abscess formation. The data suggest a role for GPI surface anchored molecules in the survival of E. histolytica during pathogenesis.

  5. Integrative bioinformatics analysis of genomic and proteomic approaches to understand the transcriptional regulatory program in coronary artery disease pathways.

    Directory of Open Access Journals (Sweden)

    Rajani Kanth Vangala

    Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. Few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways is not yet addressed in entirety. Our present work focuses on understanding the regulatory mechanisms at transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease and assayed 24 biomarkers along with gene expression studies and built network modules which are highly regulated by 5 core regulators PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise biomarkers from different pathways, showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies of developing multimarker module based risk predictions.

  6. Integrative bioinformatics analysis of genomic and proteomic approaches to understand the transcriptional regulatory program in coronary artery disease pathways.

    Science.gov (United States)

    Vangala, Rajani Kanth; Ravindran, Vandana; Ghatge, Madan; Shanker, Jayashree; Arvind, Prathima; Bindu, Hima; Shekar, Meghala; Rao, Veena S

    2013-01-01

    Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. Few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways is not yet addressed in entirety. Our present work focuses on understanding the regulatory mechanisms at transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease and assayed 24 biomarkers along with gene expression studies and built network modules which are highly regulated by 5 core regulators PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise biomarkers from different pathways, showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies of developing multimarker module based risk predictions.

  7. Bioinformatic Analysis on Jellyfish Hematoxin%水母溶血毒素的生物信息学分析

    Institute of Scientific and Technical Information of China (English)

    欧阳春磊; 高佳栋; 任玉坤; 肖良; 王倩倩; 郭玉峰; 蔡滨欣; 张黎明

    2009-01-01

    AIM: To illustrate the structure and function of five hematoxin sequences reported in four jellyfish and explore the possible mechanism of pathogenesis. METHODS: Bioinformatic methods were used to analyze the composition and sequence of amino acid residues, physicochemical properties, signal peptide, membrane-spanning domain, hydrophobicity or hydrophilicity, secondary structure, conserved regions and molecular phylogenetic evolution. RESULTS and CONCLUSIONS: The five jellyfish hematoxins were similar in composition and sequence of amino acid residues and in physicochemical properties. The amino acid sequences of the jellyfish hematoxins contained membrane-spanning domains and hydrophobic regions, with a possible cleavage site in the signal peptide between amino acid residues 20 and 21. α-helix and random coil were the major motifs of the predicted secondary structure, while β-turn and extended strand were spread throughout the whole protein. There were four conserved amino acid sequences in three of the five hematoxins, and similar phylogenetic trees were constructed by both the NJ and MP methods.

  8. 焦虑症外周血microRNA的生物信息学分析%Bioinformatics analysis of differently expressed microRNAs in anxiety disorder

    Institute of Scientific and Technical Information of China (English)

    范惠民; 牛威; 何明骏; 孔令明; 仲爱芳; 张巧丽; 闫妍; 张理义

    2015-01-01

    Objective: To identify differentially expressed microRNAs (miRNAs) in peripheral blood mononuclear cells (PBMCs) of anxiety patients and predict their target genes and function by bioinformatics analysis. Methods: The miRNA expression profiles were determined using an Affymetrix array. To validate the results, real-time quantitative polymerase chain reaction (qRT-PCR) analysis in a larger cohort was employed. The targets of the differentially expressed miRNAs were predicted by TargetScan, miRDB, and DIANA-microT-CDS, and the results were analyzed by gene ontology (GO) and KEGG pathway analysis using FunNet. Results: Microarray analysis identified 7 miRNAs with significant changes in expression in PBMCs of anxiety patients. qRT-PCR analysis confirmed that the expression levels of 5 miRNAs (hsa-miR-4484, hsa-miR-4505, hsa-miR-4674, hsa-miR-501-3p and hsa-miR-663) were up-regulated. Intersecting the genes predicted by TargetScan, miRDB, and DIANA-microT-CDS yielded 195 targets. GO analysis showed that the biological processes regulated by the predicted target genes included diverse terms. Some terms, e.g., nervous system development, nerve growth factor receptor signaling pathway, neuron migration, dendrite development, regulation of neuron projection development, midbrain development, regulation of excitatory postsynaptic membrane potential, gliogenesis and dendrite morphogenesis, are directly related to the central nervous system and brain functions. Pathway analysis showed significant enrichment in several pathways related to neuronal brain functions, such as glutamatergic synapse, axon guidance, calcium signaling pathway, MAPK signaling pathway, GnRH signaling pathway, Wnt signaling pathway, gap junction, long-term potentiation and VEGF signaling pathway. Among the five microRNAs, hsa-miR-4484, hsa-miR-4505, hsa-miR-4674 and hsa-miR-501-3p may have more important regulatory functions. Conclusion: Five miRNAs (hsa-miR-4484, hsa
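
The target-prediction step in the abstract above keeps only genes predicted by all three tools. A minimal sketch of that intersection, assuming each tool's output has already been reduced to a list of gene symbols; the tool keys and gene names below are hypothetical placeholders, not data from the study:

```python
def consensus_targets(predictions):
    """Return the genes predicted by every tool.

    `predictions` maps a tool name to an iterable of gene symbols.
    """
    gene_sets = [set(genes) for genes in predictions.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical toy predictions; real tool output would be parsed from files.
toy_predictions = {
    "tool_a": ["GENE_A", "GENE_B", "GENE_C"],
    "tool_b": ["GENE_B", "GENE_C", "GENE_D"],
    "tool_c": ["GENE_C", "GENE_B", "GENE_E"],
}

shared = consensus_targets(toy_predictions)
print(sorted(shared))  # ['GENE_B', 'GENE_C']
```

Requiring agreement between all tools trades sensitivity for specificity; a looser rule (for example, two of three tools) would need a counting step instead of a strict intersection.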

  9. Identification and comparative analysis of H2O2-scavenging enzymes (ascorbate peroxidase and glutathione peroxidase) in selected plants employing bioinformatics approaches

    Directory of Open Access Journals (Sweden)

    Ibrahim Ilker Ozyigit

    2016-03-01

    Among the major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can damage cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidase (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels, which could be used in future studies in this direction.

  10. Systematic enrichment analysis of gene expression profiling studies identifies consensus pathways implicated in colorectal cancer development

    Directory of Open Access Journals (Sweden)

    Jesús Lascorz

    2011-01-01

    Background: A large number of gene expression profiling (GEP) studies on colorectal carcinogenesis have been performed but no reliable gene signature has been identified so far due to the lack of reproducibility in the reported genes. There is growing evidence that functionally related genes, rather than individual genes, contribute to the etiology of complex traits. We used, as a novel approach, pathway enrichment tools to define functionally related genes that are consistently up- or down-regulated in colorectal carcinogenesis. Materials and Methods: We started the analysis with 242 unique annotated genes that had been reported by any of three recent meta-analyses covering GEP studies on genes differentially expressed in carcinoma vs normal mucosa. Most of these genes (218, 91.9%) had been reported in at least three GEP studies. These 242 genes were submitted to bioinformatic analysis using a total of nine tools to detect enrichment of Gene Ontology (GO) categories or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. As a final consistency criterion the pathway categories had to be enriched by several tools to be taken into consideration. Results: Our pathway-based enrichment analysis identified the categories of ribosomal protein constituents, extracellular matrix receptor interaction, carbonic anhydrase isozymes, and a general category related to inflammation and cellular response as significantly and consistently overrepresented entities. Conclusions: We triaged the genes covered by the published GEP literature on colorectal carcinogenesis and subjected them to multiple enrichment tools in order to identify the consistently enriched gene categories. These turned out to have known functional relationships to cancer development and thus deserve further investigation.
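
The abstract above combines two ideas: testing whether a gene list is over-represented in a category, and keeping only categories that several tools agree on. The sketch below illustrates both under stated assumptions. It uses a one-sided hypergeometric test, which is a common choice for over-representation analysis but is not confirmed as the statistic used by any of the nine tools; the threshold `alpha`, the agreement count `min_tools`, and all category names are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(population, category_size, sample_size, hits):
    """One-sided over-representation p-value, P(X >= hits).

    X counts category members among `sample_size` genes drawn without
    replacement from `population` genes, `category_size` of which belong
    to the category.
    """
    upper = min(category_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(category_size, k) * comb(population - category_size, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def consistent_categories(results_by_tool, alpha=0.05, min_tools=2):
    """Keep categories called significant (p < alpha) by at least `min_tools` tools."""
    counts = {}
    for pvalues in results_by_tool.values():
        for category, p in pvalues.items():
            if p < alpha:
                counts[category] = counts.get(category, 0) + 1
    return {category for category, n in counts.items() if n >= min_tools}


# Hypothetical p-values per tool, for illustration only.
toy_results = {
    "tool_a": {"category_1": 0.001, "category_2": 0.20},
    "tool_b": {"category_1": 0.010, "category_2": 0.03},
    "tool_c": {"category_1": 0.300, "category_3": 0.01},
}
print(sorted(consistent_categories(toy_results)))  # ['category_1']
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for genome-scale inputs a statistics library would normally be used instead.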

  11. Functional analysis of TPM domain containing Rv2345 of Mycobacterium tuberculosis identifies its phosphatase activity.

    Science.gov (United States)

    Sinha, Avni; Eniyan, Kandasamy; Sinha, Swati; Lynn, Andrew Michael; Bajpai, Urmi

    2015-07-01

    Mycobacterium tuberculosis (Mtb) is the causal agent of tuberculosis, the second largest infectious disease. With the rise of multi-drug resistant strains of M. tuberculosis, serious challenge lies ahead of us in treating the disease. The availability of complete genome sequence of Mtb has improved the scope for identifying new proteins that would not only further our understanding of biology of the organism but could also serve to discover new drug targets. In this study, Rv2345, a hypothetical membrane protein of M. tuberculosis H37Rv, which is reported to be a putative ortholog of ZipA cell division protein has been assigned function through functional annotation using bioinformatics tools followed by experimental validation. Sequence analysis showed Rv2345 to have a TPM domain at its N-terminal region and predicted it to have phosphatase activity. The TPM domain containing region of Rv2345 was cloned and expressed using pET28a vector in Escherichia coli and purified by Nickel affinity chromatography. The purified TPM domain was tested in vitro and our results confirmed it to have phosphatase activity. The enzyme activity was first checked and optimized with pNPP as substrate, followed by using ATP, which was also found to be used as substrate by the purified protein. Hence sequence analysis followed by in vitro studies characterizes TPM domain of Rv2345 to contain phosphatase activity.

  12. Bioinformatic Analysis of GIF Protein Family in Chinese Cabbage%大白菜GIF蛋白家族的生物信息学分析

    Institute of Scientific and Technical Information of China (English)

    王凤德; 李利斌; 李化银; 刘立峰; 高建伟

    2012-01-01

    The GIF (GRF-interacting factor) protein family is a class of transcription coactivators characterized by the SNH and QG domains. Members of this family can form a functional complex with GRF (growth-regulating factor) transcription factors and act synergistically in regulating leaf development by promoting and/or maintaining cell proliferation activity in leaf primordia. In this study, five GIF genes were identified in Chinese cabbage by bioinformatics analysis. The conserved sequences of the encoded proteins were analyzed, and a phylogenetic tree was constructed from the corresponding GIF proteins of Chinese cabbage and Arabidopsis thaliana. Finally, the expression of the BrGIF1 gene was analyzed. The results showed that all GIF proteins of Chinese cabbage and Arabidopsis had highly conserved SNH and QG domains and could be divided into two subfamilies, a division that might have existed before the split of Chinese cabbage and Arabidopsis thaliana. The expression level of BrGIF1 was higher both in inbred lines with larger leafy heads and in tissues with stronger cell proliferation activity. In addition, BrGIF1 expression was induced by NAA and inhibited by ABA. These results suggest that the GIF proteins of Chinese cabbage may have biological functions similar to those of the Arabidopsis GIF proteins and play an important role in regulating plant organ development.

  13. 瘢痕疙瘩相关基因的生物信息学分析%Literature Mining and Bioinformatic Analysis of Dysregulated Genes in Keloid

    Institute of Scientific and Technical Information of China (English)

    边曦; 黄琛; 李博仑; 秦泽莲

    2012-01-01

    Objective: To explore the pathogenesis of keloid at the molecular level by comparing gene expression in keloid and normal skin tissues, so as to seek new therapeutic approaches for keloid. Methods: The genes differentially expressed between keloid and normal skin were obtained by mining PubMed. The dysregulated genes in keloid were analyzed by bioinformatics methods, including protein-protein interaction networks, biological pathways, gene ontology and functional annotation clustering analysis. Results: Eight differential gene expression datasets and 922 articles were obtained. A total of 94 dysregulated genes in keloid were identified (71 up-regulated and 23 down-regulated). Eighty-six genes were found to encode proteins forming an interaction network, with TGFB1, FN1, COL1A1, MMP9, VEGFA, TP53, IL6 and MMP2 as the central nodes. The dysregulated genes in keloid were involved in a variety of biological pathways, including signal transduction and tumor formation. Furthermore, the dysregulated genes played important roles in the biological processes of apoptosis and cell motility. Additionally, some of the dysregulated genes contributed to cellular components such as cell membrane structure, extracellular matrix and collagen. Conclusions: Key genes including TGFB1, FN1, COL1A1, MMP9, VEGFA, TP53, IL6 and MMP2, along with TGF-β signal transduction, cell proliferation and apoptosis, and tumor formation, may play important roles in the development of keloid.
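
The abstract above singles out "central nodes" of a protein-protein interaction network. One simple way to rank nodes is by degree, the number of distinct interaction partners. The sketch below assumes that notion of centrality, which the abstract does not specify, and uses a hypothetical toy edge list with placeholder protein names rather than the study's network:

```python
from collections import Counter


def degree_ranking(edges):
    """Rank nodes of an undirected network by number of distinct partners.

    Duplicate edges (in either orientation) and self-loops are ignored.
    Ties are broken alphabetically by node name.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical interactions; ("P3", "P1") repeats ("P1", "P3") on purpose.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P1")]

for node, n_partners in degree_ranking(toy_edges):
    print(node, n_partners)
```

Other centrality measures (betweenness, closeness) can rank nodes differently, so the choice should follow whatever the network tool in use reports.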

  14. 草菇α-淀粉酶基因的生物信息学分析%Bioinformatic Analysis of α-Amylase Genes in Volvariella volvacea

    Institute of Scientific and Technical Information of China (English)

    杜慕云; 杨仁德; 李剑; 谢宝贵

    2014-01-01

    Five genes (GME 2151, GME 6695, GME 9075, GME 10698 and GME 10705) were identified as encoding α-amylases in Volvariella volvacea, the molecular weights of which varied from 38.8 kD to 64.4 kD. Bioinformatic methods based on genome and transcriptome sequences were used to analyze gene intron:exon distribution patterns and the physicochemical properties of the encoded α-amylases. Signal peptides, subcellular localization patterns and functional sites of the α-amylases were predicted, and a phylogenetic tree was constructed based on α-amylases from different fungi. Serine phosphorylation sites were the primary sites of amylase protein phosphorylation. The amylases contained signal peptides, transmembrane helices and conserved amino acid residues, had similar three-dimensional structures, and were located both intra- and extracellularly. Analysis of the phylogenetic tree revealed that the α-amylases were of two types: GME9075 and GME10698 belonged to α-amylase type I, and GME2151, GME6695 and GME10705 to α-amylase type II. This is consistent with the classification of amylases from other basidiomycetes. Our data provide useful information relating to matrix degradation by the mycelium of V. volvacea and other macro-basidiomycetes.

  16. Microfluidic single-cell transcriptional analysis rationally identifies novel surface marker profiles to enhance cell-based therapies.

    Science.gov (United States)

    Rennert, Robert C; Januszyk, Michael; Sorkin, Michael; Rodrigues, Melanie; Maan, Zeshaan N; Duscher, Dominik; Whittam, Alexander J; Kosaraju, Revanth; Chung, Michael T; Paik, Kevin; Li, Alexander Y; Findlay, Michael; Glotzbach, Jason P; Butte, Atul J; Gurtner, Geoffrey C

    2016-01-01

    Current progenitor cell therapies have only modest efficacy, which has limited their clinical adoption. This may be the result of a cellular heterogeneity that decreases the number of functional progenitors delivered to diseased tissue, and prevents correction of underlying pathologic cell population disruptions. Here, we develop a high-resolution method of identifying phenotypically distinct progenitor cell subpopulations via single-cell transcriptional analysis and advanced bioinformatics. When combined with high-throughput cell surface marker screening, this approach facilitates the rational selection of surface markers for prospective isolation of cell subpopulations with desired transcriptional profiles. We establish the usefulness of this platform in costly and highly morbid diabetic wounds by identifying a subpopulation of progenitor cells that is dysfunctional in the diabetic state, and normalizes diabetic wound healing rates following allogeneic application. We believe this work presents a logical framework for the development of targeted cell therapies that can be customized to any clinical application. PMID:27324848

  17. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education. PMID:21036947

  18. Pattern recognition in bioinformatics.

    Science.gov (United States)

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained.
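
As an illustration of the classification task described above (assigning a discrete label to an instance represented by features), here is a minimal nearest-centroid classifier in plain Python. It is a teaching sketch chosen for brevity, not a method taken from the review; the feature values and the labels "low" and "high" are hypothetical.

```python
def fit_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical two-feature training instances (e.g. two expression measurements).
train_x = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.1], [4.8, 5.3]]
train_y = ["low", "low", "high", "high"]

model = fit_centroids(train_x, train_y)
print(predict(model, [1.1, 1.0]))  # low
print(predict(model, [5.0, 5.0]))  # high
```

A pitfall of the kind the review alludes to: evaluating such a classifier on the same instances used to fit the centroids overestimates its accuracy, so held-out data or cross-validation is needed.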

  19. Initial Bioinformatics Analysis and Verification of a Novel Gene Named Nischarin%新基因Nischarin生物信息学分析及初步验证

    Institute of Scientific and Technical Information of China (English)

    赵太云; 王勃; 苏瑞斌; 吴宁; 李锦

    2012-01-01

    Objective: To analyze the novel gene Nischarin by bioinformatics, explore its new functional features, and verify the predictions experimentally. Methods: An initial bioinformatics analysis was performed on Nischarin to characterize its gene structure, chromosomal localization, the physicochemical properties of the encoded protein, interacting genes, interacting proteins, subcellular localization and protein functional domains. On the basis of the predictions, the DNA-binding motif (leucine zipper pattern) was then preliminarily verified by cell immunofluorescence. Results: The above properties of Nischarin were predicted. The analysis indicated that the gene structure is complex, that the gene has many interacting genes or proteins, and that its predicted subcellular distribution is complex. The encoded proteins have no signal peptide. The presence of a DNA-binding site in Nischarin was verified. Conclusion: Bioinformatics analysis indicates that Nischarin is a complex gene that may be expressed as multiple protein forms, which may differ in subcellular distribution, and that the protein may interact with many other proteins. These gene and protein characteristics may be the molecular basis of the complex pharmacological effects of the imidazoline-1 receptor (I1R).

  20. Bioinformatics in microbial biotechnology – a mini review

    Directory of Open Access Journals (Sweden)

    Bansal Arvind K

    2005-06-01

    The revolutionary growth in computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes, including a cleaner draft of the human genome, have been sequenced, raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy-to-administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics: sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics: identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wet-lab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene

  1. Expression Data Analysis to Identify Biomarkers Associated with Asthma in Children

    OpenAIRE

    Wen Xu

    2014-01-01

    Asthma is characterized by recurrent episodes of wheezing, shortness of breath, chest tightness, and coughing. It is usually caused by a combination of complex and incompletely understood environmental and genetic interactions. We obtained gene expression data with high-throughput screening and identified biomarkers of children's asthma using bioinformatics tools. Next, we explained the pathogenesis of children's asthma from the perspective of gene regulatory networks: DAVID was applied to pe...

  2. 细粒棘球绦虫14-3-3zeta蛋白的生物信息学分析%Application of bioinformatic analysis in 14-3-3zeta protein of Echinococcus granulosus

    Institute of Scientific and Technical Information of China (English)

    符瑞佳; 吕刚; 尹飞飞; 梁培

    2015-01-01

    Objective: To predict and analyze the structure and function of the 14-3-3zeta protein from Echinococcus granulosus using bioinformatics, providing a basis for further experimental studies. Methods: The structure and function of the Eg14-3-3zeta protein were predicted and analyzed using the gene and protein sequence and structure analysis tools provided by the US National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/) and the protein analysis expert system of the Swiss Institute of Bioinformatics (ExPASy, http://expasy.org/), together with other bioinformatics software. Results: The full-length cDNA sequence encoding Eg14-3-3zeta included a complete open reading frame (ORF) of 771 bp encoding a putative protein of 256 amino acids. The predicted molecular weight of Eg14-3-3zeta was 29.4 kDa and its isoelectric point was 5.04. The protein had no signal peptide or transmembrane domain. The secondary structure of Eg14-3-3zeta contained 8 alpha-helices and 12 beta-strands. There were 9 potential antigenic epitopes in the amino acid sequence. Conclusion: The basic characteristics of the E. granulosus 14-3-3zeta protein were preliminarily established, laying a foundation for in-depth study of its biological function.
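
The figures in the record above (a 771 bp ORF encoding 256 amino acids) can be cross-checked with simple codon arithmetic: 771 / 3 = 257 codons, one of which is the stop codon. The sketch below performs that check and adds a rough mass estimate, assuming an average residue mass of about 110 Da, a common rule of thumb that is not taken from the abstract and only approximates sequence-based predictions such as the reported 29.4 kDa.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the abstract


def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def rough_mass_kda(n_residues):
    """Very rough protein mass estimate in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


residues = orf_protein_length(771)
print(residues)  # 256
print(rough_mass_kda(residues))
```

The rough estimate will differ from a sequence-based calculation because real proteins deviate from the average residue composition.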

  3. GALT Protein Database, a Bioinformatics Resource for the Management and Analysis of Structural Features of a Galactosemia-related Protein and Its Mutants

    Institute of Scientific and Technical Information of China (English)

    Antonio d'Acierno; Angelo Facchiano; Anna Marabotti

    2009-01-01

    We describe the GALT-Prot database and its related web-based application, which have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT), involved in the genetic disease galactosemia type Ⅰ. Besides a list of missense mutations at gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure, and each mutant was analyzed with several bioinformatics programs in order to investigate the effect of the mutations. The web-based interface allows querying of the database, and several links are also provided to ensure close integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.

  4. Bioinformatics Analysis of Cdc42 Gene from Candida Glabrata

    Institute of Scientific and Technical Information of China (English)

    赵静; 黄怀球; 袁立燕; 钟毅; 张静; 张晓辉

    2013-01-01

    Objective: To analyze and predict the structure and properties of the cell division cycle 42 (Cdc42) gene from Candida glabrata and its encoded protein using bioinformatics. Methods: The Cdc42 gene of Candida glabrata was analyzed using the bioinformatics tools available on the NCBI, ExPASy and CBS websites together with the Vector NTI Suite 8.0 software package, and the structural features and function of the encoded protein were predicted. Results: The full length of Cdc42 is 576 bp, and its ORF encodes 191 amino acids. Among homologous sequences in GenBank, its amino acid sequence is 99% identical to that of yeast Cdc42, and the phylogenetic relationship between Candida glabrata and other fungi is close. The prediction shows that Cdc42 has a Cdc42 conserved domain. The predicted molecular weight and theoretical pI of Cg.Cdc42 are 21420.83 and 6.31, respectively. The encoded protein is predicted to contain 29.84% α-helix, 28.70% extended strand and 41.88% random coil, and one GTP/ATP binding site. The Cdc42 protein is hydrophobic, with no transmembrane region and no signal peptide. Conclusion: The biochemical and structural characteristics of the Cdc42 gene and its encoded protein were successfully predicted, laying a foundation for its subsequent cloning and expression.

  5. Alteration of microRNA expression in cerebrospinal fluid of unconscious patients after traumatic brain injury and a bioinformatic analysis of related single nucleotide polymorphisms

    Institute of Scientific and Technical Information of China (English)

    Wen-Dong You; Qi-Lin Tang; Lei Wang; Jin Lei; Jun-Feng Feng; Qing Mao; Guo-Yi Gao

    2016-01-01

    Purpose: It is becoming increasingly clear that genetic factors play a role in traumatic brain injury (TBI), whether in modifying clinical outcome after TBI or determining susceptibility to it. MicroRNAs are small RNA molecules involved in various pathophysiological processes by repressing target genes at the posttranscriptional level, and TBI alters microRNA expression levels in the hippocampus and cortex. This study was designed to detect differentially expressed microRNAs in the cerebrospinal fluid (CSF) of TBI patients remaining unconscious two weeks after initial injury and to explore related single nucleotide polymorphisms (SNPs). Methods: We used a microarray platform to detect differential microRNA expression levels in CSF samples from patients with post-traumatic coma compared with samples from controls. A bioinformatic scan was performed covering microRNA gene promoter regions to identify potential functional SNPs. Results: A total of 26 coma patients and 21 controls were included in this study, with similar distributions of age and gender between the two groups. Microarray analysis showed that fourteen microRNAs were differentially expressed, ten at higher and four at lower expression levels in the CSF of traumatic coma patients compared with controls (p < 0.05). One SNP (rs11851174, allele: C/T) was identified in the motif area of the microRNA hsa-miR-431-3P gene promoter region. Conclusion: The altered microRNA expression levels in CSF after brain injury, together with the SNP identified within the microRNA gene promoter area, provide a new perspective on the mechanism of impaired consciousness after TBI. Further studies are needed to explore the association between the specific microRNAs and their related SNPs with post-traumatic unconsciousness.

  6. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  7. Expression Data Analysis to Identify Biomarkers Associated with Asthma in Children

    Directory of Open Access Journals (Sweden)

    Wen Xu

    2014-01-01

    Full Text Available Asthma is characterized by recurrent episodes of wheezing, shortness of breath, chest tightness, and coughing. It is usually caused by a combination of complex and incompletely understood environmental and genetic interactions. We obtained gene expression data with high-throughput screening and identified biomarkers of children's asthma using bioinformatics tools. Next, we explained the pathogenesis of children's asthma from the perspective of gene regulatory networks: DAVID was applied to perform Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis for the top 3000 pairs of relationships in the differential regulatory network. Finally, we found that HAND1, PTK1, NFKB1, ZIC3, STAT6, E2F1, PELP1, USF2, and CBFB may play important roles in children's asthma initiation. Based on the regulatory impact factor (RIF) score, HAND1, PTK7, and ZIC3 were the potential asthma-related factors. Our study provided some foundations of a strategy for biomarker discovery despite a poor understanding of the mechanisms underlying children's asthma.

  8. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  9. Bioinformatics meets parasitology.

    Science.gov (United States)

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  10. Engineering BioInformatics

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    With the completion of human genome sequencing, a new era of bioinformatics starts. On one hand, due to the advance of high-throughput DNA microarray technologies, functional genomics data such as gene expression information have increased exponentially and will continue to do so for the foreseeable future. Conventional means of storing, analysing and comparing related data are already overburdened. Moreover, the rich information in genes, their functions and their associated wide biological implications requires new technologies for analysing data that employ sophisticated statistical and machine learning algorithms, powerful computers and intensive interaction among different data sources such as sequence data, gene expression data, proteomics data and metabolic pathway information to discover complex genomic structures and functional patterns with other biological processes to gain a comprehensive understanding of cell physiology.

  11. Virtual bioinformatics distance learning suite*.

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-05-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material over the Internet. Currently, we provide two fully computer-based courses, "Introduction to Bioinformatics" and "Bioinformatics in Functional Genomics." Here we will discuss the application of distance learning in bioinformatics training and our experiences gained during the 3 years that we have run the courses, with about 400 students from a number of universities. The courses are available at bioinf.uta.fi.

  12. Next Generation Sequencing of Elite Berry Germplasm and Data Analysis Using a Bioinformatics Pipeline for Virus Detection and Discovery

    Science.gov (United States)

    Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus and Vaccinium) are known hosts for more than 70 viruses and new ones are identified continually. In modern berry cultivars, viruses tend to be asymptomatic in single infections and symptoms only develop after plants accumulate m...

  13. Next-Generation Sequencing of Elite Berry Germplasm and Data Analysis Using a Bioinformatics Pipeline for Virus Detection and Discovery

    Science.gov (United States)

    Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus and Vaccinium) are known hosts for more than 70 viruses and new ones are identified frequently. In modern berry cultivars, viruses tend to be asymptomatic in single infections and symptoms only develop after plants accumulate multip...

  14. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    Directory of Open Access Journals (Sweden)

    Kumar Ajay

    2012-01-01

    Full Text Available Background: The role of bioinformatics in this modern age of technological advancement cannot be overemphasized. Aim: This study reviews the principles, techniques, and applications of forensic bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: Sequence annotation and whole genome sequencing became possible through the adoption of software-based tools that handle the segregation of bulk genomic data. DNA profiling produces profiles, which are encrypted sets of numbers that reflect a person's DNA makeup and can also be used as the person's identifier. Automated analysis systems coupled with the latest computer-based software make the results easy to comprehend. Major applications of forensic bioinformatics in forensic science include quick, bulk and precise review of DNA evidence with the intent of finding and drawing attention to recurring problems, so that testing continues to become better and more reliable. At present, genetic counsellors also use information derived from genomic data to create pedigrees in cases of genetic disorders. Conclusion: Given the usefulness of forensic bioinformatics, a far greater commitment to openness and transparency and a greater availability of documents to public scrutiny are recommended.

  15. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.;

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

  16. Adapting bioinformatics curricula for big data

    OpenAIRE

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S; Jason H Moore

    2015-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these...

  17. An Integrated Bioinformatics Analysis Reveals Divergent Evolutionary Pattern of Oil Biosynthesis in High- and Low-Oil Plants

    OpenAIRE

    Zhang, Li; Wang, Shi-Bo; Li, Qi-Gang; Song, Jian; Hao, Yu-Qi; Zhou, Ling; Zheng, Huan-Quan; Jim M Dunwell; Zhang, Yuan-Ming

    2016-01-01

    Seed oils provide a renewable source of food, biofuel and industrial raw materials that is important for humans. Although many genes and pathways for acyl-lipid metabolism have been identified, little is known about whether there is a specific mechanism for high-oil content in high-oil plants. Based on the distinct differences in seed oil content between four high-oil dicots (20~50%) and three low-oil grasses (

  18. Bioinformatics analyses for signal transduction networks

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Research on signaling networks contributes to a deeper understanding of the life activities of organisms. With the development of experimental methods in the signal transduction field, more and more mechanisms of signaling pathways have been discovered. This paper introduces popular bioinformatics analysis methods for signaling networks, such as the common mechanisms of signaling pathways and database resources on the Internet; summarizes methods for analyzing the structural properties of networks, including structural motif finding and automated pathway generation; and discusses the modeling and simulation of signaling networks in detail, as well as the research situation and trends in this area. The investigation of signal transduction is now developing from small-scale experiments to large-scale network analysis, and dynamic simulation of networks is getting closer to the real system. As the investigation goes deeper than ever, the bioinformatics analysis of signal transduction will have immense room for development and application.

  19. Agile parallel bioinformatics workflow management using Pwrake

    Directory of Open Access Journals (Sweden)

    Tanaka Masahiro

    2011-09-01

    Full Text Available Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows.

  20. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    Science.gov (United States)

    Djordjevic, Marko

    2005-11-01

    Due to the rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid the design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding the regulatory circuits that control expression of genes. We propose a novel, biophysics-based algorithm for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information-theory-based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance, and determination, of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely
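
    The abstract gives no formulas. One common way to express saturation in a binding probability is a Fermi-Dirac-type occupancy, p = 1 / (1 + exp((E - mu) / kT)), in contrast with the Boltzmann-type weight exp(-(E - mu) / kT) that an additive weight-matrix score corresponds to. The Python sketch below assumes that form, an additive per-position energy model, and entirely hypothetical values (CONSENSUS, MISMATCH_ENERGY, mu); it is an illustration of the idea and not the algorithm of the thesis.

```python
import math

# Hypothetical 4-bp consensus site and a uniform mismatch penalty,
# both in units of kT. These are illustrative values, not fitted parameters.
CONSENSUS = "ACGT"
MISMATCH_ENERGY = 3.0


def binding_energy(site: str) -> float:
    """Additive binding energy: each mismatch to the consensus adds a penalty."""
    if len(site) != len(CONSENSUS):
        raise ValueError("site must have the same length as the consensus")
    return sum(0.0 if b == c else MISMATCH_ENERGY for b, c in zip(site, CONSENSUS))


def p_saturating(site: str, mu: float) -> float:
    """Occupancy with saturation: always between 0 and 1."""
    return 1.0 / (1.0 + math.exp(binding_energy(site) - mu))


def p_boltzmann(site: str, mu: float) -> float:
    """Unsaturated Boltzmann weight: can exceed 1 for strong sites."""
    return math.exp(-(binding_energy(site) - mu))


mu = 1.0  # hypothetical chemical potential, in units of kT
for site in ("ACGT", "ACGA", "TGCA"):
    print(site, p_saturating(site, mu), p_boltzmann(site, mu))
```

    In this sketch the two expressions nearly coincide for weak sites (E well above mu) and diverge for strong sites, where only the saturating form stays below 1; mu then acts as a natural binding threshold.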

  1. A bioinformatics insight to rhizobial globins: gene identification and mapping, polypeptide sequence and phenetic analysis, and protein modeling. [v1; ref status: indexed, http://f1000r.es/5ai]

    Directory of Open Access Journals (Sweden)

    Reinier Gesto-Borroto

    2015-05-01

    Full Text Available Globins (Glbs) are proteins widely distributed in organisms. Three evolutionary families have been identified in Glbs: the M, S and T Glb families. The M Glbs include flavohemoglobins (fHbs) and single-domain Glbs (SDgbs); the S Glbs include globin-coupled sensors (GCSs), protoglobins and sensor single domain globins, and the T Glbs include truncated Glbs (tHbs). Structurally, the M and S Glbs exhibit 3/3-folding whereas the T Glbs exhibit 2/2-folding. Glbs are widespread in bacteria, including several rhizobial genomes. However, only a few rhizobial Glbs have been characterized. Hence, we characterized Glbs from 62 rhizobial genomes using bioinformatics methods such as data mining in databases, sequence alignment, phenogram construction and protein modeling. Also, we analyzed soluble extracts from Bradyrhizobium japonicum USDA38 and USDA58 by (reduced + carbon monoxide (CO)) minus reduced differential spectroscopy. Database searching showed that only fhb, sdgb, gcs and thb genes exist in the rhizobia analyzed in this work. Promoter analysis revealed that apparently several rhizobial glb genes are not regulated by a -10 promoter but might be regulated by -35 and Fnr (fumarate-nitrate reduction regulator)-like promoters. Mapping analysis revealed that rhizobial fhbs and thbs are flanked by a variety of genes whereas several rhizobial sdgbs and gcss are flanked by genes coding for proteins involved in the metabolism of nitrates and nitrites and chemotaxis, respectively. Phenetic analysis showed that rhizobial Glbs segregate into the M, S and T Glb families, while structural analysis showed that predicted rhizobial SDgbs and fHbs and GCSs globin domain and tHbs fold into the 3/3- and 2/2-folding, respectively. Spectra from B. japonicum USDA38 and USDA58 soluble extracts exhibited peaks and troughs characteristic of bacterial and vertebrate Glbs, thus indicating that putative Glbs are synthesized in B. japonicum USDA38 and USDA58.

  2. BIOELECTRICAL IMPEDANCE VECTOR ANALYSIS IDENTIFIES SARCOPENIA IN NURSING HOME RESIDENTS

    Science.gov (United States)

    Loss of muscle mass and water shifts between body compartments are contributing factors to frailty in the elderly. The body composition changes are especially pronounced in institutionalized elderly. We investigated the ability of single-frequency bioelectrical impedance analysis (BIA) to identify b...

  3. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics.

  4. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756
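
    As a rough illustration of the MapReduce pattern this survey refers to, the following Python sketch counts k-mers in sequencing reads using separate map, shuffle and reduce steps. It runs in a single process and uses no Hadoop API; the function names and the two example reads are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read: str, k: int):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical reads; in a real cluster each read (or block of reads)
# would be mapped on a different worker.
reads = ["ACGTAC", "GTACGT"]
mapped = chain.from_iterable(map_kmers(read, 3) for read in reads)
counts = reduce_counts(shuffle(mapped))
print(counts)
```

    Because the map step treats each read independently and the reduce step treats each key independently, both steps can be distributed across machines, which is the property that frameworks such as Hadoop exploit.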

  5. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  6. Bioinformatics and Molecular Analysis of the Evolutionary Relationship between Bovine Rhinitis A Viruses and Foot-And-Mouth Disease Virus

    OpenAIRE

    Rai, Devendra K.; Paul Lawrence; Steve J. Pauszek; Piccone, Maria E.; Knowles, Nick J.; Elizabeth Rieder

    2016-01-01

    Bovine rhinitis viruses (BRVs) cause mild respiratory disease of cattle. In this study, a near full-length genome sequence of a virus named RS3X (formerly classified as bovine rhinovirus type 1), isolated from infected cattle from the UK in the 1960s, was obtained and analyzed. Compared to other closely related Aphthoviruses, major differences were detected in the leader protease (Lpro), P1, 2B, and 3A proteins. Phylogenetic analysis revealed that RS3X was a member of the species bovine rhini...

  7. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  8. Statistical approach, Sensory analysis, brief application of Bioinformatics Tool, Melanin, Allicin and Glucosinolate presence in Mango pulp for Pharmacological Benefits

    Directory of Open Access Journals (Sweden)

    Saranya Chitturi

    2013-06-01

    Full Text Available Information on important flavor components for fruit and vegetables is lacking and would be useful for breeders and molecular biologists. In this study, five acid treatments were formulated and the effects of citric acid (CA) and malic acid (MA) levels on the flavor perception of canned mango pulp (Mangifera indica L.) were evaluated. Pulp components were depicted in Rasmol V 2 7.1, visualizing pectin, melanin and allinase compounds as part of a brief bioinformatic analysis of the pulp. The presence of melanin, allicin and glucosinolate was assessed, and the variation in their % concentration across treatments was depicted. When the TSS and pH values were correlated using Pearson's correlation coefficient, Spearman's correlation and regression plots in statistical software, the two variables were found to be positively correlated. The alternate hypothesis H1 was accepted, with p value < 0.05, for the sensory quality estimation based on Larmond's 9-point hedonic scale sensory evaluation. The lowest level of allicin, about 0.14%, was found in T2, whereas the highest, about 4.28%, was in T3. The T5 treatment showed the lowest concentration of melanin, about 3.98%, and the highest, about 9.43%, was in T4. Glucosinolate concentrations also varied with the treatment administered: a low level of about 3.34% was observed in T3 and about 7.9% in T4. These findings may help in extending the shelf life and increasing the marketability of mango-based products.
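
    The record names Pearson's and Spearman's correlation but reports no raw data. The Python sketch below shows how the two coefficients can be computed from paired measurements; the TSS and pH values are hypothetical placeholders, and the helper names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired measurements for five treatments (not from the study).
tss = [14.0, 14.5, 15.1, 15.8, 16.2]
ph = [3.6, 3.7, 3.9, 4.0, 4.3]
print(pearson(tss, ph), spearman(tss, ph))
```

    A positive value from both functions would correspond to the positive TSS-pH relationship the abstract describes; a significance test would require an additional step not shown here.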

  9. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  10. Screening feature genes of astrocytoma using a combined method of microarray gene expression profiling and bioinformatics analysis.

    Science.gov (United States)

    Cai, Yong; Zhong, Xingming; Wang, Yiqi; Yang, Jianguo

    2015-01-01

    The aim of our study was to find feature genes associated with astrocytoma, and the correlated gene functions, that can distinguish cancer tissue from adjacent non-tumor astrocyte tissue. Gene expression profile GSE15824, which included 8 astrocytoma tissues and 3 adjacent non-tumor astrocyte samples, was downloaded from the Gene Expression Omnibus database. The raw data were first transformed into probe-level data, and the differentially expressed genes (DEGs) between astrocytoma tissues and normal specimens were identified using a t-test in the samr package of R. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was applied to analyze gene ontology (GO) enrichment of gene functions and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Finally, the corresponding protein-protein interaction (PPI) networks of the DEGs were constructed using Cytoscape, based on data collected from the STRING online database. A total of 3072 genes, including 1799 up-regulated and 1273 down-regulated genes, were filtered as DEGs. The DEGs included AQP4, PMP2, SRARCL1 and SLC1A2CAMs, among others, and AQP4 was most significantly related to cell osmotic pressure. Three feature genes in KEGG pathways were highly enriched in cancer specimens, while two were enriched in normal tissues. The discovery of feature genes significantly related to the regulation of cell osmotic pressure has potential clinical use in the future diagnosis of astrocytoma. In addition, it is of great significance for studying the mechanism of astrocytoma, distinguishing normal from cancer tissues, and exploring new treatments. However, further experiments are needed to confirm our results. PMID:26770395

  11. Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak

    Directory of Open Access Journals (Sweden)

    Léger Patrick

    2010-11-01

    traits. Comparative orthologous sequences (COS) with other plant gene models were identified and allow the oak paleo-history to be unraveled. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser http://genotoul-contigbrowser.toulouse.inra.fr:9092/Quercus_robur/index.html. Conclusions: This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations.

  12. Bioinformatics Approach in Plant Genomic Research.

    Science.gov (United States)

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have led to dramatic changes in plant biology research. Plant biologists now have easy access to enormous amounts of genomic data with which to study high-density genetic variation in plants at the molecular level. Fully understanding and skillfully using bioinformatics tools to manage and analyze these data are therefore essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, bioinformatics-based analytical methods are well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant upgrading of computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  13. Residue analysis of a CTL epitope of SARS-CoV spike protein by IFN-gamma production and bioinformatics prediction

    Directory of Open Access Journals (Sweden)

    Huang Jun

    2012-09-01

    Full Text Available Background: Severe acute respiratory syndrome (SARS) is an emerging infectious disease caused by the novel coronavirus SARS-CoV. The T cell epitopes of the SARS-CoV spike protein are well known, but no systematic evaluation of the functional and structural roles of each residue has been reported for these antigenic epitopes. Analysis of the functional importance of side chains by mutational study may exaggerate the effect by imposing a structural disturbance or an unusual steric, electrostatic or hydrophobic interaction. Results: We demonstrated that N50 could induce a significant IFN-gamma response from splenocytes of SARS-CoV S DNA-immunized mice, as measured by ELISA, ELISPOT and FACS. Moreover, S366-374 was predicted to be an optimal epitope by the bioinformatics tools ANN, SMM, ARB and BIMAS, and this was confirmed by the IFN-gamma response induced by a series of S358-374-derived peptides. Furthermore, each residue of S366-374 was replaced by alanine (A), lysine (K) or aspartic acid (D). ANN was used to estimate the binding affinity of single S366-374 mutants to H-2 Kd. Y367 and L374 were predicted to play the most important role in peptide binding. Additionally, these single-residue mutant peptides were synthesized, and IFN-gamma production induced by the G368, V369, A371, T372 and K373 mutants of S366-374 was markedly decreased. Conclusions: We demonstrated that S366-374 is an optimal H-2 Kd CTL epitope in the SARS-CoV S protein. Moreover, Y367, S370, and L374 are anchors in the epitope, while C366, G368, V369, A371, T372, and K373 may directly interact with TCR on the surface of CD8-T cells.
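
The single-residue substitution scan described in this record can be sketched as a short enumeration. The nine-residue sequence below is read off the residue labels given in the abstract (C366 through L374); the function name and the assumption that every position receives all three substitutions are illustrative choices, since the abstract does not spell out the exact mutant panel.

```python
# Sequence assembled from the residue labels in the abstract (C366 ... L374).
PEPTIDE = "CYGVSATKL"
START = 366  # spike-protein position of the first residue

def scan_mutants(peptide, start, substitutes=("A", "K", "D")):
    """List (name, sequence) for every single-residue substitution.

    Assumption for illustration: each position is replaced by each
    substitute in turn; substitutions that would not change the residue
    (for example alanine at A371) are skipped.
    """
    mutants = []
    for offset, wild_type in enumerate(peptide):
        for substitute in substitutes:
            if substitute == wild_type:
                continue
            name = f"{wild_type}{start + offset}{substitute}"
            sequence = peptide[:offset] + substitute + peptide[offset + 1:]
            mutants.append((name, sequence))
    return mutants

mutants = scan_mutants(PEPTIDE, START)
for name, sequence in mutants[:3]:
    print(name, sequence)
```

Each resulting sequence would then be scored by a binding predictor such as the ANN method the abstract names; that scoring step is not reproduced here.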

  14. Bioinformatic Analysis of FoxA1 Protein

    Institute of Scientific and Technical Information of China (English)

    赵小峰; 葛保健

    2012-01-01

    Transcription factor FoxA1 is a 'pioneer' factor that binds to chromatinized DNA, opening up DNA binding sites, and regulates cell signaling and the cell cycle. High expression of FoxA1 has been reported in various tumors, where it participates in the regulation of tumor growth, and it may be a potential therapeutic target in cancer. This study aimed to obtain more information about FoxA1. Starting from the FoxA1 gene and protein sequences, its structure and properties, its protein interaction network and a multiple sequence alignment were analyzed with software tools and databases. The bioinformatic analysis yielded more biological information about the FoxA1 protein, which lays a foundation for further research on its biological characteristics.

  15. Application Of Data Mining In Bioinformatics

    OpenAIRE

    KHALID RAZA

    2012-01-01

    This article highlights some of the basic concepts of bioinformatics and data mining. The major research areas of bioinformatics are highlighted. The application of data mining in the domain of bioinformatics is explained. It also highlights some of the current challenges and opportunities of data mining in bioinformatics.

  16. Use of Photogrammetry and Biomechanical Gait analysis to Identify Individuals

    DEFF Research Database (Denmark)

    Larsen, Peter Kastmand; Simonsen, Erik Bruun; Lynnerup, Niels

    Photogrammetry and recognition of gait patterns are valuable tools to help identify perpetrators based on surveillance recordings. We have found that stature but only few other measures have a satisfying reproducibility for use in forensics. Several gait variables with high recognition rates were found. Especially the variables located in the frontal plane are interesting due to large inter-individual differences in time course patterns. The variables with high recognition rates seem preferable for use in forensic gait analysis and as input variables to waveform analysis techniques such as principal component analysis resulting in marginal scores, which are difficult to interpret individually. Finally, a new gait model is presented based on functional principal component analysis with potentials for detecting individual gait patterns where time course patterns can be marginally interpreted...

  17. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    Full Text Available BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. METHODS: Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. RESULTS: The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). CONCLUSION: The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
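
The Kaplan-Meier method named in METHODS can be sketched in a few lines. This is a generic product-limit estimator under stated assumptions, not the authors' code: the function name is hypothetical, and the follow-up times and event flags are invented (1 marks an observed death, 0 a censored record).

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Invented follow-up data in months (NOT from the ALS database).
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
print(curve)
```

Computing one such curve per phenotypic class and comparing the curves (for example with a log-rank test) is the usual way a grouping is checked for prognostic relevance.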

  18. Parameter Trajectory Analysis to Identify Treatment Effects of Pharmacological Interventions

    OpenAIRE

    Tiemann, Christian A.; Vanlier, Joep; Oosterveer, Maaike H.; Albert K Groen; Hilbers, Peter A. J.; Natal A W van Riel

    2013-01-01

    The field of medical systems biology aims to advance understanding of molecular mechanisms that drive disease progression and to translate this knowledge into therapies to effectively treat diseases. A challenging task is the investigation of long-term effects of a (pharmacological) treatment, to establish its applicability and to identify potential side effects. We present a new modeling approach, called Analysis of Dynamic Adaptations in Parameter Trajectories (ADAPT), to analyze the long-t...

  19. SNPTrack™: an integrated bioinformatics system for genetic association studies

    Directory of Open Access Journals (Sweden)

    Xu Joshua

    2012-07-01

    Full Text Available A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting the genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm.

  20. Bioinformatic Analysis of the Nitrate Reductase Gene in Antarctic Ice Algae Chlamydomonas sp. ICE-L

    Institute of Scientific and Technical Information of China (English)

    林敏卓; 刘晨临; 黄晓航; 杨平平

    2012-01-01

    In addition to regulating nitrogen metabolism, nitrate reductase (NR) plays an important role in the adaptation of plants to abiotic stress. A full-length NR gene of the Antarctic ice alga Chlamydomonas sp. ICE-L was identified from a cDNA library and sequenced. The protein sequence encoded by the NR gene was investigated by bioinformatic analysis. Through sequence alignment, active sites in the ICE-L NR protein sequence that may be related to stress acclimation were identified. In addition, the tertiary structure of the ICE-L NR protein was predicted. The full-length Chlamydomonas sp. ICE-L NR gene contained an open reading frame of 2,589 bp encoding a nitrate reductase of 863 amino acids. Phylogenetic analysis showed that the gene was homologous to known green algal NRs, with identities of 63%, 61%, 60% and 54% to those of Volvox carteri, Chlamydomonas reinhardtii, Dunaliella tertiolecta and Chlorella vulgaris, respectively. Functional prediction analysis revealed that the NR sequence has 3 different functional domains, similar to those of higher plants. This bioinformatic analysis of the ICE-L NR gene will help further our understanding of, and expand research on, the mechanism by which the Antarctic ice alga Chlamydomonas acclimatizes to the extreme environment, from the perspective of the NR gene.

  1. Identifying Organizational Inefficiencies with Pictorial Process Analysis (PPA)

    Directory of Open Access Journals (Sweden)

    David John Patrishkoff

    2013-11-01

    Full Text Available Pictorial Process Analysis (PPA) was created by the author in 2004. PPA is a unique methodology which offers ten layers of additional analysis when compared to standard process mapping techniques. The goal of PPA is to identify and eliminate waste, inefficiencies and risk in manufacturing or transactional business processes at 5 levels in an organization. The highest level being assessed is the process management, followed by the process work environment, detailed work habits, process performance metrics and general attitudes towards the process. This detailed process assessment and analysis is carried out during process improvement brainstorming efforts and Kaizen events. PPA creates a detailed visual efficiency rating for each step of the process under review. A selection of 54 pictorial Inefficiency Icons (cards) is available for use to highlight major inefficiencies and risks that are present in the business process under review. These inefficiency icons were identified during the author's independent research on the topic of why things go wrong in business. This paper will highlight how PPA was developed and show the steps required to conduct Pictorial Process Analysis on a sample manufacturing process. The author has successfully used PPA to dramatically improve business processes in over 55 different industries since 2004.

  2. Rice Transcriptome Analysis to Identify Possible Herbicide Quinclorac Detoxification Genes

    Directory of Open Access Journals (Sweden)

    Wenying Xu

    2015-09-01

    Full Text Available Quinclorac is a highly selective auxin-type herbicide that is widely used for the effective control of barnyard grass in paddy rice fields, improving the world’s rice yield. A herbicidal mode of action has been proposed for quinclorac, and hormone interactions affect quinclorac signaling. Because of its widespread use, quinclorac may be transported outside rice fields with drainage water, leading to soil and water pollution and environmental health problems. In this study, we used the 57K Affymetrix rice whole-genome array to identify quinclorac signaling response genes in order to study the molecular mechanisms of action and detoxification of quinclorac in rice plants. Overall, 637 probe sets were identified with differential expression levels under either 6 or 24 h of quinclorac treatment. Auxin-related genes such as GH3 and OsIAAs responded to quinclorac treatment. Gene Ontology analysis showed that detoxification-related gene families were significantly enriched, including cytochrome P450, GST, UGT, and ABC and drug transporter genes. Moreover, real-time RT-PCR analysis showed that top candidate P450 families such as CYP81, CYP709C and CYP72A genes were universally induced by different herbicides. Some Arabidopsis genes of the same P450 families were up-regulated under quinclorac treatment. We conducted rice whole-genome GeneChip analysis and the first global identification of quinclorac response genes. This work may provide potential markers for detoxification of quinclorac and biomonitors of environmental chemical pollution.

  3. Proteogenomic Analysis Identifies a Novel Human SHANK3 Isoform

    Directory of Open Access Journals (Sweden)

    Fahad Benthani

    2015-05-01

    Full Text Available Mutations of the SHANK3 gene have been associated with autism spectrum disorder. Individuals harboring different SHANK3 mutations display considerable heterogeneity in their cognitive impairment, likely due to the high SHANK3 transcriptional diversity. In this study, we report a novel interaction between the Mutated in colorectal cancer (MCC) protein and a newly identified SHANK3 protein isoform in human colon cancer cells and mouse brain tissue. Hence, our proteogenomic analysis identifies a new human long isoform of the key synaptic protein SHANK3 that was not predicted by the human reference genome. Taken together, our findings describe a potential new role for MCC in neurons, a new human SHANK3 long isoform and, importantly, highlight the use of proteomic data towards the re-annotation of GC-rich genomic regions.

  4. Parameter trajectory analysis to identify treatment effects of pharmacological interventions.

    Directory of Open Access Journals (Sweden)

    Christian A Tiemann

    Full Text Available The field of medical systems biology aims to advance understanding of molecular mechanisms that drive disease progression and to translate this knowledge into therapies to effectively treat diseases. A challenging task is the investigation of long-term effects of a (pharmacological) treatment, to establish its applicability and to identify potential side effects. We present a new modeling approach, called Analysis of Dynamic Adaptations in Parameter Trajectories (ADAPT), to analyze the long-term effects of a pharmacological intervention. A concept of time-dependent evolution of model parameters is introduced to study the dynamics of molecular adaptations. The progression of these adaptations is predicted by identifying necessary dynamic changes in the model parameters to describe the transition between experimental data obtained during different stages of the treatment. The trajectories provide insight into the affected underlying biological systems and identify the molecular events that should be studied in more detail to unravel the mechanistic basis of treatment outcome. Modulating effects caused by interactions with the proteome and transcriptome levels, which are often less well understood, can be captured by the time-dependent descriptions of the parameters. ADAPT was employed to identify metabolic adaptations induced upon pharmacological activation of the liver X receptor (LXR), a potential drug target to treat or prevent atherosclerosis. The trajectories were investigated to study the cascade of adaptations. This provided a counter-intuitive insight concerning the function of scavenger receptor class B1 (SR-B1), a receptor that facilitates the hepatic uptake of cholesterol. Although activation of LXR promotes cholesterol efflux and excretion, our computational analysis showed that the hepatic capacity to clear cholesterol was reduced upon prolonged treatment. This prediction was confirmed experimentally by immunoblotting measurements of SR-B1

  5. Bioinformatics tools for analysing viral genomic data.

    Science.gov (United States)

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

  6. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  7. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    Science.gov (United States)

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes, including behavior-based profiles, will contribute to the transition from disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors is needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talk among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  8. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    OpenAIRE

    Obom, Kristina M.; Cummings, Patrick J.

    2009-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no signifi...

  9. Identifying clinical course patterns in SMS data using cluster analysis

    DEFF Research Database (Denmark)

    Kent, Peter; Kongsted, Alice

    2012-01-01

    ABSTRACT: BACKGROUND: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important...... clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis. CONCLUSIONS: This study...

  10. Mechanisms of childhood hemophagocytic lymphohistiocytosis: A bioinformatic analysis

    Institute of Scientific and Technical Information of China (English)

    欧丹艳; 袁媛; 罗建明

    2014-01-01

    Objective: Hemophagocytic lymphohistiocytosis (HLH) is a life-threatening condition characterized by excessive inflammation, with a high incidence in children and a death rate of 40%. This study analyzed the gene expression profile in childhood HLH and explored the important pathways of childhood HLH using bioinformatic methods. Methods: The childhood HLH gene expression profile dataset GSE26050 (peripheral blood mononuclear cells) was obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information. Differentially expressed genes were identified with the GEO2R online analysis tool. The key pathways of the differentially expressed genes were investigated using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis in the DAVID database. Results: A total of 184 genes with at least a 2-fold difference in expression were identified, 126 upregulated and 58 downregulated. They were enriched in 3 pathways: cytokine-cytokine receptor interaction, hematopoietic cell lineage and NOD-like receptor signaling. Conclusion: Bioinformatic tools allow the identification of the key genes and pathways associated with the development and progression of childhood HLH and point out potential directions for research on the mechanisms of childhood HLH.
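
The idea behind the pathway enrichment step can be illustrated with a plain hypergeometric tail probability. This is a generic sketch, not necessarily the statistic the enrichment tool used in the study reports: the function name is hypothetical, and apart from the 184 differentially expressed genes stated in the abstract, every number in the usage line (background size, pathway size, overlap) is an invented placeholder.

```python
from math import comb

def enrichment_p(total_genes, pathway_genes, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn at random from
    `total_genes`, of which `pathway_genes` belong to the pathway."""
    upper = min(pathway_genes, selected)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, selected)

# 184 DEGs comes from the abstract; 20000, 250 and 12 are placeholders.
p = enrichment_p(total_genes=20000, pathway_genes=250, selected=184, overlap=12)
print(p)
```

A small p-value means the pathway contains more of the selected genes than random sampling would be expected to produce; in practice the values are also corrected for the number of pathways tested.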

  11. Lidar point density analysis: implications for identifying water bodies

    Science.gov (United States)

    Worstell, Bruce B.; Poppenga, Sandra; Evans, Gayla A.; Prince, Sandra

    2014-01-01

    Most airborne topographic light detection and ranging (lidar) systems operate within the near-infrared spectrum. Laser pulses from these systems frequently are absorbed by water and therefore do not generate reflected returns on water bodies in the resulting void regions within the lidar point cloud. Thus, an analysis of lidar voids has implications for identifying water bodies. Data analysis techniques to detect reduced lidar return densities were evaluated for test sites in Blackhawk County, Iowa, and Beltrami County, Minnesota, to delineate contiguous areas that have few or no lidar returns. Results from this study indicated a 5-meter radius moving window with fewer than 23 returns (28 percent of the moving window) was sufficient for delineating void regions. Techniques to provide elevation values for void regions to flatten water features and to force channel flow in the downstream direction also are presented.
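
The moving-window test reported in this record (a 5-meter radius window holding fewer than 23 returns is treated as void) can be sketched as a brute-force count. The radius and threshold come from the abstract; the function name, the window-centre grid and the synthetic point cloud are assumptions made for the example, and a real workflow would use a spatial index or a raster rather than a nested loop.

```python
def void_centres(points, xs, ys, radius=5.0, min_returns=23):
    """Return the window centres (x, y) that have fewer than
    `min_returns` lidar returns within `radius` metres."""
    r2 = radius * radius
    voids = []
    for cx in xs:
        for cy in ys:
            count = sum(
                1 for px, py in points
                if (px - cx) ** 2 + (py - cy) ** 2 <= r2
            )
            if count < min_returns:
                voids.append((cx, cy))
    return voids

# Synthetic cloud: one return per metre on a 30 m x 30 m tile, with no
# returns where x >= 20 (standing in for a water body).
points = [(x, y) for x in range(30) for y in range(30) if x < 20]
voids = void_centres(points, xs=[10, 25], ys=[15])
print(voids)
```

Contiguous void centres can then be merged into polygons to delineate candidate water bodies.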

  12. Bioinformatics Analysis of SAUR Gene Family in Brassica rapa

    Institute of Scientific and Technical Information of China (English)

    赵敬会; 王瑞雪; 李荣冲; 梁晶龙; 张涛

    2012-01-01

    This study aimed to lay a foundation for future functional studies of the SAUR gene family of auxin response genes in Brassica rapa. The conserved motifs, amino acid isoelectric points, molecular evolution, expression patterns and other basic properties of the early auxin-responsive SAUR (small auxin-up RNA) gene family in Brassica rapa were analyzed using bioinformatics methods. Of the 143 SAUR proteins, 111 were alkaline, 31 were acidic and one was neutral. The phylogenetic tree showed that the SAUR gene family has divided into two major subfamilies, one of which is still differentiating. EST expression evidence was found for 61 SAUR genes, which were expressed at a wide range of sites. The analysis indicated that gene duplication is a major characteristic of the SAUR gene family: the 143 genes contained 51 homologous pairs.

  13. Bioinformatics Analysis of Glutathione Peroxidase in Zea Mays

    Institute of Scientific and Technical Information of China (English)

    张媛; 张钟仁; 咸丽霞; 邢国芳

    2013-01-01

    Glutathione peroxidase (GPX) is an important scavenger of reactive oxygen free radicals in living organisms. It removes hydrogen peroxide and lipid peroxides, blocks further damage to the body by reactive oxygen free radicals, and so ensures normal biological activity. In this study, the structure and function of the proteins encoded by the 11 members of the GPX gene family in Zea mays were analyzed, including isoelectric point, molecular weight, number of amino acids, hydrophilicity, secondary structure and subcellular localization, and a phylogenetic tree was built using a series of bioinformatics software tools. The results showed that the isoelectric points and molecular weights differed among these GPX genes, whereas their secondary structures shared similar characteristics, comprising α-helix, β-sheet, β-turn and random coil. These results lay a foundation for a comprehensive analysis of GPX function in Zea mays and provide a theoretical basis for research on plant resistance to oxidative stress.

  14. Generations of interdisciplinarity in bioinformatics

    Science.gov (United States)

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  15. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    Full Text Available INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  16. Use of discriminant analysis to identify propensity for purchasing properties

    Directory of Open Access Journals (Sweden)

    Ricardo Floriani

    2015-03-01

    Full Text Available Properties usually represent a milestone for people and families because of their high value relative to family income. The objective of this study is to propose a discrimination model, built by discriminant analysis, of people whose characteristics (according to independent variables) classify them as potential buyers of properties, and to identify the intended use of such property: housing, leisure (such as a cottage or beach house), and/or investment. Thus, the following research question is proposed: What are the characteristics that best describe the profile of people who intend to acquire properties? The study is justified by its economic relevance to the real estate industry and to players in the real estate market, who may develop products based on the profile of potential customers. Discriminant analysis was applied to data gathered by a questionnaire sent via e-mail; 334 responses were collected. The study showed that it is possible to identify the intention to acquire a property, as well as the purpose of the acquisition, whether housing or investment.

  17. Longitudinal Metagenomic Analysis of Hospital Air Identifies Clinically Relevant Microbes

    Science.gov (United States)

    King, Paula; Pham, Long K.; Waltz, Shannon; Sphar, Dan; Yamamoto, Robert T.; Conrad, Douglas; Taplitz, Randy; Torriani, Francesca

    2016-01-01

    We describe the sampling of sixty-three uncultured hospital air samples collected over a six-month period and analysis using shotgun metagenomic sequencing. Our primary goals were to determine the longitudinal metagenomic variability of this environment, identify and characterize genomes of potential pathogens and determine whether they are atypical to the hospital airborne metagenome. Air samples were collected from eight locations which included patient wards, the main lobby and outside. The resulting DNA libraries produced 972 million sequences representing 51 gigabases. Hierarchical clustering of samples by the most abundant 50 microbial orders generated three major nodes which primarily clustered by type of location. Because the indoor locations were longitudinally consistent, episodic relative increases in microbial genomic signatures related to the opportunistic pathogens Aspergillus, Penicillium and Stenotrophomonas were identified as outliers at specific locations. Further analysis of microbial reads specific for Stenotrophomonas maltophilia indicated homology to a sequenced multi-drug resistant clinical strain and we observed broad sequence coverage of resistance genes. We demonstrate that a shotgun metagenomic sequencing approach can be used to characterize the resistance determinants of pathogen genomes that are uncharacteristic for an otherwise consistent hospital air microbial metagenomic profile. PMID:27482891

  18. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis methods such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, based on the error backpropagation (EBP) algorithm and using a new sigmoid-logarithmic transfer function, was devised. Our results show that the newly trained ANN can recognize low-fluorescence patterns better than a conventional sigmoidal ANN. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  19. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract. Background: Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial, as incorrectly computed results may lead to wrong biological conclusions or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Our ability to perform systematic software testing is therefore greatly hindered. Results: We propose to use a novel software testing technique, metamorphic testing (MT), to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conforms to a set of domain-specific properties, called metamorphic relations (MRs), thus greatly increasing the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular, we show that MT is simple to implement and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion: This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large-scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work
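
    The idea of a metamorphic relation in the record above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `map_reads` is a toy exact-match read mapper, not SeqMap or GNLab, and the two relations are simple examples rather than the relations used in the paper.

```python
import random


def map_reads(genome, reads):
    """Toy exact-match read mapper (hypothetical stand-in, not SeqMap).

    Returns the set of (read, position) pairs for every exact occurrence.
    """
    hits = set()
    for read in reads:
        start = genome.find(read)
        while start != -1:
            hits.add((read, start))
            start = genome.find(read, start + 1)
    return hits


def mr_permutation(genome, reads, seed=0):
    """MR 1: shuffling the input reads must not change the set of hits."""
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    return map_reads(genome, reads) == map_reads(genome, shuffled)


def mr_extension(genome, reads, extra_read):
    """MR 2: adding a read may add hits but must never remove existing ones."""
    return map_reads(genome, reads) <= map_reads(genome, reads + [extra_read])


genome = "ACGTACGTTAGC"          # made-up reference sequence
reads = ["ACG", "TAG", "GGG"]    # made-up reads
print(mr_permutation(genome, reads))
print(mr_extension(genome, reads, "CGT"))
```

    Neither check needs to know the correct mapping for any single input; each only compares two related runs, which is the property that makes the technique usable when no test oracle is available.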

  20. Training Experimental Biologists in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Pedro Fernandes

    2012-01-01

    Full Text Available Bioinformatics, by its very nature, is devoted to a set of targets that constantly evolve. Training is probably the best response to the constant need for the acquisition of bioinformatics skills. It is interesting to assess the effects of training on the different groups of researchers that make use of it. While training bench experimentalists in the life sciences, we have observed changes in their attitudes to research that, if well exploited, can benefit the dialogue with professional bioinformaticians and influence the conduct of the research itself.

  1. A tyrosine-rich cell surface protein in the diatom Amphora coffeaeformis identified through transcriptome analysis and genetic transformation.

    Directory of Open Access Journals (Sweden)

    Matthias T Buhmann

    Full Text Available Diatoms are single-celled eukaryotic microalgae that are found in almost all aquatic ecosystems and are characterized by their intricately structured SiO2 (silica)-based cell walls. Diatoms with a benthic life style are capable of attaching to any natural or man-made submerged surface, thus contributing substantially to both microbial biofilm communities and economic losses through biofouling. Surface attachment of diatoms is mediated by a carbohydrate- and protein-based glue, yet no protein involved in diatom underwater adhesion has been identified so far. In the present work, we have generated a normalized transcriptome database from the model adhesion diatom Amphora coffeaeformis. Using an unconventional bioinformatics analysis we have identified five proteins that exhibit unique amino acid sequences resembling the amino acid composition of the tyrosine-rich adhesion proteins from mussel footpads. Establishing the first method for the molecular genetic transformation of A. coffeaeformis has enabled investigations into the function of one of these proteins, AC3362, through expression as a YFP fusion protein. Biochemical analysis and imaging by fluorescence microscopy revealed that AC3362 is not involved in adhesion, but rather plays a role in biosynthesis and/or structural stability of the cell wall. The methods established in the present study have paved the way for further molecular studies on the mechanisms of underwater adhesion and biological silica formation in the diatom A. coffeaeformis.

  2. A Critical Analysis of Anesthesiology Podcasts: Identifying Determinants of Success

    Science.gov (United States)

    Singh, Devin; Matava, Clyde

    2016-01-01

    Background Audio and video podcasts have gained popularity in recent years. Increasingly, podcasts are being used in the field of medicine as a tool to disseminate information. This format has multiple advantages including highly accessible creation tools, low distribution costs, and portability for the user. However, despite its ongoing use in medical education, there are no data describing factors associated with the success or quality of podcasts. Objective The goal of the study was to assess the landscape of anesthesia podcasts in Canada and develop a methodology for evaluating the quality of the podcast. To achieve our objective, we identified the scope of podcasts in anesthesia specifically, constructed an algorithmic model for measuring success, and identified factors linked to both successful podcasts and a peer-review process. Methods Independent reviewers performed a systematic search of anesthesia-related podcasts on iTunes Canada. Data and metrics recorded for each podcast included podcast’s authorship, number posted, podcast series duration, target audience, topics, and social media presence. Descriptive statistics summarized mined data, and univariate analysis was used to identify factors associated with podcast success and a peer-review process. Results Twenty-two podcasts related to anesthesia were included in the final analysis. Less than a third (6/22=27%) were still active. The median longevity of the podcasts’ series was just 13 months (interquartile range: 1-39 months). Anesthesiologists were the target audience for 77% of podcast series with clinical topics being most commonly addressed. We defined a novel algorithm for measuring success: Podcast Success Index. Factors associated with a high Podcast Success Index included podcasts targeting fellows (Spearman R=0.434; P=.04), inclusion of professional topics (Spearman R=0.456-0.603; P=.01-.03), and the use of Twitter as a means of social media (Spearman R=0.453;P=.03). In addition, more

  3. BioRuby: Bioinformatics software for the Ruby programming language

    NARCIS (Netherlands)

    Goto, N.; Prins, J.C.P.; Nakao, M.; Bonnal, R.; Aerts, J.; Katayama, A.

    2010-01-01

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it suppor

  4. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    Directory of Open Access Journals (Sweden)

    Kristina M. Obom

    2009-12-01

    Full Text Available The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation.

  5. Analysis of an Image Secret Sharing Scheme to Identify Cheaters

    Directory of Open Access Journals (Sweden)

    Jung-San Lee

    2010-09-01

    Full Text Available Secret image sharing mechanisms have been widely applied in the military, e-commerce, and communications fields. Zhao et al. recently introduced the concept of cheater detection into image sharing schemes. This functionality enables the image owner and authorized members to identify a cheater when reconstructing the secret image. Here, we provide an analysis of Zhao et al.'s method: an authorized participant is able to restore the secret image by him/herself, which contradicts the requirement of secret image sharing schemes. Although the participant relies on an exhaustive search to do so, simulation results show that it can be done within a reasonable time period.

  6. Identifying avian sources of faecal contamination using sterol analysis.

    Science.gov (United States)

    Devane, Megan L; Wood, David; Chappell, Andrew; Robson, Beth; Webster-Brown, Jenny; Gilpin, Brent J

    2015-10-01

    Discrimination of the source of faecal pollution in water bodies is an important step in the assessment and mitigation of public health risk. One tool for faecal source tracking is the analysis of faecal sterols which are present in faeces of animals in a range of distinctive ratios. Published ratios are able to discriminate between human and herbivore mammal faecal inputs but are of less value for identifying pollution from wildfowl, which can be a common cause of elevated bacterial indicators in rivers and streams. In this study, the sterol profiles of 50 avian-derived faecal specimens (seagulls, ducks and chickens) were examined alongside those of 57 ruminant faeces and previously published sterol profiles of human wastewater, chicken effluent and animal meatwork effluent. Two novel sterol ratios were identified as specific to avian faecal scats, which, when incorporated into a decision tree with human and herbivore mammal indicative ratios, were able to identify sterols from avian-polluted waterways. For samples where the sterol profile was not consistent with herbivore mammal or human pollution, avian pollution is indicated when the ratio of 24-ethylcholestanol/(24-ethylcholestanol + 24-ethylcoprostanol + 24-ethylepicoprostanol) is ≥0.4 (avian ratio 1) and the ratio of cholestanol/(cholestanol + coprostanol + epicoprostanol) is ≥0.5 (avian ratio 2). When avian pollution is indicated, further confirmation by targeted PCR specific markers can be employed if greater confidence in the pollution source is required. A 66% concordance between sterol ratios and current avian PCR markers was achieved when 56 water samples from polluted waterways were analysed.
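
    The two avian ratios and their thresholds quoted in the record above can be written out as a short Python sketch. The ratio definitions and the cut-offs (≥0.4 and ≥0.5) come from the abstract; the function names, the dictionary layout and the sample concentrations are assumptions made for illustration, and the check is only meaningful for samples already judged inconsistent with human or herbivore pollution.

```python
def ratio(numerator, *others):
    """Return numerator / (numerator + sum(others)); reject an empty profile."""
    total = numerator + sum(others)
    if total <= 0:
        raise ValueError("sterol concentrations must sum to a positive value")
    return numerator / total


def avian_ratios(s):
    """Compute avian ratio 1 and avian ratio 2 from a dict of concentrations."""
    r1 = ratio(s["24-ethylcholestanol"],
               s["24-ethylcoprostanol"],
               s["24-ethylepicoprostanol"])
    r2 = ratio(s["cholestanol"], s["coprostanol"], s["epicoprostanol"])
    return r1, r2


def avian_indicated(s):
    """True when both thresholds from the abstract are met."""
    r1, r2 = avian_ratios(s)
    return r1 >= 0.4 and r2 >= 0.5


# Made-up concentrations (any consistent unit), for illustration only.
sample = {
    "24-ethylcholestanol": 50.0,
    "24-ethylcoprostanol": 30.0,
    "24-ethylepicoprostanol": 20.0,
    "cholestanol": 60.0,
    "coprostanol": 30.0,
    "epicoprostanol": 10.0,
}
print(avian_ratios(sample), avian_indicated(sample))
```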

  7. Social network analysis in identifying influential webloggers: A preliminary study

    Science.gov (United States)

    Hasmuni, Noraini; Sulaiman, Nor Intan Saniah; Zaibidi, Nerda Zura

    2014-12-01

    In recent years, second-generation internet-based services such as weblogs have become an effective communication tool for publishing information on the Web. Weblogs have unique characteristics that deserve users' attention. Some webloggers see weblogs as an appropriate medium to initiate and expand a business. These webloggers, also known as direct profit-oriented webloggers (DPOWs), communicate and share knowledge with each other through social interaction. However, survivability is the main issue among DPOWs, and frequent communication with influential webloggers is one way to survive as a DPOW. This paper aims to understand the network structure and identify influential webloggers within the network. A proper understanding of the network structure can help show how information is exchanged among members and enhance survivability among DPOWs. Thirty DPOWs were involved in this study. Degree centrality and betweenness centrality measures from Social Network Analysis (SNA) were used to examine the strength of relations and identify influential webloggers within the network. Webloggers with the highest values of these measures are considered the most influential in the network.
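
    The two measures named in the record above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the five-node network and its labels are invented (the study involved 30 webloggers), degree centrality is normalized by n - 1, and betweenness is computed with Brandes' algorithm for an unweighted, undirected graph and left unnormalized.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def betweenness_centrality(graph):
    """Unnormalized betweenness (Brandes' algorithm, unweighted, undirected)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical weblogger network (adjacency lists), for illustration only.
network = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}
degree = degree_centrality(network)
between = betweenness_centrality(network)
print(max(degree, key=degree.get), max(between, key=between.get))
```

    In this toy network node B has both the highest degree and the highest betweenness, so it would be the candidate "influential weblogger" under either measure.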

  8. Bioinformatics interoperability: all together now !

    NARCIS (Netherlands)

    Meganck, B.; Mergen, P.; Meirte, D.

    2009-01-01

    The following text presents some personal ideas about the way (bio)informatics is heading, along with some examples of how our institution – the Royal Museum for Central Africa (RMCA) – is gearing up for these new times ahead. It tries to find the important trends amongst the buzzwords, and to demo

  9. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  10. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    Science.gov (United States)

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771

  11. Sequencing and bioinformatic analysis of the genome of Acinetobacter baumannii bacteriophage AB3

    Institute of Scientific and Technical Information of China (English)

    张劼; 刘茜; 甘丹

    2013-01-01

    Objective: To sequence the genome of Acinetobacter baumannii bacteriophage AB3, isolated by our group, and to analyze it bioinformatically in order to clarify its phylogenetic relationships. Methods: The genome of bacteriophage AB3 was sequenced using a shotgun library and contig assembly strategy. EditSeq, tRNAscan-SE, TRF, FindTerm, ORF finder, BPROM and GeneMarkTM were used to predict the general characteristics of the genome and the functions of its coding genes, and the evolution of the RNA polymerase gene was analyzed with Clustalx and phylip. Results: The genome of bacteriophage AB3 is a double-stranded DNA with a full length of 31,185 bp and a G+C content of 39.18%; it contains 28 predicted genes, 1 transcription terminator and 4 possible promoter sequences. Conclusion: Gene analysis and RNA polymerase gene evolution analysis indicate that bacteriophage AB3 is similar to bacteriophage AB1, and that both belong to the phiKMV-like viruses.
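
    The G+C content reported in the record above is a simple ratio, sketched here in Python. The function name and the example sequence are assumptions for illustration; only the genome length (31,185 bp) and the 39.18% figure come from the abstract, and the abstract does not say which tool computed the value.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Made-up 8-base sequence: 4 of 8 bases are G or C.
print(gc_content("ATGCGCAT"))

# Number of G+C bases implied by the reported figures (39.18% of 31,185 bp).
implied_gc_bases = round(0.3918 * 31185)
print(implied_gc_bases)
```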

  12. Application of bioinformatics in tropical medicine

    Institute of Scientific and Technical Information of China (English)

    Wiwanitkit V

    2008-01-01

    Bioinformatics is the use of information technology to help solve biological problems by designing novel and incisive algorithms and methods of analysis. It has become a vital discipline in the post-genomics era. In this review article, the application of bioinformatics in tropical medicine is presented and discussed.

  13. Directional reflectance analysis for identifying counterfeit drugs: Preliminary study.

    Science.gov (United States)

    Wilczyński, Sławomir; Koprowski, Robert; Błońska-Fajfrowska, Barbara

    2016-05-30

    The WHO estimates that up to 10% of drugs on the market may be counterfeit. To keep drug counterfeiting from intensifying, methods for distinguishing genuine medicines from fake ones need to be developed. The aim of this study was to develop a simple, reproducible and inexpensive method for distinguishing between original and counterfeit medicines based on the measurement of directional reflectance. The directional reflectance of 6 original Viagra® tablets (Pfizer) and 24 counterfeit tablets imitating Viagra® (4 different batches) was examined in six spectral bands: from 0.9 to 1.1 μm, from 1.9 to 2.6 μm, from 3.0 to 4.0 μm, from 3.0 to 5.0 μm, from 4.0 to 5.0 μm, and from 8.0 to 12.0 μm, and for two angles of incidence, 20° and 60°. A directional hemispherical reflectometer was used to measure directional reflectance. Statistically significant differences between the directional reflectance of the original Viagra® and counterfeit tablets were registered. Any difference in the value of directional reflectance for any spectral band or angle of incidence identifies the drug as a fake one. The proposed method of directional reflectance analysis makes it possible to differentiate between real Viagra® and fake tablets. Directional reflectance analysis is a fast (measurement time under 5 s), cheap and reproducible method which does not require expensive equipment or specialized laboratory staff. It also appears to be effective; however, its effectiveness will be assessed once the research has been extended. PMID:26977587

  14. Bioinformatics decoding the genome

    CERN Document Server

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA From Genome to Sequence : Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  15. Review of bioinformatics data analysis in alternative splicing

    Institute of Scientific and Technical Information of China (English)

    章天骄

    2012-01-01

    Alternative pre-mRNA splicing is an important gene regulation mechanism for expanding proteomic diversity in higher eukaryotes. Misregulation of alternative splicing underlies many human diseases. With the development of high-throughput technologies, bioinformatics has become the main approach to studying alternative splicing. This article summarizes the bioinformatics methods used in alternative splicing research, and analyzes and predicts the future direction of the field.

  16. A Sensitivity Analysis Approach to Identify Key Environmental Performance Factors

    Directory of Open Access Journals (Sweden)

    Xi Yu

    2014-01-01

    Full Text Available Life cycle assessment (LCA) has been widely used in the design phase over the last two decades to reduce a product's environmental impacts across the whole product life cycle (PLC). Traditional LCA is restricted to assessing the environmental impacts of a product, and its results cannot reflect the effects of changes within the life cycle. To improve the quality of ecodesign, there is a growing need for an approach that reflects the relationship between changes in design parameters and the product's environmental impacts. A sensitivity analysis approach based on LCA and ecodesign is proposed in this paper. The key environmental performance factors that have a significant influence on a product's environmental impacts can be identified by analyzing the relationship between environmental impacts and the design parameters. Users without much environmental knowledge can use this approach to determine which design parameter should be considered first when (re)designing a product. A printed circuit board (PCB) case study is conducted, in which eight design parameters are analyzed with our approach. The result shows that the carbon dioxide emission during PCB manufacture is highly sensitive to the area of the PCB panel.
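
    One common way to rank design parameters by influence is a one-at-a-time relative sensitivity, sketched below in Python. This is not the paper's method, which the abstract does not specify: the emission model, its coefficients, the parameter names and the design point are all invented for illustration.

```python
def relative_sensitivity(model, params, name, step=0.01):
    """Normalized one-at-a-time sensitivity: (dE / E) / (dp / p).

    Perturbs a single parameter by `step` (a fraction of its value) and
    reports the relative change in output per relative change in input.
    Assumes the baseline output and the parameter value are non-zero.
    """
    base = model(params)
    bumped = dict(params)
    bumped[name] = params[name] * (1.0 + step)
    return (model(bumped) - base) / base / step


def toy_co2_model(p):
    # Hypothetical emission model, invented for illustration only:
    # emissions grow with panel area and, weakly, with board thickness.
    return 3.0 * p["panel_area"] + 0.5 * p["thickness"] + 10.0


design = {"panel_area": 10.0, "thickness": 2.0}  # made-up design point
ranking = sorted(
    design,
    key=lambda name: abs(relative_sensitivity(toy_co2_model, design, name)),
    reverse=True,
)
print(ranking)
```

    The parameter at the head of `ranking` is the one a designer would look at first; with a real LCA model in place of `toy_co2_model`, the same loop would cover all eight design parameters of the case study.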

  17. Performance Analysis: Work Control Events Identified January - August 2010

    Energy Technology Data Exchange (ETDEWEB)

    De Grange, C E; Freeman, J W; Kerr, C E; Holman, G; Marsh, K; Beach, R

    2011-01-14

    This performance analysis evaluated 24 events that occurred at LLNL from January through August 2010. The analysis identified areas of potential work control process and/or implementation weaknesses and several common underlying causes. Human performance improvement and safety culture factors were part of the causal analysis of each event and were analyzed. The collective significance of all events in 2010, as measured by the occurrence reporting significance category and by the proportion of events that have been reported to the DOE ORPS under the "management concerns" reporting criteria, does not appear to have increased in 2010. The frequency of reporting in each of the significance categories has not changed in 2010 compared to the previous four years. There is no change indicating a trend in the significance category and there has been no increase in the proportion of occurrences reported in the higher significance category. Also, the frequency of events, 42 events reported through August 2010, is not greater than in previous years and is below the average of 63 occurrences per year at LLNL since 2006. Over the previous four years, an average of 43% of LLNL's reported occurrences have been reported as either "management concerns" or "near misses." In 2010, 29% of the occurrences have been reported as "management concerns" or "near misses." This rate indicates that LLNL is now reporting fewer "management concern" and "near miss" occurrences compared to the previous four years. From 2008 to the present, LLNL senior management has undertaken a series of initiatives to strengthen the work planning and control system with the primary objective to improve worker safety. In 2008, the LLNL Deputy Director established the Work Control Integrated Project Team to develop the core requirements and graded

  18. Bioinformatics analysis of the BRX gene family in grape

    Institute of Scientific and Technical Information of China (English)

    李文芳; 陈佰鸿; 毛娟; 马宗桓; 杨世茂

    2015-01-01

    The BRX gene family is a class of plant-specific transcription factors that plays an important role in regulating cell proliferation and root elongation in Arabidopsis. Using bioinformatics approaches, the BRX gene family in the grape genome was cloned in silico, and its genome localization, protein structure, physicochemical characteristics, secondary structure and subcellular localization were predicted and analyzed. The evolutionary relationships with BRX genes from other plants were also predicted. Genome mapping showed that the 6 BRX genes in the grape genome are located on 3 chromosomes: VvBRX1 and VvBRX2 on chromosome 2, VvBRX3 and VvBRX4 on chromosome 9, and VvBRX5 and VvBRX6 on chromosome 11. The encoded proteins contain 360-560 amino acids; VvBRX5 has the largest relative molecular weight (61 884.4) and pI (9.38), while VvBRX1 has the smallest relative molecular weight (40 239.1) and pI (6.23). The amino acid sequences differ somewhat among members, but all are hydrophobic proteins. The 6 BRX amino acid sequences consist mainly of alpha helix and random coil and have no transmembrane domains or signal peptides. Gene structure analysis showed that all 6 BRX genes contain exons and introns. Subcellular localization analysis showed that the products of all six VvBRX genes are predicted to localize to the nucleus. Phylogenetic analysis showed that VvBRX1 and VvBRX2 are most closely related to Populus euphratica, with 96% homology. VvBRX3 and VvBRX4 clustered with Ricinus communis, Jatropha curcas, Citrus sinensis, Theobroma cacao and Glycine max, indicating close evolutionary relationships. VvBRX5 was clearly separated from the other VvBRX genes. VvBRX6 is most closely related to Nelumbo nucifera. These experimental results provide a significant foundation for further research

  19. Bioinformatics analysis of the BRX gene family in grape

    Institute of Scientific and Technical Information of China (English)

    李文芳; 陈佰鸿; 毛娟; 马宗桓; 杨世茂

    2015-01-01

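    The BRX gene family is a plant-specific family of transcription factors that, in Arabidopsis, is involved in regulating the proliferation and elongation of root cells. Using bioinformatics methods, the BRX gene family in the grape genome was cloned in silico; its genome localization, protein structure, physicochemical properties, secondary structure and subcellular localization were predicted and analyzed, and its evolutionary relationships with other plants were examined. Genome mapping showed that the 6 BRX genes in the grape genome are distributed on 3 chromosomes: VvBRX1 and VvBRX2 on chromosome 2, VvBRX3 and VvBRX4 on chromosome 9, and VvBRX5 and VvBRX6 on chromosome 11. The encoded proteins contain 360-560 amino acids; VvBRX5 has the highest relative molecular weight (61 884.4) and theoretical isoelectric point (9.38), and VvBRX1 has the lowest (40 239.1 and 6.23). The members differ somewhat in amino acid number and sequence, but all are hydrophobic proteins; α-helix and random coil are the main components of the 6 BRX amino acid sequences, and none has a transmembrane domain or signal peptide. Gene structure analysis showed that all 6 BRX genes contain exons and introns. Subcellular localization analysis indicated that all 6 VvBRX genes localize to the nucleus. Phylogenetic analysis showed that VvBRX1 and VvBRX2 are most closely related to Populus euphratica, with a similarity of 96%; VvBRX3 and VvBRX4 cluster with castor bean, Jatropha, citrus, cacao and soybean, indicating a close evolutionary relationship; VvBRX5 is clearly separated from the other VvBRX genes; and VvBRX6 is most closely related to lotus. The results lay a foundation for the cloning and functional analysis of the grape BRX gene family.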

  20. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    Science.gov (United States)

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all completely sequenced bacterial genomes, followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functionally related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to possess a specific HSD activity, we could annotate the corresponding HSD protein. The dominating phyla that were identified to express HSDs were Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryarchaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbon-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  1. CROSSWORK for Glycans: Glycan Identification Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

      We have developed "GLYCANthrope" - CROSSWORKS for glycans: a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  2. Application of Bioinformatics and Systems Biology in Medicinal Plant Studies

    Institute of Scientific and Technical Information of China (English)

    DENG You-ping; AI Jun-mei; XIAO Pei-gen

    2010-01-01

    One important purpose of investigating medicinal plants is to understand the genes and enzymes that govern the metabolic processes producing bioactive compounds. Genome-wide high-throughput technologies such as genomics, transcriptomics, proteomics and metabolomics can help reach that goal. Such technologies produce vast amounts of data, which need bioinformatics and systems biology to be processed, managed, distributed and understood. By dealing with the "omics" data, bioinformatics and systems biology can also help improve the quality of traditional medicinal materials, develop new approaches for the classification and authentication of medicinal plants, identify new active compounds, and cultivate medicinal plant species that tolerate harsh environmental conditions. In this review, the application of bioinformatics and systems biology in medicinal plants is briefly introduced.

  3. An introduction to proteome bioinformatics.

    Science.gov (United States)

    Jones, Andrew R; Hubbard, Simon J

    2010-01-01

    This book is part of the Methods in Molecular Biology series, and provides a general overview of computational approaches used in proteome research. In this chapter, we give an overview of the scope of the book in terms of current proteomics experimental techniques and the reasons why computational approaches are needed. We then give a summary of each chapter, which together provide a picture of the state of the art in proteome bioinformatics research.

  4. The Bioinformatic Analysis of the Cervical Carcinoma-Related blcap Gene

    Institute of Scientific and Technical Information of China (English)

    刘娟; 熊金虎; 伍欣星

    2004-01-01

    BLCAP is a candidate suppressor gene for cervical carcinoma, found by analysing cervical carcinoma specimens with an oncogene and anti-oncogene cDNA microarray. Based on bioinformatic analyses, we try to predict the function of the blcap gene. The results show that several genes closely resemble blcap: the sequence similarity between blcap and Homo sapiens mRNA (DKFZp564M053) or BC10 is 99% and 87%, respectively. The protein encoded by BLCAP is composed of Leu (19.5%), Pro (9.19%), Ser (8.04%), Cys (8.04%) and other amino acids. The secondary structure of the N-terminal region of the encoded protein is an alpha helix, the C-terminal region is a beta sheet, and the middle region is coil. The terminal regions are more hydrophobic than the middle region. There is a transmembrane region between residues 45 and 55; we therefore predict that BLCAP is a type I transmembrane protein. Analysis of the signal peptide and processing of the blcap product with SignalP (V1.1) found a cleavage site at residues 59-66. Using NetPhos, we predicted three possible phosphorylation sites at residues 68, 73 and 78. At residues 78-81 we found a typical [ST]-x(2)-[DE] motif, described as a phosphorylation site of tyrosine protein kinase, which might be related to its function. These bioinformatic studies of blcap provide a foundation for laboratory research on the function of BLCAP.
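
    The two sequence-level steps reported in this abstract, amino acid composition and a scan for the [ST]-X-X-[DE] pattern, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy sequence are hypothetical, the toy sequence is not the BLCAP protein, and the record does not say which software the authors used for these two steps.

```python
import re
from collections import Counter

# The pattern quoted in the abstract as a regular expression:
# Ser/Thr, then any two residues, then Asp/Glu.
# The lookahead lets overlapping hits be reported as well.
MOTIF = re.compile(r"(?=([ST]..[DE]))")


def find_motif_sites(seq):
    """Return (1-based start position, matched residues) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


def composition_percent(seq):
    """Return the percentage of each residue type in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: round(100.0 * n / len(seq), 2) for aa, n in counts.items()}


# Hypothetical toy peptide, used only to exercise the functions.
toy = "MLSPAEKTCCDLLLSGGE"
sites = find_motif_sites(toy)
leu_percent = composition_percent(toy)["L"]
```

    On a real protein sequence the positions returned by `find_motif_sites` would be compared with the residue ranges given in the abstract; a pattern hit alone does not show that a site is phosphorylated.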

  5. Electronic cloning and bioinformatics analysis of the pig LBP gene

    Institute of Scientific and Technical Information of China (English)

    喻礼怀; 王靖; 傅聪; 顾雯雯; 王龙

    2012-01-01

    The lipopolysaccharide-binding protein (LBP) gene participates in the inflammatory response to LPS and is an important immune-related gene. Using electronic cloning, the pig LBP gene was cloned with the human LBP gene (NM_004139) as the seed sequence. The open reading frame of the cloned LBP gene was 1 446 bp, encoding 481 amino acids. The deduced amino acid sequence showed similarity to those of human (74.2%), mouse (64.8%), rat (63.2%), cattle (77.1%) and dog (74.6%). Phylogenetic tree analysis showed that the pig LBP gene was closest to that of cattle and furthest from that of rat. Bioinformatics analysis showed that the molecular weight of the LBP protein was 53.0367 kDa and the theoretical isoelectric point was 6.43. Subcellular localization prediction indicated that LBP is a secreted protein, with predicted localization to the mitochondria (44.4%) and Golgi apparatus (22.2%). A signal peptide was present at the N-terminus, with a possible cleavage site between amino acids 25 and 26. Hydrophobicity was higher at the N-terminus, within the signal peptide. The maximum hydrophobic value was 2.478 and the maximum hydrophilic value was 1.956. The predicted membrane topology comprised an extracellular domain (amino acids 1-6), a transmembrane region (amino acids 7-29) and an intracellular domain (amino acids 30-481). There were nine Ser, eight Thr and two Tyr residues that might be protein kinase phosphorylation sites. The extracellular region of the LBP protein showed an arch-shaped helical structure, consisting of many α-helices on the inside and β-sheets on the outside of the arc, arranged in parallel and alternately. Conserved BPI1 and BPI2 domains were found at amino acid residues 33-256 and 271-474.
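
    The hydrophobicity values quoted in this abstract are the kind of output a sliding-window hydropathy profile produces. The abstract does not name the scale or window size, so the sketch below assumes the Kyte-Doolittle scale and a 9-residue window; both are assumptions, the function names are hypothetical, and the toy peptide is not the LBP sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the abstract).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def gravy(seq):
    """Grand average of hydropathy over the whole sequence."""
    return sum(KD[aa] for aa in seq.upper()) / len(seq)


# Hypothetical toy peptide: a hydrophobic stretch followed by a charged one.
profile = hydropathy_profile("LLLLLKKKKK", window=5)
most_hydrophobic = max(profile)
most_hydrophilic = min(profile)
```

    In a profile like this, a hydrophobic N-terminal signal peptide shows up as a run of high positive window means at the start of the sequence.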

  6. Bioinformatics and Microarray Analysis of miRNAs in Aged Female Mice Model Implied New Molecular Mechanisms for Impaired Fracture Healing

    Science.gov (United States)

    He, Bing; Zhang, Zong-Kang; Liu, Jin; He, Yi-Xin; Tang, Tao; Li, Jie; Guo, Bao-Sheng; Lu, Ai-Ping; Zhang, Bao-Ting; Zhang, Ge

    2016-01-01

    Impaired fracture healing in aged females is still a challenge in clinics. MicroRNAs (miRNAs) play important roles in fracture healing. This study aims to identify the miRNAs that potentially contribute to the impaired fracture healing in aged females. Transverse femoral shaft fractures were created in adult and aged female mice. At post-fracture 0-, 2- and 4-week, the fracture sites were scanned by micro computed tomography to confirm that the fracture healing was impaired in aged female mice and the fracture calluses were collected for miRNA microarray analysis. A total of 53 significantly differentially expressed miRNAs and 5438 miRNA-target gene interactions involved in bone fracture healing were identified. A novel scoring system was designed to analyze the miRNA contribution to impaired fracture healing (RCIFH). Using this method, 11 novel miRNAs were identified to impair fracture healing at 2- or 4-week post-fracture. Thereafter, function analysis of target genes was performed for miRNAs with high RCIFH values. The results showed that high RCIFH miRNAs in aged female mice might impair fracture healing not only by down-regulating angiogenesis-, chondrogenesis-, and osteogenesis-related pathways, but also by up-regulating osteoclastogenesis-related pathway, which implied the essential roles of these high RCIFH miRNAs in impaired fracture healing in aged females, and might promote the discovery of novel therapeutic strategies. PMID:27527150

  7. Bioinformatics Analysis Reveals Distinct Molecular Characteristics of Hepatitis B-Related Hepatocellular Carcinomas from Very Early to Advanced Barcelona Clinic Liver Cancer Stages.

    Directory of Open Access Journals (Sweden)

    Fan-Yun Kong

    Full Text Available Hepatocellular carcinoma (HCC) is the fifth most common malignancy associated with high mortality. One of the risk factors for HCC is chronic hepatitis B virus (HBV) infection. The treatment strategy for the disease is dependent on the stage of HCC, and the Barcelona clinic liver cancer (BCLC) staging system is used in most HCC cases. However, the molecular characteristics of HBV-related HCC in different BCLC stages are still unknown. Using GSE14520 microarray data from HBV-related HCC cases with BCLC stages from 0 (very early stage) to C (advanced stage) in the gene expression omnibus (GEO) database, differentially expressed genes (DEGs), including common DEGs and unique DEGs in different BCLC stages, were identified. These DEGs were located on different chromosomes. The molecular functions and biological pathways of DEGs were identified by gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and the interactome networks of DEGs were constructed using the NetVenn online tool. The results revealed that both common DEGs and stage-specific DEGs were associated with various molecular functions and were involved in special biological pathways. In addition, several hub genes were found in the interactome networks of DEGs. The identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of HBV-related HCC through the different BCLC stages, and might be used as staging biomarkers or molecular targets for the treatment of HCC with HBV infection.

  8. Bioinformatics Analysis Reveals Distinct Molecular Characteristics of Hepatitis B-Related Hepatocellular Carcinomas from Very Early to Advanced Barcelona Clinic Liver Cancer Stages

    Science.gov (United States)

    Hu, Wei; Kou, Yan-Bo; You, Hong-Juan; Liu, Xiao-Mei; Zheng, Kui-Yang; Tang, Ren-Xian

    2016-01-01

    Hepatocellular carcinoma (HCC) is the fifth most common malignancy associated with high mortality. One of the risk factors for HCC is chronic hepatitis B virus (HBV) infection. The treatment strategy for the disease is dependent on the stage of HCC, and the Barcelona clinic liver cancer (BCLC) staging system is used in most HCC cases. However, the molecular characteristics of HBV-related HCC in different BCLC stages are still unknown. Using GSE14520 microarray data from HBV-related HCC cases with BCLC stages from 0 (very early stage) to C (advanced stage) in the gene expression omnibus (GEO) database, differentially expressed genes (DEGs), including common DEGs and unique DEGs in different BCLC stages, were identified. These DEGs were located on different chromosomes. The molecular functions and biological pathways of DEGs were identified by gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and the interactome networks of DEGs were constructed using the NetVenn online tool. The results revealed that both common DEGs and stage-specific DEGs were associated with various molecular functions and were involved in special biological pathways. In addition, several hub genes were found in the interactome networks of DEGs. The identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of HBV-related HCC through the different BCLC stages, and might be used as staging biomarkers or molecular targets for the treatment of HCC with HBV infection. PMID:27454179
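
    The split into "common" and "unique" DEGs described in this abstract comes down to set intersection and set difference across the per-stage gene lists. The sketch below shows that step only; the stage labels follow the abstract, but the gene names are placeholders rather than results from GSE14520, and the function name is hypothetical.

```python
def split_common_and_unique(degs_by_stage):
    """Return (genes shared by all stages, {stage: genes seen only there})."""
    gene_sets = {stage: set(genes) for stage, genes in degs_by_stage.items()}
    common = set.intersection(*gene_sets.values())
    unique = {}
    for stage, genes in gene_sets.items():
        # Union of every other stage's DEGs, then remove them from this stage.
        others = set().union(
            *(g for other, g in gene_sets.items() if other != stage)
        )
        unique[stage] = genes - others
    return common, unique


# Placeholder gene names (hypothetical), keyed by BCLC stage.
example = {
    "0": {"GENE_1", "GENE_2", "GENE_3"},
    "A": {"GENE_1", "GENE_3", "GENE_4"},
    "B": {"GENE_1", "GENE_5"},
    "C": {"GENE_1", "GENE_4", "GENE_6"},
}
common_degs, stage_specific_degs = split_common_and_unique(example)
```

    Genes that appear in some but not all stages (here `GENE_3` and `GENE_4`) fall into neither group under this strict definition.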

  9. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    Science.gov (United States)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty with regard to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five case studies of undergraduate bioinformatics degrees, educational implications of bioinformatics as a technoscience, and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  10. Probabilistic models and machine learning in structural bioinformatics

    DEFF Research Database (Denmark)

    Hamelryck, Thomas

    2009-01-01

    Structural bioinformatics is concerned with the molecular structure of biomacromolecules on a genomic scale, using computational methods. Classic problems in structural bioinformatics include the prediction of protein and RNA structure from sequence, the design of artificial proteins or enzymes, and the automated analysis and comparison of biomacromolecules in atomic detail. The determination of macromolecular structure from experimental data (for example coming from nuclear magnetic resonance, X-ray crystallography or small angle X-ray scattering) has close ties with the field of structural bioinformatics... and experimental determination of macromolecular structure that are based on such methods. These developments include generative models of protein structure, the estimation of the parameters of energy functions that are used in structure prediction, the superposition of macromolecules and structure determination...

  11. Approaches in integrative bioinformatics towards the virtual cell

    CERN Document Server

    Chen, Ming

    2014-01-01

    Approaches in Integrative Bioinformatics provides a basic introduction to biological information systems, as well as guidance for the computational analysis of systems biology. This book also covers a range of issues and methods that reveal the multitude of omics data integration types and the relevance that integrative bioinformatics has today. Topics include biological data integration and manipulation, modeling and simulation of metabolic networks, transcriptomics and phenomics, and virtual cell approaches, as well as a number of applications of network biology. It helps to illustrat

  12. Bioinformatics-driven identification and examination of candidate genes for non-alcoholic fatty liver disease.

    Directory of Open Access Journals (Sweden)

    Karina Banasik

    Full Text Available OBJECTIVE: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. RESEARCH DESIGN AND METHODS: By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression, a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs), which captured all common variation in these genes, were genotyped in 10,196 Danes and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D), central obesity, and WHO-defined metabolic syndrome (MetS). RESULTS: 273 genes were included in the protein-protein interaction analysis and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominally statistically significant associations (P<0.05) to quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations. CONCLUSIONS: Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.
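
    The statement that Bonferroni adjustment negated all nominal associations can be illustrated with a short calculation: each P value is multiplied by the number of tests and capped at 1 before being compared with the significance threshold. The P values below are made up for illustration and are not taken from the study, which does not report the number of tests in this abstract; the function name is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (raw p, adjusted p, still significant?) for each test."""
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)  # multiply by the number of tests, cap at 1
        results.append((p, adjusted, adjusted < alpha))
    return results


# Hypothetical P values: three are nominally significant (P < 0.05).
nominal = [0.02, 0.04, 0.03, 0.5]
adjusted_results = bonferroni(nominal)
survivors = [raw for raw, adj, significant in adjusted_results if significant]
```

    With these four hypothetical tests none of the nominal hits survives, which is the pattern the abstract reports; with many more tests the correction becomes stricter still.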

  13. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    Science.gov (United States)

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
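
    The first step the web tool is described as performing, finding the genes that carry a submitted cis-sequence in a defined promoter region, can be sketched as a plain substring search on both DNA strands. This is not the PathoPlant implementation: the function names and the promoter sequences are hypothetical, and only the WT-box sequence CGACTTTT is taken from the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def genes_with_motif(promoters, motif="CGACTTTT"):
    """Return sorted gene IDs whose promoter has the motif on either strand."""
    forward = motif.upper()
    reverse = reverse_complement(forward)
    hits = []
    for gene, seq in promoters.items():
        seq = seq.upper()
        if forward in seq or reverse in seq:
            hits.append(gene)
    return sorted(hits)


# Hypothetical promoter fragments keyed by made-up gene IDs.
promoters = {
    "geneA": "TTACGACTTTTGG",  # WT-box on the forward strand
    "geneB": "GGAAAAGTCGCC",   # WT-box on the reverse strand
    "geneC": "ATATATATAT",     # no WT-box
}
wt_box_genes = genes_with_motif(promoters)
```

    The resulting gene list is what would then be compared against expression data to rank stress conditions, a step not sketched here.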

  14. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  15. Integrative Functional Genomics Analysis of Sustained Polyploidy Phenotypes in Breast Cancer Cells Identifies an Oncogenic Profile for GINS2

    Directory of Open Access Journals (Sweden)

    Juha K. Rantala

    2010-11-01

    Full Text Available Aneuploidy is among the most obvious differences between normal and cancer cells. However, mechanisms contributing to development and maintenance of aneuploid cell growth are diverse and incompletely understood. Functional genomics analyses have shown that aneuploidy in cancer cells is correlated with diffuse gene expression signatures and that aneuploidy can arise by a variety of mechanisms, including cytokinesis failures, DNA endoreplication, and possibly through polyploid intermediate states. To identify molecular processes contributing to development of aneuploidy, we used a cell spot microarray technique to identify genes inducing polyploidy and/or allowing maintenance of polyploid cell growth in breast cancer cells. Of 5760 human genes screened, 177 were found to induce severe DNA content alterations on prolonged transient silencing. Response to DNA damage stimulus and DNA repair were found to be the most enriched cellular processes among the candidate genes. Functional validation analysis of these genes highlighted GINS2 as the highest ranking candidate, whose inhibition induced polyploidy and accumulation of endogenous DNA damage and impaired cell proliferation. The cell growth inhibition and induction of polyploidy by suppression of GINS2 was verified in a panel of breast cancer cell lines. Bioinformatic analysis of published gene expression and DNA copy number studies of clinical breast tumors suggested GINS2 to be associated with the aggressive characteristics of a subgroup of breast cancers in vivo. In addition, nuclear GINS2 protein levels distinguished actively proliferating cancer cells, suggesting potential use of GINS2 staining as a biomarker of cell proliferation as well as a potential therapeutic target.

  16. Market Analysis Identifies Community and School Education Goals.

    Science.gov (United States)

    Lindle, Jane C.

    1989-01-01

    Principals must realize the positive effects that marketing can have on improving schools and building support for them. Market analysis forces clarification of the competing needs and interests present in the community. The four marketing phases are needs assessment, analysis, goal setting, and public relations and advertising. (MLH)

  17. Isolation, characterization, and bioinformatic analysis of calmodulin-binding protein cmbB reveals a novel tandem IP22 repeat common to many Dictyostelium and Mimivirus proteins.

    Science.gov (United States)

    O'Day, Danton H; Suhre, Karsten; Myre, Michael A; Chatterjee-Chakraborty, Munmun; Chavez, Sara E

    2006-08-01

    A novel calmodulin-binding protein cmbB from Dictyostelium discoideum is encoded in a single gene. Northern analysis reveals two cmbB transcripts first detectable at 4 h during multicellular development. Western blotting detects an approximately 46.6 kDa protein. Sequence analysis and calmodulin-agarose binding studies identified a "classic" calcium-dependent calmodulin-binding domain (residues 179-198, IPKSLRSLFLGKGYNQPLEF) but structural analyses suggest binding may not involve classic alpha-helical calmodulin-binding. The cmbB protein is composed of tandem repeats of a newly identified IP22 motif ([I,L]Pxxhxxhxhxxxhxxxhxxxx; where h = any hydrophobic amino acid) that is highly conserved and a more precise representation of the FNIP repeat. At least eight Acanthamoeba polyphaga Mimivirus proteins and over 100 Dictyostelium proteins contain tandem arrays of the IP22 motif and its variants. cmbB also shares structural homology with YopM, from the plague bacterium Yersinia pestis. PMID:16777069
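
    The IP22 consensus given in this abstract translates directly into a regular expression, which makes it easy to look for tandem arrays in a protein sequence. The sketch below is an illustration only: the abstract does not list which residues count as hydrophobic, so the set A, V, I, L, M, F, W, Y, C used here is an assumption, the function names are hypothetical, and the test sequence is synthetic rather than cmbB.

```python
import re

# Assumed hydrophobic residue class for "h" (not specified in the abstract).
H = "[AVILMFWYC]"

# [I,L]P xx h xx h x h xxx h xxx h xxxx  (22 residues in total)
IP22 = re.compile(f"[IL]P.{{2}}{H}.{{2}}{H}.{H}.{{3}}{H}.{{3}}{H}.{{4}}")


def find_ip22(seq):
    """Return (1-based start, 1-based end) of non-overlapping IP22 matches."""
    return [(m.start() + 1, m.end()) for m in IP22.finditer(seq.upper())]


def longest_tandem_run(hits):
    """Length of the longest run of directly adjacent matches."""
    best = run = 0
    prev_end = None
    for start, end in hits:
        run = run + 1 if prev_end is not None and start == prev_end + 1 else 1
        best = max(best, run)
        prev_end = end
    return best


# Synthetic 22-residue unit built to fit the consensus (hypothetical).
unit = "IP" + "GS" + "L" + "GS" + "L" + "G" + "L" + "GSG" + "L" + "GSG" + "L" + "GSGS"
toy = "MK" + unit * 3 + "DE"
hits = find_ip22(toy)
tandem = longest_tandem_run(hits)
```

    A strict regular expression like this will miss the motif variants the abstract mentions; a profile-based search would be the usual choice for those.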

  18. Bioinformatics Analysis of Plakoglobin Gene and Protein

    Institute of Scientific and Technical Information of China (English)

    任晨霞; 曹文君

    2016-01-01

    Objective: To analyze the Jup gene and its protein using bioinformatics, and to provide a theoretical basis for studying the function of Jup and its role in the formation and development of cardiomyopathy. Methods: Bioinformatics databases and software were used to analyze the gene structure and single nucleotide polymorphisms of Jup, and the physicochemical properties, secondary structure, sequence conservation and protein interaction network of JUP. Results: Eleven SNPs were found in the coding region of human Jup, including five missense mutations. The Jup gene encodes a polypeptide of 745 amino acid residues, which is a hydrophilic protein of low stability. Its main secondary structure element is the alpha helix; it is highly conserved in evolution and belongs to the ARM superfamily. The genes and proteins that interact with JUP are mainly desmosome components and components of the classical cadherin signaling pathway. Conclusion: Mutations of the Jup gene and changes in JUP protein expression can cause associated cardiomyopathies. This systematic bioinformatics analysis of the Jup gene and its protein lays a foundation for further experimental study of its regulatory mechanisms in the formation and development of cardiomyopathy.

  19. Bioinformatics in Africa: The Rise of Ghana?

    Directory of Open Access Journals (Sweden)

    Thomas K Karikari

    2015-09-01

    Full Text Available Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  20. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    Science.gov (United States)

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  1. Rice transcriptome analysis to identify possible herbicide quinclorac detoxification genes

    OpenAIRE

    Xu, Wenying; Di, Chao; Zhou, Shaoxia; Liu, Jia; LI Li; Liu, Fengxia; Yang, Xinling; Ling, Yun; Su, Zhen

    2015-01-01

    Quinclorac is a highly selective auxin-type herbicide that is widely used for the effective control of barnyard grass in paddy rice fields, improving the world's rice yield. A herbicidal mode of action for quinclorac has been proposed, and hormone interactions affecting quinclorac signaling have been identified. Because of its widespread use, quinclorac may be transported outside rice fields with the drainage waters, leading to soil and water pollution and other environmental health problems. In thi...

  2. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  3. Identifiability analysis of the CSTR river water quality model.

    Science.gov (United States)

    Chen, J; Deng, Y

    2006-01-01

    Conceptual river water quality models are widely known to lack identifiability. The causes can include model structure errors, observational errors and infrequent sampling. Although significant efforts have been directed towards better identification of river water quality models, it is not clear whether a given model is structurally identifiable. Information is also limited regarding the contribution of different unidentifiability sources. Taking the widely applied CSTR river water quality model as an example, this paper presents a theoretical proof that the CSTR model is indeed structurally identifiable. Its uncertainty thus arises predominantly from observational errors and infrequent sampling. Given the current monitoring accuracy and sampling frequency, the unidentifiability from sampling frequency is found to be more significant than that from observational errors. It is also noted that there is a crucial sampling frequency between 0.1 and 1 day, over which the simulated river system could be represented by different illusions and the model application could be far less reliable.

  4. Association analysis identifies ZNF750 regulatory variants in psoriasis

    Directory of Open Access Journals (Sweden)

    Birnbaum Ramon Y

    2011-12-01

    Full Text Available Abstract Background Mutations in the ZNF750 promoter and coding regions have been previously associated with Mendelian forms of psoriasis and psoriasiform dermatitis. ZNF750 encodes a putative zinc finger transcription factor that is highly expressed in keratinocytes and represents a candidate psoriasis gene. Methods We examined whether ZNF750 variants were associated with psoriasis in a large case-control population. We sequenced the promoter and exon regions of ZNF750 in 716 Caucasian psoriasis cases and 397 Caucasian controls. Results We identified a total of 47 variants, including 38 rare variants of which 35 were novel. Association testing identified two ZNF750 haplotypes associated with psoriasis (p ZNF750 promoter and 5' UTR variants displayed a 35-55% reduction of ZNF750 promoter activity, consistent with the promoter activity reduction seen in a Mendelian psoriasis family with a ZNF750 promoter variant. However, the rare promoter and 5' UTR variants identified in this study did not strictly segregate with the psoriasis phenotype within families. Conclusions Two haplotypes of ZNF750 and rare 5' regulatory variants of ZNF750 were found to be associated with psoriasis. These rare 5' regulatory variants, though not causal, might serve as a genetic modifier of psoriasis.

  5. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    Science.gov (United States)

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. PMID:27318307

  6. Combinational risk factors of metabolic syndrome identified by fuzzy neural network analysis of health-check data

    Directory of Open Access Journals (Sweden)

    Ushida Yasunori

    2012-08-01

    Full Text Available Abstract Background Lifestyle-related diseases represented by metabolic syndrome develop as a result of complex interactions. By using health check-up data from two large studies collected during a long-term follow-up, we searched for risk factors associated with the development of metabolic syndrome. Methods In our original study, we selected 77 case subjects who developed metabolic syndrome during the follow-up and 152 healthy control subjects who were free of lifestyle-related risk components from among 1803 Japanese male employees. In a replication study, we selected 2196 case subjects and 2196 healthy control subjects from among 31343 other Japanese male employees. By means of a bioinformatics approach using a fuzzy neural network (FNN), we searched for any significant combinations that are associated with MetS. To ensure that the risk combination selected by FNN analysis was statistically reliable, we performed logistic regression analysis including adjustment. Results We selected a combination of an elevated level of γ-glutamyltranspeptidase (γ-GTP) and an elevated white blood cell (WBC) count as the most significant combination of risk factors for the development of metabolic syndrome. The FNN also identified the same tendency in a replication study. The clinical characteristics of γ-GTP level and WBC count were statistically significant even after adjustment, confirming that the results obtained from the fuzzy neural network are reasonable. The correlation ratio showed that an elevated level of γ-GTP is associated with habitual drinking of alcohol and a high WBC count is associated with habitual smoking. Conclusions This result obtained by fuzzy neural network analysis of health check-up data from large long-term studies can be useful in providing a personalized novel diagnostic and therapeutic method involving the γ-GTP level and the WBC count.

  7. Analysis of Electronic Clone and Bioinformatics on Mlo Gene from Maize

    Institute of Scientific and Technical Information of China (English)

    邬晓勇; 孙雁霞; 何钢; 颜军; 苟小军

    2011-01-01

    The Mlo gene family plays an important role in plant disease resistance, but the functions of some Mlo genes are still unknown. An Mlo gene from maize (Zea mays) was obtained by sequence assembly and electronic (in silico) cloning. The primary, secondary and tertiary structures of the encoded protein were predicted and analyzed by bioinformatics methods, and its function was also predicted. The results showed that the encoded protein has a conserved domain named DUF1084, whose function in plants is still unknown. Bioinformatics analysis showed that the encoded protein may be a membrane-bound transporter protein similar to the G protein-coupled receptor and may participate in signal transduction in plants.

  8. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Full Text Available Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that

  9. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that they may encode polyketides important in pathogenicity. PMID:27388157

  10. Using Rasch Analysis to Identify Uncharacteristic Responses to Undergraduate Assessments

    Science.gov (United States)

    Edwards, Antony; Alcock, Lara

    2010-01-01

    Rasch Analysis is a statistical technique that is commonly used to analyse both test data and Likert survey data, to construct and evaluate question item banks, and to evaluate change in longitudinal studies. In this article, we introduce the dichotomous Rasch model, briefly discussing its assumptions. Then, using data collected in an…
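
    As a minimal sketch of the dichotomous Rasch model named in the abstract above: the probability of a correct response depends only on the difference between a person's ability and an item's difficulty, both on a logit scale. The function names are illustrative, and the standardized residual shown is a common textbook way to flag unexpected responses, assumed here for illustration rather than taken from the article.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    P(X = 1) = exp(ability - difficulty) / (1 + exp(ability - difficulty)),
    written in the equivalent logistic form below.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def standardized_residual(response, ability, difficulty):
    """Standardized residual of an observed 0/1 response (illustrative helper).

    Large absolute values mark responses that the model finds surprising,
    e.g. an able student missing an easy item.
    """
    p = rasch_probability(ability, difficulty)
    return (response - p) / math.sqrt(p * (1.0 - p))


if __name__ == "__main__":
    # An able person (2 logits) answering an easy item (-1 logit) incorrectly.
    print(rasch_probability(2.0, -1.0))
    print(standardized_residual(0, 2.0, -1.0))
```

    When ability equals difficulty the model gives a probability of 0.5, which is a quick sanity check on any implementation.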

  11. Evaluation of energy system analysis techniques for identifying underground facilities

    Energy Technology Data Exchange (ETDEWEB)

    VanKuiken, J.C.; Kavicky, J.A.; Portante, E.C. [and others]

    1996-03-01

    This report describes the results of a study to determine the feasibility and potential usefulness of applying energy system analysis techniques to help detect and characterize underground facilities that could be used for clandestine activities. Four off-the-shelf energy system modeling tools were considered: (1) ENPEP (Energy and Power Evaluation Program) - a total energy system supply/demand model, (2) ICARUS (Investigation of Costs and Reliability in Utility Systems) - an electric utility system dispatching (or production cost and reliability) model, (3) SMN (Spot Market Network) - an aggregate electric power transmission network model, and (4) PECO/LF (Philadelphia Electric Company/Load Flow) - a detailed electricity load flow model. For the purposes of most of this work, underground facilities were assumed to consume about 500 kW to 3 MW of electricity. For some of the work, facilities as large as 10-20 MW were considered. The analysis of each model was conducted in three stages: data evaluation, base-case analysis, and comparative case analysis. For ENPEP and ICARUS, open source data from Pakistan were used for the evaluations. For SMN and PECO/LF, the country data were not readily available, so data for the state of Arizona were used to test the general concept.

  12. Identifying Colluvial Slopes by Airborne LiDAR Analysis

    Science.gov (United States)

    Kasai, M.; Marutani, T.; Yoshida, H.

    2015-12-01

    Colluvial slopes are one of the major sources of landslides. Identifying the locations of these slopes will help reduce the risk of disasters, by avoiding building infrastructure and properties nearby or, if they are already there, by applying appropriate countermeasures before the slopes suddenly move. In this study, airborne LiDAR data were analyzed to find geomorphic characteristics of colluvial slopes that can be used to extract their locations. The study site was set in the suburbs of Sapporo City, Hokkaido, Japan. The area is underlain by andesite and tuff and is prone to landslides. Slope angle and surface roughness were calculated from a 5 m resolution DEM. These filters were chosen because colluvial materials deposit at around the angle of repose, and accumulation of loose materials was considered to form a peculiar surface texture differentiable from other slope types. A field survey conducted alongside the analysis suggested that colluvial slopes could be identified by the filters with a probability of 80 percent. Repeat LiDAR monitoring of the site by an unmanned helicopter indicated that the slopes detected as colluvium appeared to be moving at a slow rate. In comparison with a similar study from the crushed zone in Japan, the range of slope angle indicative of colluvium agreed with the Sapporo site, while the texture was rougher due to larger debris composing the slopes.

  13. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude;

    2012-01-01

    to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes...... to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials....... Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database...

  14. Temperature-based Instanton Analysis: Identifying Vulnerability in Transmission Networks

    Energy Technology Data Exchange (ETDEWEB)

    Kersulis, Jonas [Univ. of Michigan, Ann Arbor, MI (United States); Hiskens, Ian [Univ. of Michigan, Ann Arbor, MI (United States); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Backhaus, Scott N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Bienstock, Daniel [Columbia Univ., New York, NY (United States)

    2015-04-08

    A time-coupled instanton method for characterizing transmission network vulnerability to wind generation fluctuation is presented. To extend prior instanton work to multiple-time-step analysis, line constraints are specified in terms of temperature rather than current. An optimization formulation is developed to express the minimum wind forecast deviation such that at least one line is driven to its thermal limit. Results are shown for an IEEE RTS-96 system with several wind-farms.

  15. Gene expression analysis identifies global gene dosage sensitivity in cancer

    DEFF Research Database (Denmark)

    Fehrmann, Rudolf S. N.; Karjalainen, Juha M.; Krajewska, Malgorzata;

    2015-01-01

    Many cancer-associated somatic copy number alterations (SCNAs) are known. Currently, one of the challenges is to identify the molecular downstream effects of these variants. Although several SCNAs are known to change gene expression levels, it is not clear whether each individual SCNA affects gene...... expression. We reanalyzed 77,840 expression profiles and observed a limited set of 'transcriptional components' that describe well-known biology, explain the vast majority of variation in gene expression and enable us to predict the biological function of genes. On correcting expression profiles...... for these components, we observed that the residual expression levels (in 'functional genomic mRNA' profiling) correlated strongly with copy number. DNA copy number correlated positively with expression levels for 99% of all abundantly expressed human genes, indicating global gene dosage sensitivity. By applying...

  16. Predicting missing links and identifying spurious links via likelihood analysis

    Science.gov (United States)

    Pan, Liming; Zhou, Tao; Lü, Linyuan; Hu, Chin-Kun

    2016-03-01

    Real network data is often incomplete and noisy, so link prediction algorithms and spurious link identification algorithms can be applied. Thus far, a general method to transform network organizing mechanisms into link prediction algorithms has been lacking. Here we use an algorithmic framework where a network’s probability is calculated according to a predefined structural Hamiltonian that takes into account the network organizing principles, and a non-observed link is scored by the conditional probability of adding the link to the observed network. Extensive numerical simulations show that the proposed algorithm has remarkably higher accuracy than the state-of-the-art methods in uncovering missing links and identifying spurious links in many complex biological and social networks. Such a method also finds applications in exploring the underlying network evolutionary mechanisms.

  17. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings volume presents recent practical applications of Computational Biology and Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics, held at the University of Salamanca, Spain, on June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have posed an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  18. Cloning and Bioinformatic Analysis of Full-length actin Gene of Culex pipiens pallens

    Institute of Scientific and Technical Information of China (English)

    王晓宇; 刘虎岐

    2012-01-01

    Culex pipiens pallens is the main carrier of multiple viruses and parasites, and there is a close relationship between actin protein and pesticide resistance. To lay the foundation for clarifying the resistance mechanism of the actin gene and for developing new pesticides, specific amplification primers were designed from EST fragments differentially expressed between resistant and susceptible Culex strains. Using the rapid amplification of cDNA ends (RACE) method, the full-length cDNA of this resistance-related gene was amplified from a resistant strain of Culex pipiens pallens, and its bioinformatic characteristics were analyzed. The actin gene obtained from Culex pipiens pallens is 1,708 bp long and encodes 377 amino acids. The bioinformatic analysis showed that the protein encoded by the actin gene is a membrane protein with one helix, one signal peptide cleavage point and twenty-seven phosphorylation sites.

  19. Distribution of cold adaptation proteins in microbial mats in Lake Joyce, Antarctica: Analysis of metagenomic data by using two bioinformatics tools.

    Science.gov (United States)

    Koo, Hyunmin; Hakim, Joseph A; Fisher, Phillip R E; Grueneberg, Alexander; Andersen, Dale T; Bej, Asim K

    2016-01-01

    In this study, we report the distribution and abundance of cold-adaptation proteins in microbial mat communities in the perennially ice-covered Lake Joyce, located in the McMurdo Dry Valleys, Antarctica. We have used MG-RAST and R code bioinformatics tools on Illumina HiSeq2000 shotgun metagenomic data and compared the filtering efficacy of these two methods on cold-adaptation proteins. Overall, the abundance of cold-shock DEAD-box protein A (CSDA), antifreeze proteins (AFPs), fatty acid desaturase (FAD), trehalose synthase (TS), and cold-shock family of proteins (CSPs) were present in all mat samples at high, moderate, or low levels, whereas the ice nucleation protein (INP) was present only in the ice and bulbous mat samples at insignificant levels. Considering the near homogeneous temperature profile of Lake Joyce (0.08-0.29 °C), the distribution and abundance of these proteins across various mat samples predictively correlated with known functional attributes necessary for microbial communities to thrive in this ecosystem. The comparison of the MG-RAST and the R code methods showed dissimilar occurrences of the cold-adaptation protein sequences, though with insignificant ANOSIM (R = 0.357; p-value = 0.012), ADONIS (R(2) = 0.274; p-value = 0.03) and STAMP (p-values = 0.521-0.984) statistical analyses. Furthermore, filtering targeted sequences using the R code accounted for taxonomic groups by avoiding sequence redundancies, whereas the MG-RAST provided total counts resulting in a higher sequence output. The results from this study revealed for the first time the distribution of cold-adaptation proteins in six different types of microbial mats in Lake Joyce, while suggesting a simpler and more manageable user-defined method of R code, as compared to a web-based MG-RAST pipeline. PMID:26578243

  20. Bioinformatics analysis of rpoB gene in Mycobacterium tuberculosis

    Institute of Scientific and Technical Information of China (English)

    赵启明; 李萍

    2012-01-01

    In this paper, the rpoB gene and its protein sequence were analyzed using bioinformatic methods to predict physical and chemical properties, hydrophilicity/hydrophobicity, signal peptide, glycosylation and phosphorylation sites, and secondary and tertiary structure. The results showed that the RNA polymerase beta subunit is an unstable, acidic, hydrophilic protein rich in valine, glutamate and leucine, highly phosphorylated and without a signal peptide. α-helix and random coil are the primary secondary structure components of the beta subunit. A three-dimensional structure was also built using homology modeling software, and evaluation with a Ramachandran plot indicated that the structural model was reasonable. Analyzing the characteristics of the rpoB gene and its encoded protein is important for studying the pathogenesis of Mycobacterium tuberculosis and the mechanism of rifampicin resistance.

  1. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  2. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  3. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
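
    The dynamic-programming formulation mentioned in the abstract above can be sketched as follows. This is a minimal global alignment in the style of Needleman-Wunsch with a linear gap penalty; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration and are not taken from the article.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of three possible last moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # align two letters
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    total, top_row, bottom_row = global_alignment("GATTACA", "GCATGCU")
    print(total)
    print(top_row)
    print(bottom_row)
```

    Several alignments can share the optimal score; the traceback here prefers a diagonal move, then a gap in the second sequence, so it reports just one of them. The table costs time and memory proportional to the product of the two sequence lengths.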

  4. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  5. Rapid Development of Bioinformatics Education in China

    Science.gov (United States)

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  6. Network stratification analysis for identifying function-specific network layers.

    Science.gov (United States)

    Zhang, Chuanchao; Wang, Jiguang; Zhang, Chao; Liu, Juan; Xu, Dong; Chen, Luonan

    2016-04-22

    A major challenge of systems biology is to capture the rewiring of biological functions (e.g. signaling pathways) in a molecular network. To address this problem, we proposed a novel computational framework, namely network stratification analysis (NetSA), to stratify the whole biological network into various function-specific network layers corresponding to particular functions (e.g. KEGG pathways), which transforms the network analysis from the gene level to the functional level by integrating expression data, the gene/protein network and gene ontology information. The application of NetSA in yeast and its comparison with a traditional network-partition both suggest that NetSA can more effectively reveal functional implications of network rewiring and extract significant phenotype-related biological processes. Furthermore, for time-series or stage-wise data, the function-specific network layer obtained by NetSA is also shown to be able to characterize the disease progression in a dynamic manner. In particular, when applying NetSA to hepatocellular carcinoma and type 1 diabetes, we can derive functional spectra regarding the progression of the disease, and capture active biological functions (i.e. active pathways) in different disease stages. The additional comparison between NetSA and SPIA illustrates again that NetSA could discover more complete biological functions during disease progression. Overall, NetSA provides a general framework to stratify a network into various layers of function-specific sub-networks, which can not only analyze a biological network on the functional level but also investigate gene rewiring patterns in biological processes. PMID:26879865

  7. Bioinformatic analysis on the microRNA profiling of pancreatic cancer cell line Panc-1

    Institute of Scientific and Technical Information of China (English)

    单振兴; 周小艳; 李天亮; 韩金祥; 崔亚洲

    2011-01-01

    OBJECTIVE: To perform bioinformatic analysis on the differential microRNA expression profile of pancreatic cancer cells in order to illustrate, at a global level, the role of microRNA in carcinogenesis and progression in pancreatic cancer. METHODS: The specific microRNA expression profile of pancreatic cancer Panc-1 cells was obtained with a microarray containing 924 probes, using 3T3 fibroblasts as a control. Target genes of the up-regulated and down-regulated microRNAs were then analyzed by Gene Ontology, pathway and transcription factor binding site (TFBS) analyses, and a microRNA-target gene interaction network was constructed. RESULTS: Compared with the microRNA expression profile of 3T3 fibroblasts, nine microRNAs were up-regulated in Panc-1 cells and 20 microRNAs were down-regulated. TargetScan and miRanda software predicted 1,166 up-regulated and 212 down-regulated microRNA target genes in Panc-1 cells. These target genes were significantly enriched in three GO categories: DNA metabolism, cell-cell signaling and cytosol. The target genes were involved in 50 signaling pathways, six of which had an enrichment P<0.05. TFBS analysis indicated that CEBP-β, NF-kB and p53 may regulate both the up-regulated and the down-regulated microRNAs. Analysis of the microRNA-target gene interaction network showed that genes such as HIF-1A had a high degree of connectivity. CONCLUSION: Bioinformatic analysis of the microRNA expression profile of pancreatic cancer cells can provide new ideas for further understanding the pathogenesis of pancreatic cancer.

  8. Sequence Analysis of Hypothetical Proteins from Helicobacter pylori 26695 to Identify Potential Virulence Factors

    Science.gov (United States)

    Naqvi, Ahmad Abu Turab; Anjum, Farah; Khan, Faez Iqbal; Islam, Asimul; Ahmad, Faizan

    2016-01-01

    Helicobacter pylori is a Gram-negative bacterium that is responsible for gastritis in humans. Its spiral flagellated body helps in locomotion and colonization in the host environment. It is capable of living in the highly acidic environment of the stomach with the help of acid adaptive genes. The genome of the H. pylori 26695 strain contains 1,555 coding genes that encode 1,445 proteins. Of these, 340 proteins are characterized as hypothetical proteins (HPs). This study involves extensive analysis of the HPs using an established pipeline, comprising various bioinformatics tools and databases, to find probable functions of the HPs and to identify virulence factors. After extensive analysis of all 340 HPs, we found that 104 HPs show characteristic similarities to proteins with known functions. On the basis of these similarities, we assigned probable functions to 104 HPs with high confidence and precision. The predicted HPs include representative members of diverse functional classes of proteins, such as enzymes, transporters, binding proteins, regulatory proteins, proteins involved in cellular processes and other proteins with miscellaneous functions. We therefore classified the 104 HPs into these functional groups. During the virulence factor analysis of the HPs, we found that 11 HPs show significant virulence. The identification of virulence proteins with the help of their predicted functions may pave the way for drug target estimation and the development of effective drugs to counter the activity of those proteins. PMID:27729842

  9. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    into courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT). It was developed by instructors from both research-intensive and predominately undergraduate institutions in collaboration with the Department of Energy-Joint Genome Institute (DOE-JGI) as a means to innovate and update undergraduate education and faculty development. The IMG-ACT program provides a cadre of tools, including access to a clearinghouse of genome sequences, bioinformatics databases, data storage, instructor course management, and student notebooks for organizing the results of their bioinformatic investigations. In the process, IMG-ACT makes it feasible to provide undergraduate research opportunities to a greater number and diversity of students, in contrast to the traditional mentor-to-student apprenticeship model for undergraduate research, which can be too expensive and time-consuming to provide for every undergraduate. The IMG-ACT serves as the hub for the network of faculty and students that use the system for microbial genome analysis. Open access of the IMG-ACT infrastructure to participating schools ensures that all types of higher education institutions can utilize it. With the infrastructure in place, faculty can focus their efforts on the pedagogy of bioinformatics, involvement of students in research, and use of this tool for their own research agenda. What the original faculty members of the IMG-ACT development team present here is an overview of how the IMG-ACT program has affected our development in terms of teaching and research with the hopes that it will inspire more faculty to get involved.

  10. The Screening of Genes Sensitive to Long-Term, Low-Level Microwave Exposure and Bioinformatic Analysis of Potential Correlations to Learning and Memory

    Institute of Scientific and Technical Information of China (English)

    ZHAO Ya Li; LI Ying Xian; MA Hong Bo; LI Dong; LI Hai Liang; JIANG Rui; KAN Guang Han; YANG Zhen Zhong; HUANG Zeng Xin

    2015-01-01

    Objective To gain a better understanding of gene expression changes in the brain following microwave exposure in mice. This study aims to reveal mechanisms contributing to microwave-induced learning and memory dysfunction. Methods Mice were exposed to whole body 2100 MHz microwaves with specific absorption rates (SARs) of 0.45 W/kg, 1.8 W/kg, and 3.6 W/kg for 1 hour daily for 8 weeks. Differentially expressed genes in the brains were screened using high-density oligonucleotide arrays, with genes showing more significant differences further confirmed by RT-PCR. Results The gene chip results demonstrated that 41 genes (0.45 W/kg group), 29 genes (1.8 W/kg group), and 219 genes (3.6 W/kg group) were differentially expressed. GO analysis revealed that these differentially expressed genes were primarily involved in metabolic processes, cellular metabolic processes, regulation of biological processes, macromolecular metabolic processes, biosynthetic processes, cellular protein metabolic processes, transport, developmental processes, cellular component organization, etc. KEGG pathway analysis showed that these genes are mainly involved in pathways related to ribosome, Alzheimer's disease, Parkinson's disease, long-term potentiation, Huntington's disease, and Neurotrophin signaling. Construction of a protein interaction network identified several important regulatory genes including synbindin (sbdn), Crystallin (CryaB), PPP1CA, Ywhaq, Psap, Psmb1, Pcbp2, etc., which play important roles in the processes of learning and memory. Conclusion Long-term, low-level microwave exposure may inhibit learning and memory by affecting protein and energy metabolic processes and signaling pathways relating to neurological functions or diseases.

  11. Identifying a preservation zone using multi–criteria decision analysis

    Directory of Open Access Journals (Sweden)

    Farashi, A.

    2016-03-01

    Full Text Available Zoning of a protected area is an approach to partition landscape into various land use units. The management of these landscape units can reduce conflicts caused by human activities. Tandoreh National Park is one of the most biologically diverse, protected areas in Iran. Although the area is generally designed to protect biodiversity, there are many conflicts between biodiversity conservation and human activities. For instance, the area is highly controversial and has been considered as an impediment to local economic development, such as tourism, grazing, road construction, and cultivation. In order to reduce human conflicts with biodiversity conservation in Tandoreh National Park, safe zones need to be established and human activities need to be moved out of the zones. In this study we used a systematic methodology to integrate a participatory process with Geographic Information Systems (GIS) using a multi–criteria decision analysis (MCDA) technique to guide a zoning scheme for the Tandoreh National Park, Iran. Our results show that the northern and eastern parts of the Tandoreh National Park that were close to rural areas and farmlands returned less desirability for selection as a preservation area. Rocky Mountains were the most important and most destructed areas and abandoned plains were the least important criteria for preservation in the area. Furthermore, the results reveal that the land properties were considered to be important for protection based on the obtaine

  12. Robust enzyme design: bioinformatic tools for improved protein stability.

    Science.gov (United States)

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  13. Meconium microbiome analysis identifies bacteria correlated with premature birth.

    Directory of Open Access Journals (Sweden)

    Alexandria N Ardissone

    Full Text Available Preterm birth is the second leading cause of death in children under the age of five years worldwide, but the etiology of many cases remains enigmatic. The dogma that the fetus resides in a sterile environment is being challenged by recent findings and the question has arisen whether microbes that colonize the fetus may be related to preterm birth. It has been posited that meconium reflects the in-utero microbial environment. In this study, correlations between fetal intestinal bacteria from meconium and gestational age were examined in order to suggest underlying mechanisms that may contribute to preterm birth. Meconium from 52 infants ranging in gestational age from 23 to 41 weeks was collected, the DNA extracted, and 16S rRNA analysis performed. Resulting taxa of microbes were correlated to clinical variables and also compared to previous studies of amniotic fluid and other human microbiome niches. Increased detection of bacterial 16S rRNA in meconium of infants of <33 weeks gestational age was observed. Approximately 61·1% of reads sequenced were classified to genera that have been reported in amniotic fluid. Gestational age had the largest influence on microbial community structure (R = 0·161; p = 0·029), while mode of delivery (C-section versus vaginal delivery) had an effect as well (R = 0·100; p = 0·044). Enterobacter, Enterococcus, Lactobacillus, Photorhabdus, and Tannerella were negatively correlated with gestational age and have been reported to incite inflammatory responses, suggesting a causative role in premature birth. This provides the first evidence to support the hypothesis that the fetal intestinal microbiome derived from swallowed amniotic fluid may be involved in the inflammatory response that leads to premature birth.

  14. Evolutionary and bioinformatic analysis of the spike glycoprotein gene of H120 vaccine strain protectotype of infectious bronchitis virus from India.

    Science.gov (United States)

    Kamble, Nitin Machindra; Pillai, Aravind S; Gaikwad, Satish S; Shukla, Sanjeev Kumar; Khulape, Sagar Aashok; Dey, Sohini; Mohan, C Madhan

    2016-01-01

    The infectious bronchitis virus (IBV) is the causative agent of avian infectious bronchitis (AIB), an important disease that produces severe economic losses to the poultry industry worldwide. Recent AIB outbreaks in India have been associated with poor growth in broilers, a drop in egg production, and thin egg shells in layers. The complete spike gene of the Indian AIB vaccine strain was amplified and sequenced using conventional reverse transcription polymerase chain reaction and was submitted to GenBank (accession no. KF188436). Phylogenetic analysis revealed that the vaccine strain currently used belongs to the H120 genotype, an attenuated strain of the Massachusetts (Mass) serotype. Nucleotide and amino acid sequence comparisons have shown that the reported spike genes from Indian isolates have 71.8%-99% and 71.4%-96.9% genetic similarity, respectively, with the sequenced H120 strain. The study identifies, for the first time, the live attenuated IBV vaccine strain that is routinely used for vaccination. Based on nucleotide and amino acid relatedness studies of the vaccine strain with reported IBV sequences from India, it is shown that the current vaccine strain is efficient in controlling IBV infection. Continuous monitoring of IBV outbreaks by sequencing for genotyping and by in vivo cross protection studies for serotyping is important not only for epidemiological investigation but also for evaluating the efficacy of the current vaccine. PMID:25311758
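
The similarity ranges quoted above come from pairwise sequence comparison. As a minimal sketch of how such a percentage can be computed, the function below scores two sequences that are assumed to be already aligned to the same length, skipping gap columns. The record does not say which tool or identity definition was used, so this definition and the toy sequences are assumptions.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns where either sequence has a gap
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not from KF188436).
print(percent_identity("ATGCTTGA", "ATGCATGA"))  # 7 of 8 columns match: 87.5
```

Other definitions divide by the alignment length or the shorter sequence length, which gives different percentages for gapped alignments.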

  15. Bioinformatics Analysis of SBP Gene Family in Apple

    Institute of Scientific and Technical Information of China (English)

    刘更森; 慕茜; 戴洪义; 上官凌飞; 张玉刚

    2011-01-01

    This article first analyzed the phylogeny of 42 SBP protein sequences and the genomic localization of the SBP genes in apple using bioinformatics methods, then predicted and analyzed their amino acid composition, physical and chemical characteristics, and secondary and tertiary structures, and also analyzed the relationship between the SBP gene family in apple and that in Arabidopsis thaliana. The results indicated that the 42 protein sequences in apple and 16 SBP protein sequences in Arabidopsis thaliana could be divided into 7 subfamilies, which indicates that SBP genes are highly conserved between apple and Arabidopsis thaliana. The 42 SBP genes were distributed on 12 chromosomes. There were some differences in the number of amino acids and the hydrophobicity of the amino acid sequences among the subfamilies. Secondary structure prediction showed that the 42 amino acid sequences were composed mainly of random coil and α-helix, and the tertiary structures of all 42 amino acid sequences were similar.

  16. Bioinformatics Analysis on Cannabinoid Receptors 1 of Swine

    Institute of Scientific and Technical Information of China (English)

    魏星灿; 贾青; 陶隽; 胡慧艳

    2013-01-01

    The phylogenetic relationship of the coding sequences (CDS) of the CB1 gene between swine and 21 other species, and the physicochemical and structural properties of the CB1-encoded protein in swine, were analyzed with bioinformatics methods. The results showed that the CB1 gene was highly homologous across species and had been subject to purifying selection during evolution. The swine CB1 protein was a hydrophobic transmembrane protein consisting of 472 amino acid residues, without a signal peptide. Its primary structure contained 23 phosphorylation sites and 6 glycosylation sites; its secondary structure consisted of 47.67% α-helix, 39.62% random coil and 12.71% extended strand; and its tertiary structure was composed of 7 α-helices and random coil. The results indicate that CB1 may be a housekeeping gene in mammals and that the 7 connected α-helices are the active sites of CB1 in swine.
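
Secondary structure percentages such as those above are typically derived by counting per-residue state labels from a prediction tool. The sketch below shows that bookkeeping. The one-letter state codes (H, C, E) and the residue counts 225/187/60 are assumptions: the counts were back-calculated so that they sum to 472 residues and reproduce the reported percentages, and they are not taken from the paper.

```python
from collections import Counter


def composition_percent(states: str) -> dict[str, float]:
    """Percentage of each per-residue state label, rounded to two decimals."""
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    return {state: round(100.0 * n / len(states), 2) for state, n in counts.items()}


# Hypothetical prediction string: H = alpha-helix, C = random coil, E = extended strand.
states = "H" * 225 + "C" * 187 + "E" * 60
print(composition_percent(states))  # {'H': 47.67, 'C': 39.62, 'E': 12.71}
```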

  17. Tibetan Sheep LPL Gene Clone and Bioinformatic Analysis

    Institute of Scientific and Technical Information of China (English)

    高思; 徐亚欧; 毛亮; 邵欢欢; 杨虎林; 舒浩国

    2011-01-01

    [Objective] To study in depth the relationship between the genetic regulation of meat performance in Tibetan sheep and nutrient metabolism. [Method] The LPL coding gene of Tibetan sheep was cloned by reverse-transcription PCR (RT-PCR) and T-A cloning, and then analyzed with bioinformatics software. [Result] The LPL coding gene of Tibetan sheep was 1437 bp long and encoded 478 amino acids. Multiple sequence alignment of Tibetan sheep with sheep, goat, cattle, yak, pig, dog, cat, baboon, orangutan, human, Norway rat and rattus showed that the sequence identity of the LPL gene was 84.6%-99.6% and that of the LPL amino acid sequence was 88.8%-99.0%. Moreover, 6 nucleotide differences were found between the LPL genes of Tibetan sheep and common sheep. One of these was synonymous, leaving the encoded amino acid unchanged, while the other five each changed the encoded amino acid. [Conclusion] The study provides reference data for understanding the evolutionary relationships of the LPL gene and its mechanism of action.
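
The reported gene length and protein length can be checked against each other with simple codon arithmetic. The sketch below assumes the 1437 bp figure is the coding sequence including one stop codon, which is an assumption about how the length was reported; the function name is hypothetical.

```python
def encoded_residues(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a coding sequence of the given length."""
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


print(encoded_residues(1437))  # 1437 / 3 = 479 codons, minus the stop codon: 478
```

Under that assumption the result agrees with the 478 amino acids stated in the record.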

  18. Bioinformatic identification of novel putative photoreceptor specific cis-elements

    OpenAIRE

    Knox Barry E; Qin Maochun; McIlvain Vera A; Danko Charles G; Pertsov Arkady M

    2007-01-01

    Abstract Background Cell specific gene expression is largely regulated by different combinations of transcription factors that bind cis-elements in the upstream promoter sequence. However, experimental detection of cis-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify cis-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding site...

  19. Bioinformatics, Expression and Functional Analysis of microRNAs in Response to Low Temperature in Hemerocallis fulva (L.) L.

    Institute of Scientific and Technical Information of China (English)

    安凤霞; 卢宝伟; 梁鸣; 唐焕伟; 李富恒

    2014-01-01

    MicroRNAs (miRNAs), endogenous small non-coding single-stranded RNAs of 16-29 nt, play a prominent role in growth, development and responses to environmental stresses in plants. The miRNAs responding to low temperature in Hemerocallis fulva roots were identified using sRNA deep sequencing (based on the HiSeq principle) in combination with bioinformatics prediction. A total of 14,843,184 and 16,072,575 sequence reads were obtained under normal (10℃) and low (-25℃) temperature conditions, representing 14,064,385 and 15,309,725 types of small RNA (sRNA), respectively. The sRNAs showed a normal distribution. Comparison against GenBank and Rfam showed that rRNA and tRNA accounted for a large proportion of the non-coding RNA. In total, 799,994 sRNAs of 67,411 types were annotated under low temperature, and 1,055,466 sRNAs of 66,524 types were annotated under normal temperature. miR393, miR397 and miR396 were up-regulated and miR319 was down-regulated at low temperature. This research provides rich data for elucidating the regulatory mechanism of protein synthesis and for screening the key regulatory genes that respond to low temperature.

  20. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  1. Multi-Institutional FASTQ File Exchange as a Means of Proficiency Testing for Next-Generation Sequencing Bioinformatics and Variant Interpretation.

    Science.gov (United States)

    Davies, Kurtis D; Farooqi, Midhat S; Gruidl, Mike; Hill, Charles E; Woolworth-Hirschhorn, Julie; Jones, Heather; Jones, Kenneth L; Magliocco, Anthony; Mitui, Midori; O'Neill, Philip H; O'Rourke, Rebecca; Patel, Nirali M; Qin, Dahui; Ramos, Erica; Rossi, Michael R; Schneider, Thomas M; Smith, Geoffrey H; Zhang, Linsheng; Park, Jason Y; Aisner, Dara L

    2016-07-01

    Next-generation sequencing is becoming increasingly common in clinical laboratories worldwide and is revolutionizing clinical molecular testing. However, the large amounts of raw data produced by next-generation sequencing assays and the need for complex bioinformatics analyses present unique challenges. Proficiency testing in clinical laboratories has traditionally been designed to evaluate assays in their entirety; however, it can be alternatively applied to separate assay components. We developed and implemented a multi-institutional proficiency testing approach to directly assess custom bioinformatics and variant interpretation processes. Six clinical laboratories, all of which use the same commercial library preparation kit for next-generation sequencing analysis of tumor specimens, each submitted raw data (FASTQ files) from four samples. These 24 file sets were then deidentified and redistributed to five of the institutions for analysis and interpretation according to their clinically validated approach. Among the laboratories, there was a high rate of concordance in the calling of single-nucleotide variants, in particular those we considered clinically significant (100% concordance). However, there was significant discordance in the calling of clinically significant insertions/deletions, with only two of seven being called by all participating laboratories. Missed calls were addressed by each laboratory to improve their bioinformatics processes. Thus, through our alternative proficiency testing approach, we identified the bioinformatic detection of insertions/deletions as an area of particular concern for clinical laboratories performing next-generation sequencing testing. PMID:27155050
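
The concordance figures above amount to set comparisons across laboratories: a variant is concordant when every laboratory calls it. A minimal sketch of that comparison follows; the laboratory names and variant labels are hypothetical placeholders, not data from the study.

```python
def concordance(calls_by_lab: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Split variants into those called by every lab and those missed by at least one."""
    if not calls_by_lab:
        raise ValueError("no laboratories supplied")
    call_sets = list(calls_by_lab.values())
    all_called = set().union(*call_sets)
    called_by_all = set.intersection(*call_sets)
    return called_by_all, all_called - called_by_all


# Hypothetical insertion/deletion calls from three laboratories.
calls = {
    "lab_A": {"indel_1", "indel_2", "indel_3"},
    "lab_B": {"indel_1", "indel_2"},
    "lab_C": {"indel_1", "indel_2", "indel_4"},
}
concordant, discordant = concordance(calls)
rate = len(concordant) / (len(concordant) + len(discordant))
print(sorted(concordant), sorted(discordant), f"{rate:.0%}")
```

Real pipelines would first normalize variant representation (for example left-aligning indels) before comparing, since equivalent indels can be written in more than one way.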

  2. Protein expression and bioinformatics analysis of stk40 gene related to embryo development

    Institute of Scientific and Technical Information of China (English)

    崔骥; 张军强; 陈洁; 朱丹丹; 郭锡熔; 童国庆

    2011-01-01

    Objective To detect the protein expression of stk40 and perform a bioinformatics analysis of this embryo development-related gene. Methods Western blot was performed to detect the protein expression of stk40 in early developmental embryos of the mouse. An initial bioinformatics analysis was performed on its gene structure, genome localization, the physical and chemical characteristics of its encoded protein, secondary structure, hydrophobicity/hydrophilicity, structural domains and so on. Results The protein expression of stk40 was lower in development-arrested 8-cell embryos than in normal ones. Bioinformatics analysis showed that the stk40 mRNA was 3877 bp long and contained an open reading frame of 1350 nucleotides predicted to encode 449 amino acids with a molecular mass of 50563.9. NCBI Map Viewer analysis revealed that the stk40 gene was located on chromosome 4D2.2 and was composed of 13 exons and 12 introns. The encoded protein had an STYKc domain, which may be related to the emergence of cellular organisms. Conclusion The detection and analysis of the stk40 gene may provide a foundation and novel information for further study.

  3. Cloning and Bioinformatics Analysis of Transcription Factor DdrO in Deinococcus Radiodurans

    Institute of Scientific and Technical Information of China (English)

    杜邱; 何淑雅; 马云; 李斌元; 孙晓宇; 廖端芳

    2011-01-01

    Objective: To clone the Deinococcus radiodurans ddrO gene and predict its function by bioinformatics analysis. Methods: A pair of primers was designed with Primer Premier 5 according to the published ddrO gene sequence of Deinococcus radiodurans and synthesized. The ddrO gene was amplified by polymerase chain reaction (PCR) using genomic DNA isolated from Deinococcus radiodurans as the template. The product was sequenced, and various bioinformatics programs were used to analyze and predict its physicochemical properties, higher-order structure and biological function. Results: The ddrO gene was successfully obtained. Bioinformatics analysis revealed that the ddrO nucleotide sequence was 396 bp long and encoded a transcription factor, DdrO, of 131 amino acids with a molecular weight of 14.993 kDa. Nucleic acid homology search and comparative analysis found highly similar sequences only in Deinococcus geothermalis and Deinococcus deserti, which belong to the same genus as D. radiodurans (DR). Protein homology search found significant homologs of the DdrO protein, such as Deide_20570 (95%), Dgeo_0336 (90%) and Deide_3p02170 (82%), and domain analysis showed that DdrO contains an HTH (helix-turn-helix) DNA-binding domain. Conclusion: Based on the bioinformatics results, we predict that the DdrO protein may have a transcriptional regulatory function, possibly through a mechanism involved in DNA repair and replication in Deinococcus radiodurans, and may play an important role in DNA damage repair.

  4. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses.

    Directory of Open Access Journals (Sweden)

    Lei Wang

    Full Text Available Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days. Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease.

  5. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  6. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  7. Bioinformatic analysis of regulation of microRNA on target genes in pediatric asthma

    Institute of Scientific and Technical Information of China (English)

    董晓艳; 陆权; 张慧燕; 顾坚磊; 钟南

    2016-01-01

    Objective To understand the mechanism underlying dust mite-induced pediatric asthma through bioinformatic analysis of a specific microRNA (miRNA) array and target gene screening. Methods This is a case-control study of 62 pairs of children with dust mite-induced asthma and age- and gender-matched healthy controls. Twelve pairs were randomly selected for miRNA array analysis. Abnormally expressed miRNAs were compared between the asthma and control children. The results were validated by RT-qPCR and bioinformatic analysis in the remaining pairs. Results Six miRNAs (miRNA-151a-5p, 625-5p, 126-3p, 513a-5p, 27b-3p, 22-3p) were significantly down-regulated by more than two-fold in children with dust mite-induced asthma compared with controls (P<0.05). Bioinformatic enrichment analysis showed that the target genes regulated by these miRNAs, CBL, PPARGC1B, ESR1, ONECUT2, EGFR, SYK, and STAT1, were significantly associated with inflammatory cytokine signaling pathways (P<0.05). Conclusion miR-22-3p, 513a-5p, 625-5p, and 27b-3p may jointly regulate related target genes to form a network pathway that participates in the development of dust mite-induced pediatric asthma.

  8. A Bioinformatic Analysis on Caffeine Synthase in Plants

    Institute of Scientific and Technical Information of China (English)

    孔祥瑞; 杨军; 王让剑

    2014-01-01

    The amino acid sequences of caffeine synthases from Camellia sinensis, Theobroma cacao, Camellia japonica and other plants registered in GenBank were analyzed with bioinformatic tools to predict the isoelectric point, subcellular localization, signal peptide, transmembrane topological structure, conserved functional domains, motifs, secondary structure and tertiary structure of the proteins. The results showed that plant caffeine synthases are located mainly in the cytoplasm and nucleus, carry phosphorylation, acylation and glycosylation sites, and can be divided into three types based on gene sequences and conserved domains. Type I and type II are all-α soluble enzyme proteins, whereas the secondary structure of type III is rich in random coil and has a potential signal peptide; none of the three types has a transmembrane helix. Tertiary structure prediction indicated that type I and type II proteins are similar, both composed of α-helices and horizontal β-sheets, whereas in the type III protein the α-helices are located at the lateral ends and are connected by vertical β-sheets.

  9. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    OpenAIRE

    Kumar Ajay; Singh Neetu; Gaurav S.S

    2012-01-01

    Background: The role of bioinformatics in this modern age of technological advancement cannot be overemphasized. Aim: This study reviews the principles, techniques, and applications of forensic bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: The concepts of sequence annotation and whole genome sequencing were possible due to the assimilation of software-based tools which are exclusively responsible for the segregation of bulk genomic d...

  10. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
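
    As a small illustration of the MapReduce model mentioned in this abstract, the sketch below counts k-mers across sequencing reads in three explicit phases (map, shuffle, reduce). It runs in a single Python process and is not tied to Hadoop or any cloud service; the function names and the choice of k-mer counting as the example task are assumptions made here for illustration, not something taken from the review.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], 2))
```

    In a real cluster the map calls would run on different nodes and the shuffle would move data over the network; here the phases are kept as separate functions only to make that structure visible.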

  11. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  12. Analysis of Maize Crop Leaf using Multivariate Image Analysis for Identifying Soil Deficiency

    Directory of Open Access Journals (Sweden)

    S. Sridevy

    2014-11-01

    Full Text Available Image processing for identifying soil deficiency has become an active area of research. Changes in leaf color are used to identify deficiencies of soil nutrients such as nitrogen (N), phosphorus (P) and potassium (K) by digital color image analysis. This study focuses on the analysis of maize crop leaves using multivariate image analysis. In the proposed approach, the input RGB image is first converted to HSV, because RGB is ideal for color generation whereas HSV is better suited to color perception. Green pixels are then masked and removed using a specific threshold value by applying histogram equalization. This masking is done with a customized filter that exclusively filters the green color of the leaf. After filtering, only the deficient part of the leaf is considered, and a histogram is generated for it. Multivariate Image Analysis using Independent Component Analysis (ICA) is then carried out to extract a reference eigenspace from a matrix built by unfolding color data from the deficient part. Test images are also unfolded and projected onto the reference eigenspace, and the result is a score matrix which is used to compute nutrient deficiency based on the T2 statistic. In addition, a multi-resolution scheme that scales the image down is used to speed up the process. Finally, based on the training samples, the soil deficiency is identified from the color of the maize crop leaf.
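
    The first two steps of the pipeline described here (RGB to HSV conversion, then masking of green pixels) can be sketched with the Python standard library. The hue range and saturation threshold below are hypothetical values chosen for illustration; the abstract does not state the thresholds or the exact filter the authors used, and the histogram equalization and ICA steps are not covered.

```python
import colorsys


def is_green(rgb, hue_range=(0.20, 0.45), min_saturation=0.25):
    """Return True if an 8-bit RGB pixel falls in an assumed 'green' HSV band."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue_range[0] <= hue <= hue_range[1] and saturation >= min_saturation


def mask_green(pixels):
    """Replace green pixels with None so only non-green regions remain."""
    return [[None if is_green(p) else p for p in row] for row in pixels]


if __name__ == "__main__":
    image = [[(0, 200, 0), (200, 180, 40)]]  # one healthy, one yellowish pixel
    print(mask_green(image))
```

    Working in HSV lets the mask depend mainly on hue, so the same threshold tolerates moderate changes in brightness better than a threshold on raw RGB channels would.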

  13. Bioinformatics Analysis of Glutathione S-transferase Gene of Taenia saginata

    Institute of Scientific and Technical Information of China (English)

    王宇; 黄江; 戴佳琳; 廖兴江

    2012-01-01

    Objective: To analyze the gene structure of glutathione S-transferase (GST) of Taenia saginata, and to predict the structure and function of its encoded protein. Methods: Bioinformatics analysis tools from websites such as NCBI and ExPASy, combined with other analysis software, were used. Results: The full length of the gene was 908 bp. Its coding region spanned 135-771 bp, encoding 212 amino acids. The encoded protein did not contain any subcellular localization sequence. Identity and similarity with Taenia solium GST were 93% and 96%, respectively. Three major epitopes of GST (33-53 aa, 62-68 aa, 179-184 aa) were predicted to lie on the surface of the GST spatial structure, far away from each other. Conclusions: The GST gene was screened from a cDNA library of adult Taenia saginata. GST is predicted to be a cytosolic protein and has good prospects as an antigen for immunodiagnosis.

  14. Phosphoproteomics and bioinformatics analyses of spinal cord proteins in rats with morphine tolerance.

    Directory of Open Access Journals (Sweden)

    Wen-Jinn Liaw

    Full Text Available INTRODUCTION: Morphine is the most effective pain-relieving drug, but it can cause unwanted side effects. Direct neuraxial administration of morphine to spinal cord not only can provide effective, reliable pain relief but also can prevent the development of supraspinal side effects. However, repeated neuraxial administration of morphine may still lead to morphine tolerance. METHODS: To better understand the mechanism that causes morphine tolerance, we induced tolerance in rats at the spinal cord level by giving them twice-daily injections of morphine (20 µg/10 µL for 4 days. We confirmed tolerance by measuring paw withdrawal latencies and maximal possible analgesic effect of morphine on day 5. We then carried out phosphoproteomic analysis to investigate the global phosphorylation of spinal proteins associated with morphine tolerance. Finally, pull-down assays were used to identify phosphorylated types and sites of 14-3-3 proteins, and bioinformatics was applied to predict biological networks impacted by the morphine-regulated proteins. RESULTS: Our proteomics data showed that repeated morphine treatment altered phosphorylation of 10 proteins in the spinal cord. Pull-down assays identified 2 serine/threonine phosphorylated sites in 14-3-3 proteins. Bioinformatics further revealed that morphine impacted on cytoskeletal reorganization, neuroplasticity, protein folding and modulation, signal transduction and biomolecular metabolism. CONCLUSIONS: Repeated morphine administration may affect multiple biological networks by altering protein phosphorylation. These data may provide insight into the mechanism that underlies the development of morphine tolerance.

  15. BioJava: an open-source framework for bioinformatics

    OpenAIRE

    Holland, R. C. G.; Down, T. A.; Pocock, M.; Prlić, A.; Huen, D; James, K.; Foisy, S.; Dräger, A.; Yates, A; Heuer, M.; Schreiber, M. J.

    2008-01-01

    Summary: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Availability: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website...

  16. RNA-seq analysis to identify novel roles of scleraxis during embryonic mouse heart valve remodeling.

    Directory of Open Access Journals (Sweden)

    Damien N Barnette

    Full Text Available Heart valve disease affects up to 30% of the population and has been shown to have origins during embryonic development. Valvulogenesis begins with formation of endocardial cushions in the atrioventricular canal and outflow tract regions. Subsequently, endocardial cushions remodel, elongate and progressively form mature valve structures composed of a highly organized connective tissue that provides the necessary biomechanical function throughout life. While endocardial cushion formation has been well studied, the processes required for valve remodeling are less well understood. The transcription factor Scleraxis (Scx) is detected in mouse valves from E15.5 during initial stages of remodeling, and expression remains high until birth when formation of the highly organized mature structure is complete. Heart valves from Scx-/- mice are abnormally thick and develop fibrotic phenotypes similar to human disease by juvenile stages. These phenotypes begin around E15.5 and are associated with defects in connective tissue organization and valve interstitial cell differentiation. In order to understand the etiology of this phenotype, we analyzed the transcriptome of remodeling valves isolated from E15.5 Scx-/- embryos using RNA-seq. From this, we have identified a profile of protein and non-protein mRNAs that are dependent on Scx function and using bioinformatics we can predict the molecular functions and biological processes affected by these genes. These include processes and functions associated with gene regulation (methyltransferase activity, DNA binding, Notch signaling), vitamin A metabolism (retinoic acid biosynthesis) and cellular development (cell morphology, cell assembly and organization). In addition, several mRNAs are affected by alternative splicing events in the absence of Scx, suggesting additional roles in post-transcriptional modification. In summary, our findings have identified transcriptome profiles from abnormal heart valves isolated

  17. Identification of plasma lipid biomarkers for prostate cancer by lipidomics and bioinformatics.

    Directory of Open Access Journals (Sweden)

    Xinchun Zhou

    Full Text Available BACKGROUND: Lipids have critical functions in cellular energy storage, structure and signaling. Many individual lipid molecules have been associated with the evolution of prostate cancer; however, none of them has been approved to be used as a biomarker. The aim of this study is to identify lipid molecules from hundreds of plasma apparent lipid species as biomarkers for diagnosis of prostate cancer. METHODOLOGY/PRINCIPAL FINDINGS: Using lipidomics, lipid profiling of 390 individual apparent lipid species was performed on 141 plasma samples from 105 patients with prostate cancer and 36 male controls. High throughput data generated from lipidomics were analyzed using bioinformatic and statistical methods. From 390 apparent lipid species, 35 species were demonstrated to have potential in differentiation of prostate cancer. Within the 35 species, 12 were identified as individual plasma lipid biomarkers for diagnosis of prostate cancer with a sensitivity above 80%, specificity above 50% and accuracy above 80%. Using the top 15 of 35 potential biomarkers together increased predictive power dramatically in diagnosis of prostate cancer with a sensitivity of 93.6%, specificity of 90.1% and accuracy of 97.3%. Principal component analysis (PCA) and hierarchical clustering analysis (HCA) demonstrated that patient and control populations were visually separated by identified lipid biomarkers. RandomForest and 10-fold cross validation analyses demonstrated that the identified lipid biomarkers were able to predict unknown populations accurately, and this was not influenced by patient's age and race. Three out of 13 lipid classes, phosphatidylethanolamine (PE), ether-linked phosphatidylethanolamine (ePE) and ether-linked phosphatidylcholine (ePC), could be considered as biomarkers in diagnosis of prostate cancer. CONCLUSIONS/SIGNIFICANCE: Using lipidomics and bioinformatic and statistical methods, we have identified a few out of hundreds of plasma apparent lipid molecular
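
    For readers less familiar with the diagnostic measures quoted in this abstract, the sketch below shows the standard definitions of sensitivity, specificity and accuracy computed from a confusion matrix. The counts in the usage example are hypothetical; only the group sizes (105 patients, 36 controls) come from the abstract, and the numbers are not the study's results.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: patients classified correctly/incorrectly
    tn/fp: controls classified correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical split of 105 patients and 36 controls.
    sens, spec, acc = diagnostic_metrics(tp=90, fn=15, tn=30, fp=6)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```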

  18. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins.Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....
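
    A classic entry point to single-sequence RNA secondary structure prediction is base-pair maximization by dynamic programming (the Nussinov algorithm). The sketch below is a minimal version of that textbook method, offered as an illustration of the basic concept; it is not necessarily the approach presented in this chapter, and the pairing rules (Watson-Crick plus G-U wobble) and the minimum hairpin loop of three bases are assumptions made here.

```python
def can_pair(a, b):
    """Allow Watson-Crick pairs and the G-U wobble pair."""
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    best[i][j] holds the optimum for the subsequence seq[i..j]; a pair (k, j)
    is allowed only if at least min_loop unpaired bases lie between k and j.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

    Practical tools replace the pair count with thermodynamic energy models, but the recursion over subsequences has the same shape.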

  19. Bioinformatics for saffron (Crocus sativus L.) improvement

    OpenAIRE

    Ghulam A. PARRAY; Abdul G. Rather; Parvez Sofi; Shafiq A. Wani; Amjad M. Husaini; Asif B. Shikari; Javid I. Mir

    2009-01-01

    Saffron (Crocus sativus L.) is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots). Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The ch...

  20. In Silico Analysis of Gene Expression Network Components Underlying Pigmentation Phenotypes in the Python Identified Evolutionarily Conserved Clusters of Transcription Factor Binding Sites

    Science.gov (United States)

    2016-01-01

    Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1) that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons. PMID:27698666
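
    One simple way to look for clusters of predicted binding sites within ~200-nucleotide intervals of an upstream region is a greedy scan over sorted site positions. The sketch below is an illustration only: the abstract does not give the authors' clustering criteria, so the window rule, the minimum of three sites per cluster, and the example positions are all assumptions.

```python
def find_site_clusters(positions, window=200, min_sites=3):
    """Group binding-site positions into clusters spanning less than `window` nt.

    Returns a list of (first_position, last_position, site_count) tuples.
    Clusters are found greedily from left to right and do not overlap.
    """
    sites = sorted(positions)
    clusters = []
    i = 0
    while i < len(sites):
        j = i
        # Extend the cluster while the next site is still inside the window.
        while j + 1 < len(sites) and sites[j + 1] - sites[i] < window:
            j += 1
        if j - i + 1 >= min_sites:
            clusters.append((sites[i], sites[j], j - i + 1))
            i = j + 1
        else:
            i += 1
    return clusters


if __name__ == "__main__":
    # Hypothetical site positions in a 1500-nt upstream region.
    print(find_site_clusters([100, 150, 260, 700, 1200, 1250, 1390, 1400]))
```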

  1. Bioinformatics Analysis on the Structure and Function of Malate Dehydrogenase Gene of Taenia solium

    Institute of Scientific and Technical Information of China (English)

    蓝磊; 廖兴江; 黄江; 戴佳琳

    2012-01-01

    Objective: To analyze and predict the structure and characteristics of Taenia solium malate dehydrogenase (MDH), so as to guide experimental research on its biological function. Methods: Tools for sequence and structure analysis of genes and proteins from the National Center for Biotechnology Information (USA) and the Expert Protein Analysis System of the Swiss Institute of Bioinformatics, combined with the Pcgene and Vector NTI suite bioinformatics software packages, were used to identify the MDH gene and its coding region in a full-length Taenia solium cDNA plasmid library, and to analyze and predict the physicochemical properties, post-translational modification sites, functional domains, subcellular localization, topological structure, secondary structure and 3D conformation of the encoded protein. Results: The gene encoded 332 amino acids and was a full-length gene. In GenBank it showed the highest homology to the MDH sequence of Echinococcus granulosus. Its theoretical molecular weight was 36459.2 Da. The encoded protein was predicted to have no transmembrane region and no disulfide bond, and to be relatively stable. It was evolutionarily closest to the MDH of trematodes. Conclusion: Using bioinformatics methods, the full-length cDNA sequence was screened from the adult Taenia solium cDNA library, and information on its structure and function was predicted.

  2. Profile of microRNA expression in tumor associated macrophage and bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    雷宇; 刘彦信; 葛晔华; 史娟; 郑德先

    2012-01-01

    Objective To investigate the microRNA expression profile of tumor-associated macrophages (TAM). Methods A xenograft mouse model was established with the mouse breast cancer cell line 4T1. TAM were isolated from the tumor tissue. The microRNA expression profile was detected with a microRNA chip assay. The chip results were validated by real-time PCR and analyzed by bioinformatics. Peritoneal macrophages were used as the negative control. Results The expression of 59 microRNAs changed significantly in TAM compared with the negative control. Among these, 23 microRNAs were up-regulated and 36 were down-regulated. Real-time PCR verified the expression of miR-146a, miR-222, miR-31 and miR-877, and the results were in line with the chip experiment. These microRNAs participate in the regulation of various signaling pathways. Conclusions The microRNA expression profile and bioinformatics analysis suggest that microRNAs play an important role in the regulation of TAM differentiation.

  3. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has enormous biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human welfare. However, screening for bioactivity with conventional methods is very expensive and time-consuming. We therefore developed a bioinformatics-based methodology for screening the potential of natural resources. The method rests on the fact that organisms in the same taxon tend to have similar genes, metabolism and secondary metabolites. We use bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with well-known taxa that contain active compounds reported in published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell are examined to determine the mechanism of action of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anticancer agent from a marine invertebrate. The anti-cancer activity was explored based on compounds isolated from the marine invertebrate as reported in published articles and databases; the protein target was then identified, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate and examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and another

  4. Cloning and Bioinformatics Analysis on CDS of CYGB Gene in Yak

    Institute of Scientific and Technical Information of China (English)

    孙雪婧; 杜晓华; 杨孝朴; 罗玉柱; 刘霞

    2014-01-01

    [Objective] To enrich the basic data on the yak CYGB gene, the CDS region of yak CYGB was cloned and analyzed by bioinformatics methods. [Method] Total RNA was extracted from yak hippocampus tissue and reverse transcribed into cDNA by RT-PCR. Specific primers were designed from the cDNA sequence of the cattle CYGB gene in GenBank (GenBank accession No.: DV874786.1) with the online software Primer 3.0. The CDS region and parts of the 5′UTR and 3′UTR of yak CYGB were cloned from yak hippocampus total RNA by PCR amplification, TA cloning and nucleic acid sequencing. The primary, secondary and tertiary structure, physicochemical properties and homology were analyzed, and a phylogenetic tree of CYGB was constructed, using online software such as ProtParam, PredictProtein and SWISS-MODEL and the Lasergene 7.1 software package. The three-dimensional structure was modified and output with PyMOL. Protein subcellular localization was predicted with the online tool PSORT II Prediction, and protein function was predicted by Protfun ... [Result] ... a role in factor regulation. The yak CYGB amino acid sequence shared 100%, 98.9%, 97.8%, 95.3%, 93.7%, 78.8%, 98.4%, 95.8% and 96.8% homology with the CYGB amino acid sequences of cattle, sheep, dog, mouse, brown rat, red junglefowl, monkey, chimpanzee and human, respectively. Homology among species was high and the phylogeny was consistent with their relatedness, indicating that the coding region of the CYGB gene is relatively conserved in evolution. [Conclusion] The full-length 573 bp CDS of the yak CYGB gene was obtained by RT-PCR, TA cloning and nucleic acid sequencing, and its nucleotide sequence, the amino acid sequence of the encoded protein, and the protein structure and function were analyzed. Yak CYGB is a soluble acidic protein of 190 amino acid residues that plays an important role in energy metabolism and cofactor biosynthesis. The coding region of the CYGB gene has been strongly conserved over long-term evolution. The successful cloning and analysis of this gene provides a theoretical basis for revealing the genetic characteristics of the yak CYGB gene.
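
    Homology percentages like those reported for CYGB are typically derived from aligned sequences. The sketch below computes percent identity for two sequences that have already been aligned; it is an illustration under stated assumptions, not the procedure used by the Lasergene package. In particular, ignoring gapped columns is only one of several denominator conventions, and the example sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy alignment: 4 identical residues out of 5 ungapped columns.
    print(percent_identity("MKT-LV", "MKSALV"))
```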

  5. Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

    OpenAIRE

    Boulesteix, Anne-Laure; Janitza, Silke; Kruppa, Jochen; König, Inke R.

    2012-01-01

    The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Specia...

  6. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  7. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.

  8. Experiences with workflows for automating data-intensive bioinformatics.

    Science.gov (United States)

    Spjuth, Ola; Bongcam-Rudloff, Erik; Hernández, Guillermo Carrasco; Forer, Lukas; Giovacchini, Mario; Guimera, Roman Valls; Kallio, Aleksi; Korpelainen, Eija; Kańduła, Maciej M; Krachunov, Milko; Kreil, David P; Kulev, Ognyan; Łabaj, Paweł P; Lampa, Samuel; Pireddu, Luca; Schönherr, Sebastian; Siretskiy, Alexey; Vassilev, Dimitar

    2015-01-01

    High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution. PMID:26282399

  9. Identification and Bioinformatics Analysis of Chalcone Synthase Genes in Tomato

    Institute of Scientific and Technical Information of China (English)

    阮美颖; 杨悦俭; 万红建; 叶青静; 王荣青; 姚祝平; 周国治; 俞锞; 袁伟; 刘云飞

    2013-01-01

    Flavonoids are an important class of secondary metabolites in plants. They occur in fruits, vegetables, beans, tea and many other plants in bound form (flavonoid glycosides) or free form (flavonoid aglycones), and play an important role in regulating plant growth and development. Chalcone synthase (CHS, EC 2.3.1.74) is the first key enzyme of the flavonoid biosynthesis pathway and plays a decisive role in regulating flavonoid biosynthesis and composition. Based on the tomato whole-genome sequence, we used bioinformatics methods to identify the members of the chalcone synthase gene family and analyzed their intron-exon structure, phylogenetic relationships, sequence conservation and chromosomal distribution. Chalcone synthase (SlCHS) is a multigene family with 8 members, whose encoded protein sequences range from 160 (SlCHS05) to 438 (SlCHS08) amino acids. Sequence similarity ranged from 33.7% (SlCHS02 and SlCHS06) to 92.0% (SlCHS04 and SlCHS07), indicating high genetic diversity among these sequences. Structural analysis found that these genes contain few introns (0 to 2), and sequence alignment indicated that they are highly conserved. The genes are unevenly distributed on tomato chromosomes 1, 5, 6, 9 and 12. This study provides a reference for understanding the evolutionary origin of this gene family and lays a foundation for further functional analysis of its members.

  10. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    GAO; George; Fu

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread worldwide and raised public concern. This has increased the output of influenza virus sequence data as well as research publications and other reports. To fight H5N1 avian flu in a comprehensive way, we designed and began setting up the Website for Avian Flu Information (http://www.avian-flu.info) in 2004. In addition to the available influenza virus databases, the website aims to integrate diverse information for both researchers and the public. From 2004 to 2009, we collected information on all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, and clinical practice and diagnosis. Except for publications, all information is in Chinese. By April 15, 2009, the cumulative news entries exceeded 2000 and research papers were approaching 5000. Using the curated data from the Influenza Virus Resource, we have set up an influenza virus sequence database and a bioinformatic platform providing basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of data from the geological information system and avian influenza epidemiology.

  11. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    LIU Di; LIU Quan-He; WU Lin-Huan; LIU Bin; WU Jun; LAO Yi-Mei; LI Xiao-Jing; GAO George Fu; MA Jun-Cai

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread worldwide and raised public concern. This has increased the output of influenza virus sequence data as well as research publications and other reports. To fight H5N1 avian flu in a comprehensive way, we designed and began setting up the Website for Avian Flu Information (http://www.avian-flu.info) in 2004. In addition to the available influenza virus databases, the website aims to integrate diverse information for both researchers and the public. From 2004 to 2009, we collected information on all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, and clinical practice and diagnosis. Except for publications, all information is in Chinese. By April 15, 2009, the cumulative news entries exceeded 2000 and research papers were approaching 5000. Using the curated data from the Influenza Virus Resource, we have set up an influenza virus sequence database and a bioinformatic platform providing basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of data from the geological information system and avian influenza epidemiology.

  12. Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency

    Directory of Open Access Journals (Sweden)

    Yeh Cheng-Yu

    2009-12-01

    Full Text Available Abstract Background Prostate cancer is one of the leading cancers worldwide and is characterized by aggressive metastasis. Reflecting its clinical heterogeneity, prostate cancer displays different stages and grades related to aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes involved in the disease process, the important gene regulations remain unclear. We present a computational method that automatically infers genetic regulatory networks from microarray data, using transcription factor analysis and conditional independence testing, to explore potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. Results To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to estimate the expression values. We applied web services technology to wrap the bioinformatics toolkits and databases so as to automatically extract the promoter regions of DNA sequences and predict the transcription factors that regulate gene expression. We adopted microarray datasets consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as the target dataset to evaluate our method. The predicted results identified possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2) regulated by RUNX1 and STAT3 is correlated to the pathological stage
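    The KNN step for missing microarray values mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the distance measure, the value of k, or any implementation detail, so the function name `knn_impute`, the Euclidean distance over shared samples, and the toy matrix are all assumptions made purely for illustration.

```python
import math


def knn_impute(matrix, k=3):
    """Fill None entries in a genes x samples matrix from the k nearest genes.

    Assumption: distance is the root mean squared difference over the
    samples that both genes have observed. The input is not modified.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Neighbours must have an observed value in the missing column.
            candidates = [
                (distance(row, other), other[j])
                for h, other in enumerate(matrix)
                if h != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                # Unweighted mean of the neighbours' values in that column.
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy data (hypothetical): the first gene lacks a value in the third sample.
expression = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [5.0, 6.0, 9.0],
]
filled = knn_impute(expression, k=2)
```

    With k=2 the missing entry is taken from the two genes whose observed profiles are closest to the incomplete one, so the distant fourth gene does not contribute.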

  13. Determination of the mechanism of action of repetitive halothane exposure on rat brain tissues using a combined method of microarray gene expression profiling and bioinformatics analysis.

    Science.gov (United States)

    Wang, Jiansheng; Yang, Xiaojun; Xiao, Huan; Kong, Jianqiang; Bing, Miao

    2015-12-01

    The present study aimed to investigate the gene expression profiles of rat brain tissues treated with halothane compared with untreated controls to improve current understanding of the mechanism of action of the inhaled anesthetic. The GSE357 gene expression profile was downloaded from the Gene Expression Omnibus database, and included six gene chips of samples repeatedly exposed to halothane and 12 gene chips of untreated controls. The differentially expressed genes (DEGs) between these two groups were identified using the Limma package in the R language. Subsequently, the Database for Annotation, Visualization and Integrated Discovery was used to annotate the function of these DEGs. In addition, the most significantly upregulated gene and downregulated gene were annotated in the FuncBase database to reveal their functional interactions with other associated genes. A total of 44 DEGs were obtained between the control and halothane exposure samples. Following Gene Ontology functional classification, these DEGs were found to be involved predominantly in the circulatory system, regulation of cell proliferation and response to endogenous stimulus and corticosteroid stimulus processes. KRT31 and HMGCS2, which were identified as the most significantly downregulated and upregulated DEGs, respectively, were associated with the lipid metabolic process and T cell activation, respectively. These results provided a basis for the development of improved inhalational anesthetics with minimal side effects and are essential for optimization of inhaled anesthetic techniques for advanced surgical procedures. PMID:26497548

  14. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    Science.gov (United States)

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progress made by Chinese bioinformaticians in the field of PTM bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. By comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China.

  15. Bioinformatics resources for cancer research with an emphasis on gene function and structure prediction tools

    Directory of Open Access Journals (Sweden)

    Daisuke Kihara

    2006-01-01

    Full Text Available The immensely popular fields of cancer research and bioinformatics overlap in many different areas, e.g. large data repositories that allow users to analyze data from many experiments (data handling, databases), pattern mining, microarray data analysis, and interpretation of proteomics data. There are many newly available resources in these areas that may be unfamiliar to most cancer researchers wanting to incorporate bioinformatics tools and analyses into their work, and also to bioinformaticians looking for real data to develop and test algorithms. This review reveals the interdependence of cancer research and bioinformatics, and highlights the most appropriate and useful resources available to cancer researchers. These include not only public databases, but also general and specific bioinformatics tools which can be useful to the cancer researcher. The primary foci are function and structure prediction tools of protein genes. The result is a useful reference for cancer researchers and for bioinformaticians studying cancer alike.

  16. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  17. Wnt-signalling pathways and microRNAs network in carcinogenesis: experimental and bioinformatics approaches.

    Science.gov (United States)

    Onyido, Emenike K; Sweeney, Eloise; Nateri, Abdolrahman Shams

    2016-01-01

    Over the past few years, microRNAs (miRNAs) have not only emerged as integral regulators of gene expression at the post-transcriptional level but also respond to signalling molecules to affect cell function(s). miRNAs crosstalk with a variety of key cellular signalling networks such as Wnt, transforming growth factor-β and Notch, and control stem cell activity in maintaining tissue homeostasis, while their dysregulation contributes to the initiation and progression of cancer. Herein, we overview the molecular mechanism(s) underlying the crosstalk between Wnt-signalling components (canonical and non-canonical) and miRNAs, as well as changes in the miRNA/Wnt-signalling components observed in the different forms of cancer. Furthermore, the fundamental understanding of miRNA-mediated regulation of the Wnt-signalling pathway and vice versa has been significantly improved by high-throughput genomics and bioinformatics technologies. Whilst these approaches have identified a number of specific miRNA(s) that function as oncogenes or tumour suppressors, additional analyses will be necessary to fully unravel the links among conserved cellular signalling pathways and miRNAs and their potential associated components in cancer, thereby creating therapeutic avenues against tumours. Hence, we also discuss the current challenges associated with the Wnt-signalling/miRNAs complex and its analysis using biomedical experimental and bioinformatics approaches. PMID:27590724

  18. Cloning and Bioinformatics Analysis of Lfcin Gene of Datong Yak

    Institute of Scientific and Technical Information of China (English)

    裴杰; 阎萍; 姬国红; 冯瑞林; 梁春年; 郭宪; 曾玉峰; 包鹏甲; 褚敏

    2009-01-01

    [Objective] This study aimed to clone the Lfcin gene from Datong yak, so as to provide a reference for applying this gene in the feed and breeding industries. [Method] Using PCR, the lactoferricin (Lfcin)-encoding gene was obtained from the genome of Datong yak; it was then cloned into the pGEM-T easy vector and sequenced; the sequencing results were subsequently aligned with the dairy cow sequences deposited in GenBank. Moreover, amino acid sequences of the Lfcin gene from various species including yak, dairy cow, human and mouse were used for sequence alignment and phylogenetic analysis. [Result] The second exon of lactoferrin (LF) from Datong yak, which is 778 bp in length, was obtained, within which the coding region of the Lfcin gene is 75 bp (25 amino acid residues); sequence analysis showed a difference of eleven bases between Datong yak and dairy cow; Lfcin proteins from various species shared high homology, of which those from Datong yak and dairy cow were completely identical; phylogenetic analysis showed that the cladogram based on Lfcin was consistent with the evolutionary relationships of the species. [Conclusion] This study laid a foundation for the prokaryotic or eukaryotic expression of the Lfcin gene and for further understanding the activity of the Lfcin protein.

  19. Spermatogenesis-associated proteins at different developmental stages of buffalo testicular seminiferous tubules identified by comparative proteomic analysis.

    Science.gov (United States)

    Huang, Yu-Lin; Fu, Qiang; Pan, Hong; Chen, Fu-Mei; Zhao, Xiu-Ling; Wang, Huan-Jing; Zhang, Peng-Fei; Huang, Feng-Ling; Lu, Yang-Qing; Zhang, Ming

    2016-07-01

    The testicular seminiferous tubules contain Sertoli cells and different types of spermatogenic cells. They provide the microenvironment for spermatogenesis, but the precise molecular mechanism of spermatogenesis is still not well known. Here, we have employed tandem mass tag labeling coupled to LC-MS/MS, a high-throughput quantitative proteomics technology, to explore protein expression in buffalo testicular seminiferous tubules at three different developmental stages (prepuberty, puberty, and postpuberty). The results show 304 differentially expressed proteins with a ≥2-fold change, and bioinformatics analysis indicates that 27 of these may be associated with spermatogenesis. Expression patterns of seven selected proteins were verified via Western blot and quantitative RT-PCR analysis, and the cellular localizations of these proteins were further examined by immunohistochemical or immunofluorescence analysis. Taken together, the results provide potential molecular markers of spermatogenesis and a rich resource for further studies on the regulation of male reproduction. PMID:27173832

  20. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Full Text Available Abstract Background High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge-based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write understandable source code. Results p3d is an object-oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three-dimensional protein structure files (PDB files). p3d's strength arises from the combination of (a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, (b) set theory and (c) functions that allow combining (a) and (b) and that use human-readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.
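    The combination of a space-partitioning tree with set operations described in the abstract above can be sketched as follows. This is not p3d's API: the functions `build_kdtree` and `within`, the atom identifiers and the coordinates are hypothetical, and a k-d tree (one simple kind of binary space partitioning tree) is assumed in place of whatever structure p3d actually implements.

```python
def build_kdtree(points, depth=0):
    """Recursively split (atom_id, (x, y, z)) entries on alternating axes."""
    if not points:
        return None
    axis = depth % 3
    points = sorted(points, key=lambda p: p[1][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def within(node, centre, radius, found=None):
    """Return the set of atom ids no farther than radius from centre."""
    if found is None:
        found = set()
    if node is None:
        return found
    atom_id, coords = node["point"]
    if sum((a - b) ** 2 for a, b in zip(coords, centre)) <= radius ** 2:
        found.add(atom_id)
    delta = centre[node["axis"]] - coords[node["axis"]]
    # Always descend into the half-space containing the centre; cross the
    # splitting plane only when the query sphere reaches it.
    near, far = ("left", "right") if delta < 0 else ("right", "left")
    within(node[near], centre, radius, found)
    if abs(delta) <= radius:
        within(node[far], centre, radius, found)
    return found


# Hypothetical atoms; ids beginning with "O" stand for oxygen atoms.
atoms = [
    ("CA1", (0.0, 0.0, 0.0)),
    ("O1", (1.0, 0.0, 0.0)),
    ("N1", (0.0, 2.0, 0.0)),
    ("O2", (5.0, 5.0, 5.0)),
    ("N2", (0.5, 0.5, 0.5)),
]
tree = build_kdtree(atoms)
oxygens = {atom_id for atom_id, _ in atoms if atom_id.startswith("O")}
near_origin = within(tree, (0.0, 0.0, 0.0), 1.5)
# Set intersection combines the spatial query with a property query.
oxygens_near_origin = near_origin & oxygens
```

    Because the spatial query returns a plain set, it composes with any other selection through union, intersection or difference, which is the kind of combination the abstract attributes to p3d's query functions.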

  1. Bioinformatic Analysis and Prediction of miRNA-122a Target Genes

    Institute of Scientific and Technical Information of China (English)

    蒋永容; 李增鹏; 王东; 王阁; 陈川; 张志敏; 郑继军; 许文; 罗美林; 戴楠; 李梦侠; 杨宇馨

    2011-01-01

    Objective: To analyze miRNA expression in hepatocellular carcinoma HepG2 cells and normal liver epithelial LO2 cells using gene chip technology, and to predict and bioinformatically analyze the target genes of miRNA-122a, which is expressed at low levels in HepG2 cells, so as to provide a theoretical and experimental basis for gene therapy targeting miRNA-122a. Methods: The expression levels of miRNA-122a in HepG2 and LO2 cells were detected using gene chip technology. The target genes of miRNA-122a were predicted bioinformatically and subjected to functional enrichment analysis (gene ontology analysis), signal transduction pathway enrichment analysis, and protein interaction network analysis. Results: Compared with LO2 cells, miRNA-122a expression was low in HepG2 cells. A total of 1,104 miRNA-122a target genes were predicted. The functions of these target genes were enriched in carbohydrate biosynthesis, nucleotide metabolism, cytokine receptor binding, cell cycle, and other biological processes (P<0.001); signal transduction pathways were significantly enriched in the JAK-STAT, Wnt, MAPK, ErbB and cell cycle pathways, among others (P<0.001). Conclusion: miRNA-122a is expressed at low levels in HepG2 cells, and its predicted target genes are significantly enriched in signalling pathways related to tumorigenesis.
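    The abstract above reports enrichment P values without naming the statistical test. One common choice for gene ontology and pathway enrichment is the upper tail of the hypergeometric distribution, sketched here. The function name `hypergeom_enrichment_p` and the toy numbers are assumptions for illustration and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    population: number of background genes
    annotated:  background genes belonging to the pathway or GO term
    sample:     number of predicted target genes
    overlap:    target genes that belong to the pathway or GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 background genes, 5 in the pathway,
# 4 predicted targets, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=2)
```

    In practice the P values from many terms would also be corrected for multiple testing, a step this sketch leaves out.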

  2. Cloning and Bioinformatic Analysis of PGIP Gene from Prunus caoyuan

    Institute of Scientific and Technical Information of China (English)

    于文全; 刘海荣; 杨晓华; 赵恒田

    2012-01-01

    This study aimed to clone the polygalacturonase-inhibiting protein (PGIP) gene of Prunus caoyuan and to analyze it bioinformatically. A DNA fragment of about 1 kb was amplified by PCR from the genomic DNA of Prunus caoyuan leaves with a pair of specific primers designed from the conserved sequences of PGIP genes of the genus Prunus. Sequence analysis showed that the gene is 1 193 bp in full length (GenBank accession: GU068977) and contains one complete open reading frame consisting of two exons interrupted by one intron. The exons total 990 bp and encode 330 amino acids. The deduced amino acid sequence contains a typical leucine-rich repeat. The sequence was 95% to 99% identical to PGIP gene sequences of other Prunus species, including P. salicina, P. armeniaca, P. persica, P. mahaleb and P. mume. The phylogenetic tree showed that genetic relationships were closer within the genus and more distant between genera. The cloned PGIP gene of Prunus caoyuan provides a new gene resource for disease-resistance breeding of cherry.

  3. Cloning and Bioinformatics Analysis of TSP1 and TSP6 Gene of Echinococcus granulosus

    Institute of Scientific and Technical Information of China (English)

    刘田莉; 孟庆玲; 乔军; 陈诚; 马玉; 胡政香; 才学鹏; 陈创夫

    2015-01-01

    To study the function of two important antigen genes of sheep Echinococcus granulosus, tetraspanin 1-TSP1 (TSP1) and tetraspanin 1-TSP6 (TSP6), primers were designed from the E. granulosus genome database in GenBank, and the open reading frame (ORF) sequences of TSP1 and TSP6 were amplified by RT-PCR from hydatid protoscoleces. The PCR products were cloned into the pMD19-T vector, sequenced and analyzed bioinformatically. The results indicated that the TSP1 cDNA contains 792 nucleotides. The deduced protein consists of 263 amino acids and has three N-glycosylation sites and two N-acylation sites. The gene sequence showed about 98.99% identity with the reported TSP1 (EG 11043), and the deduced amino acid sequence showed about 98.48% identity. The TSP6 cDNA contains 666 nucleotides. The deduced protein consists of 221 amino acids and has five N-acylation sites and one tyrosine kinase phosphorylation site. The gene sequence showed about 98.18% identity with the reported TSP6 (EG 00715), and the deduced amino acid sequence showed about 85.07% identity. The bioinformatics analysis of the E. granulosus TSP1 and TSP6 genes, carried out with molecular biology software to predict the structure and epitopes of the protein antigens, lays a foundation for vaccine development.
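    The relationship between the reported ORF lengths and protein lengths in the abstract above can be checked with a short calculation. The sketch assumes that each reported nucleotide count includes the terminal stop codon, which is consistent with the figures given; the helper name `encoded_residues` is hypothetical.

```python
def encoded_residues(orf_length_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length.

    Assumption: the ORF is in frame, so its length is a multiple of three,
    and by default its last codon is a stop codon that encodes no residue.
    """
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_length_nt // 3
    return codons - 1 if includes_stop else codons


tsp1_residues = encoded_residues(792)  # 264 codons, one of them a stop -> 263
tsp6_residues = encoded_residues(666)  # 222 codons, one of them a stop -> 221
```

    Both results match the 263 and 221 amino acids reported for TSP1 and TSP6.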

  4. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    Directory of Open Access Journals (Sweden)

    Gennady Verkhivker

    2013-11-01

    Full Text Available A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding the structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system lack quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to the transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) the kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications in developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery.

  5. Comparative bioinformatics and experimental analysis of the intergenic regulatory regions of Bacillus cereus hbl and nhe enterotoxin operons and the impact of CodY on virulence heterogeneity

    Directory of Open Access Journals (Sweden)

    Maria-Elisabeth Böhm

    2016-05-01

    Full Text Available Bacillus cereus is a food contaminant with greatly varying enteropathogenic potential. Almost all known strains harbor the genes for at least one of the three enterotoxins Nhe, Hbl and CytK. While some strains show no cytotoxicity, others have caused outbreaks, in rare cases even with lethal outcome. The reason for these differences in cytotoxicity is unknown. To gain insight into the origin of enterotoxin expression heterogeneity in different strains, the architecture and role of 5’ intergenic regions (5’IGRs upstream of the nhe and hbl operons was investigated. In silico comparison of 142 strains of all seven phylogenetic groups of B. cereus sensu lato proved the presence of long 5’IGRs upstream of the nheABC and hblCDAB operons, which harbor recognition sites for several transcriptional regulators, including the virulence regulator PlcR, redox regulators ResD and Fnr, the nutrient-sensitive regulator CodY as well as the master regulator for biofilm formation SinR. By determining transcription start sites, unusually long 5’ untranslated regions (5’UTRs upstream of the nhe and hbl start codons were identified, which are not present upstream of cytK-1 and cytK-2. Promoter fusions lacking various parts of the nhe and hbl 5’UTR in B. cereus INRA C3 showed that the entire 331 bp 5’UTR of nhe is necessary for full promoter activity, while the presence of the complete 606 bp hbl 5’UTR lowers promoter activity. Repression was caused by a 268 bp sequence directly upstream of the hbl transcription start. Luciferase activity of reporter strains containing nhe and hbl 5’IGR lux fusions provided evidence that toxin gene transcription is upregulated by the depletion of free amino acids. Electrophoretic mobility shift assays showed that the branched-chain amino acid sensing regulator CodY binds to both nhe and hbl 5’UTR downstream of the promoter, potentially acting as a nutrient-responsive roadblock repressor of toxin gene transcription

  6. Comparative Bioinformatics and Experimental Analysis of the Intergenic Regulatory Regions of Bacillus cereus hbl and nhe Enterotoxin Operons and the Impact of CodY on Virulence Heterogeneity.

    Science.gov (United States)

    Böhm, Maria-Elisabeth; Krey, Viktoria M; Jeßberger, Nadja; Frenzel, Elrike; Scherer, Siegfried

    2016-01-01

    Bacillus cereus is a food contaminant with greatly varying enteropathogenic potential. Almost all known strains harbor the genes for at least one of the three enterotoxins Nhe, Hbl, and CytK. While some strains show no cytotoxicity, others have caused outbreaks, in rare cases even with lethal outcome. The reason for these differences in cytotoxicity is unknown. To gain insight into the origin of enterotoxin expression heterogeneity in different strains, the architecture and role of 5' intergenic regions (5' IGRs) upstream of the nhe and hbl operons was investigated. In silico comparison of 142 strains of all seven phylogenetic groups of B. cereus sensu lato proved the presence of long 5' IGRs upstream of the nheABC and hblCDAB operons, which harbor recognition sites for several transcriptional regulators, including the virulence regulator PlcR, redox regulators ResD and Fnr, the nutrient-sensitive regulator CodY as well as the master regulator for biofilm formation SinR. By determining transcription start sites, unusually long 5' untranslated regions (5' UTRs) upstream of the nhe and hbl start codons were identified, which are not present upstream of cytK-1 and cytK-2. Promoter fusions lacking various parts of the nhe and hbl 5' UTR in B. cereus INRA C3 showed that the entire 331 bp 5' UTR of nhe is necessary for full promoter activity, while the presence of the complete 606 bp hbl 5' UTR lowers promoter activity. Repression was caused by a 268 bp sequence directly upstream of the hbl transcription start. Luciferase activity of reporter strains containing nhe and hbl 5' IGR lux fusions provided evidence that toxin gene transcription is upregulated by the depletion of free amino acids. Electrophoretic mobility shift assays showed that the branched-chain amino acid sensing regulator CodY binds to both nhe and hbl 5' UTR downstream of the promoter, potentially acting as a nutrient-responsive roadblock repressor of toxin gene transcription. PlcR binding sites are

  8. Bioinformatics Analysis of DELLA Proteins in Rosaceous Plants

    Institute of Scientific and Technical Information of China (English)

    宋伟; 李鼎立; 王然; 原永兵; 刘成连; 马春晖

    2013-01-01

    To explore the structural features and phylogenetic relationships of DELLA proteins in rosaceous plants, 19 DELLA proteins from apple, pear, rose and other rosaceous plants were analyzed using ExPASy, the PSORT and PROSITE databases, TMHMM, SignalP 4.0 Server, CDD, DNAMAN and MEGA version 5.3. The results showed that the amino acid composition and physicochemical properties of the 19 DELLA proteins did not differ markedly, and that all were non-transmembrane hydrophilic proteins without signal peptides. Homology among the different DELLA proteins was high, above 74.38%. The proteins have two conserved domains, DELLA and GRAS, and the functional site is GRAS. Homology at the N-terminus of the 19 DELLA proteins was low, but typical DELLA protein motifs such as TVHYNP, VHIID and RVER were present. The phylogenetic tree showed that DELLA proteins were closely related within the genera Pyrus and Rosa, and more distantly related within Malus. The study provides a theoretical basis for research on the genetic evolution of rosaceous plants.

  9. Bioinformatics analysis of NAC gene family in peach

    Institute of Scientific and Technical Information of China (English)

    张春华; 上官凌飞; 俞明亮; 张彦苹; 马瑞娟

    2012-01-01

    The NAC gene family is one of the largest families of plant-specific transcription factors and has attracted wide attention for its diverse roles in plant development and stress responses. To provide basic information for further identification and functional analysis of NAC family genes in peach (Prunus persica), bioinformatic methods were used to predict the number of family members, their distribution on genome scaffolds, their expression patterns, putative protein structures, and subfamily classification. The peach NAC gene family was predicted to contain 115 putative NAC proteins, clustered into 17 subfamilies, with some similarity to the NAC family genes of Arabidopsis thaliana. One NAC gene was located on scaffold 11 and the others on scaffolds 1 to 8 of the peach genome. Primary-structure analysis showed that molecular weight was positively correlated with the number of amino acids, that the great majority of the proteins were hydrophilic, and that isoelectric points followed no regular pattern among subfamilies. Random coil was the main secondary-structure element of all 115 proteins, and most of their tertiary structures were similar. The largest share of NAC family genes was expressed in the pericarp (75%), and the smallest in flower buds (1%).

  10. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    Science.gov (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  11. Bioinformatics Analysis of PIN-formed Family in Oryza sativa

    Institute of Scientific and Technical Information of China (English)

    丁懿; 石彩娟; 王万军

    2012-01-01

    The PIN family comprises the most important auxin efflux carriers in plants and has many members in nearly all higher plants. A BLAST search against the Oryza sativa genome identified 12 PIN genes. Our analysis shows that the PIN genes are unevenly distributed across the rice chromosomes and that their exon-intron structures are similar. PIN proteins have the sequence features typical of carrier proteins, namely an alternating hydrophilic/hydrophobic profile and multiple transmembrane domains. Motif analysis suggests that almost all PIN proteins contain an NPNXY motif, a well-characterized internalization motif, and seven other uncharacterized motifs. Neighbor-joining phylogenetic trees show that the PIN family split into two major groups at an early stage, distinguished by the length of the central hydrophilic loop, and that later lineage-specific duplications in Oryza sativa produced a set of paralogous genes.

  12. Bioinformatic Analysis of UXS Gene Family in Arabidopsis and Rice

    Institute of Scientific and Technical Information of China (English)

    潘玉欣; 王巍杰; 胡金山

    2011-01-01

    In this study, six UDP-glucuronate decarboxylase (UXS) genes each from Arabidopsis and rice were analyzed in terms of gene structure, conserved domains, gene expression, and phylogeny. The results showed that all 12 UXS genes had introns. All were expressed in roots, leaves, and calluses, except OsUXS1, which was expressed only in calluses. The 12 UXS proteins were hydrophilic, with extensive hydrophilic regions, and had highly conserved structures containing the family domains 3Beta-HSD and NAD-binding. The 12 UXS proteins fell into two subfamilies, and within a subfamily their structures and functions were similar. The comprehensive analysis revealed that UXS is a multigene family whose genes are widely expressed and structurally conserved.

  13. Provenance of e-Science Experiments - experience from Bioinformatics

    OpenAIRE

    Greenwood, M.; Goble, C.A.; Stevens, R. D.; Zhao, J.(Central China Normal University (HZNU), Wuhan, 430079, China); Addis, M; Marvin, D; Moreau, L; Oinn, T.

    2003-01-01

    Like experiments performed at a laboratory bench, the data associated with an e-Science experiment are of reduced value if other scientists are not able to identify the origin, or provenance, of those data. Provenance information is essential if experiments are to be validated and verified by others, or even by those who originally performed them. In this article, we give an overview of our initial work on the provenance of bioinformatics e-Science experiments within myGrid. We use two kinds ...

  14. Macrobrachium rosenbergii mannose binding lectin: synthesis of MrMBL-N20 and MrMBL-C16 peptides and their antimicrobial characterization, bioinformatics and relative gene expression analysis.

    Science.gov (United States)

    Arockiaraj, Jesu; Chaurasia, Mukesh Kumar; Kumaresan, Venkatesh; Palanisamy, Rajesh; Harikrishnan, Ramasamy; Pasupuleti, Mukesh; Kasi, Marimuthu

    2015-04-01

    Mannose-binding lectin (MBL), an antimicrobial protein, is an important component of the innate immune system that recognizes repetitive sugar groups on the surface of bacteria and viruses, leading to activation of the complement system. In this study, we report a complete molecular characterization of the cDNA encoding MBL from the freshwater prawn Macrobrachium rosenbergii (Mr). Two short peptides (MrMBL-N20: (20)AWNTYDYMKREHSLVKPYQG(39) and MrMBL-C16: (307)GGLFYVKHKEQQRKRF(322)) were synthesized from the MrMBL polypeptide. The purity of the MrMBL-N20 (89%) and MrMBL-C16 (93%) peptides was confirmed by MS analysis (MALDI-ToF). The purified peptides were used for further antimicrobial characterization, including a minimum inhibitory concentration (MIC) assay, kinetics of bactericidal efficiency, and analysis of hemolytic capacity. The peptides exhibited antimicrobial activity towards all the Gram-negative bacteria tested, but towards only a few of the Gram-positive bacteria. MrMBL-C16 inhibited both Gram-negative and Gram-positive bacteria more strongly than MrMBL-N20. Neither peptide inhibited Bacillus spp. The kinetics of bactericidal efficiency showed that the peptides drastically reduced the number of surviving bacterial colonies after 24 h of incubation. Both peptides showed strong hemolytic activity at higher concentrations, with MrMBL-C16 more active than MrMBL-N20. Overall, the results indicated that the peptides can be used as bactericidal agents. The MrMBL protein sequence was characterized using various bioinformatics tools, including phylogenetic analysis and structure prediction. We also report the MrMBL gene expression pattern upon viral and bacterial infection in M. rosenbergii gills. It could be concluded that the prawn MBL may be one of the important molecules which

  15. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    Science.gov (United States)

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are heritable genetic elements found in a variety of archaea and bacteria; they reflect bacterial ecological adaptation and confer acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen causing anthroponosis. This study aimed to analyze the features of Shigella CRISPR structures and classify the spacers using a bioinformatics approach. Among 107 Shigella strains, 434 CRISPR loci were identified, with two to seven loci per strain. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. Deletions of cas genes were found in 88 strains, and one strain contained no cas gene. Intact clusters of cas genes were found in 19 strains. Based on comprehensive analysis of sequence signatures and BLAST and CRISPRTarget scores, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Type I spacers were those linked with one gene segment, Type II spacers were those linked with two or more different gene segments, and Type III spacers were undefined. This study examined the diversity of the CRISPR/cas system in Shigella strains and demonstrated the main features of CRISPR structure and spacer classification, providing critical information for elucidating the mechanisms of spacer formation and exploring the role spacers play in the function of the CRISPR/cas system.
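
    The three-way spacer classification described in this abstract can be sketched as a small rule. This is an illustration only, not the authors' pipeline: the function name and the input format (a list of gene-segment labels matched by a spacer) are hypothetical, and treating a spacer with no matched segment as Type III is an assumption about what "undefined" means.

```python
def classify_spacer(matched_segments):
    """Classify a spacer by the number of distinct gene segments it matches.

    matched_segments: labels of gene segments linked to the spacer
    (hypothetical input format; duplicates are counted once).
    """
    distinct = set(matched_segments)
    if len(distinct) == 1:
        return "Type I"    # linked with exactly one gene segment
    if len(distinct) >= 2:
        return "Type II"   # linked with two or more different gene segments
    return "Type III"      # assumption: no linked segment means undefined
```

    Called on a list such as ["segA", "segB"], the function returns "Type II"; the labels themselves are placeholders.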

  16. Bioinformatics Analysis on the Coding Region of BPI Gene among Thirteen Different Species

    Institute of Scientific and Technical Information of China (English)

    吴正常; 苏先敏; 王瑾; 郑先瑞; 吴圣龙; 包文斌

    2013-01-01

    This study used bioinformatic methods to compare the coding region (CDS) sequences of the BPI gene among species. The genetic diversity of the BPI coding region was analyzed in 13 species: Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, Nomascus leucogenys, Gallus gallus, Oryctolagus cuniculus, Sus scrofa, Macaca mulatta, Equus caballus, Xenopus laevis, Pongo abelii, and dog. The amino acid composition, signal peptide, hydrophobicity/hydrophilicity, transmembrane structure, secondary structure, and conserved domains of the BPI protein were predicted. In 29 BPI CDS sequences from the 13 species, 532 polymorphic sites were detected and 15 haplotypes were generated. The BPI gene showed considerable genetic variation both within and among species and had strong codon bias. The theoretical isoelectric points of the BPI proteins were all above 7, so the proteins were basic; most had an N-terminal signal peptide; the peptide chains were hydrophilic; and the proteins were mostly transmembrane and secreted proteins. The main secondary-structure elements of the BPI protein were α-helix, β-sheet, and random coil, and there were two conserved domains, BPI1 and BPI2.

  17. Assessing Reliability of Cellulose Hydrolysis Models to Support Biofuel Process Design – Identifiability and Uncertainty Analysis

    DEFF Research Database (Denmark)

    Sin, Gürkan; Meyer, Anne S.; Gernaey, Krist

    2010-01-01

    The reliability of cellulose hydrolysis models is studied using the NREL model. An identifiability analysis revealed that only 6 out of 26 parameters are identifiable from the available data (typical hydrolysis experiments). Attempting to identify a higher number of parameters (as done in the ori...... to analyze the uncertainty of model predictions. This allows judging the fitness of the model to the purpose under uncertainty. Hence we recommend uncertainty analysis as a proactive solution when faced with model uncertainty, which is the case for biofuel process development research....

  18. Cloning and bioinformatics analysis of hERGIC3, a novel lung cancer-related gene

    Institute of Scientific and Technical Information of China (English)

    耿娜娜; 吴明松; 郑翔; 刘兴宇; 李学英

    2014-01-01

    Objective: To construct a prokaryotic expression vector for the hERGIC3 gene and analyze it bioinformatically, in order to gain a full understanding of the biological function of the hERGIC3 protein. Methods: The open reading frame (ORF) DNA sequence of human hERGIC3 was obtained by real-time semi-quantitative polymerase chain reaction (RT-PCR) and inserted into the pGEM-T Easy vector; several software tools were used for bioinformatics analysis of the hERGIC3 protein. Results: Enzyme digestion showed that the inserted sequence was the target sequence, confirming that the hERGIC3 gene had been cloned successfully. Bioinformatics analysis showed that the hERGIC3 protein consists of 383 amino acid residues, with a relative molecular mass of 43.2 × 10^3 and a theoretical isoelectric point of 5.68, and is relatively stable. hERGIC3 is a transmembrane protein with transmembrane regions at residues 20-42 and 345-367, extramembrane regions at 1-19 and 368-383, and an intramembrane region at 43-344. The secondary structure contains 110 α-helices, 94 extended strands, and 184 random coils, and there are 15 potential phosphorylation sites. hERGIC3 may be involved in the biosynthesis of amino acids and coenzymes, in fat and energy metabolism, and in protein translation and transport. hERGIC3 contains two conserved protein domains, ERGIC_N and COPIIcoated_ERV, and may interact with proteins such as PPKCSH, ERGIC1, ERGIC2, COPA, and PSMD11. Conclusion: The hERGIC3 gene was cloned successfully, and its structure and function were analyzed in depth, laying a theoretical foundation for further study of the pathophysiological function of hERGIC3 in lung cancer.

  19. Expression and Bioinformatic Analysis of Ornithine Aminotransferase in Non-small Cell Lung Cancer

    Institute of Scientific and Technical Information of China (English)

    周丹菲; 程西安; 杨拴盈; 明宗娟; 李维; 张秋红; 张玉萍

    2012-01-01

    Background and objective: Existing studies indicate that ornithine aminotransferase (OAT) may play an important role in the oncogenesis and progression of numerous malignant tumors. The aim of this study is to detect the mRNA and protein expression of OAT in non-small cell lung cancer (NSCLC), as well as to analyze its bioinformatic features and binary interactions. Methods: OAT mRNA expression was detected in A549 and 16HBE cell lines by reverse transcription-polymerase chain reaction. OAT protein expression was determined in 55 cases of NSCLC and 17 cases of adjacent non-tumor lung tissue by immunohistochemical staining. The bioinformatic features and binary interactions of OAT were analyzed. Gene ontology annotation and signal pathway analysis were performed. Results: OAT mRNA expression in A549 cells was 2.85-fold lower than that in 16HBE cells. OAT protein expression was significantly higher in NSCLC tissues than in adjacent non-tumor lung tissues. OAT protein expression differed significantly between squamous cell lung cancer and adenocarcinoma (P<0.05) but was not correlated with gender, age, lymph node metastasis, tumor size, or TNM stage. Bioinformatic analysis suggested that OAT is a highly homologous and stable protein located in the mitochondria. An aminotran-3 domain and several phosphorylation sites, which may function in signal transduction, gene transcription, and molecular transit, were found. Among the 54 selected binary interactions of OAT, TNF and TRAF6 play roles in the NF-kB pathway. Conclusion: OAT may play an important role in the oncogenesis and progression of NSCLC. Thus, OAT may be a novel biomarker for the diagnosis of NSCLC or a new target for its treatment.

  20. A Numerical Procedure for Model Identifiability Analysis Applied to Enzyme Kinetics

    DEFF Research Database (Denmark)

    Daele, Timothy, Van; Van Hoey, Stijn; Gernaey, Krist;

    2015-01-01

    structure evaluation by assessing the local identifiability characteristics of the parameters. Moreover, such a procedure should be generic to make sure it can be applied independently of the structure of the model. We hereby apply a numerical identifiability approach which is based on the work of Walter...... and Pronzato (1997) and which can be easily set up for any type of model. In this paper the proposed approach is applied to the forward reaction rate of the enzyme kinetics proposed by Shin and Kim (1998). Structural identifiability analysis showed that no local structural model problems were occurring....... In contrast, the practical identifiability analysis revealed that high values of the forward rate parameter Vf led to identifiability problems. These problems were even more pronounced at higher substrate concentrations, which illustrates the importance of a proper experimental design to avoid...

  1. Bioinformatics Analysis of Non-structural Protein 2 of PRRSV

    Institute of Scientific and Technical Information of China (English)

    孙荡; 高歌; 周胜; 鲍梦雅; 茅翔

    2012-01-01

    [Objective] The research aimed to provide a theoretical basis for constructing antagonistic small peptides against PRRSV and for related vaccine design in the clinic. [Method] Bioinformatic analysis was performed on the amino acid sequence of non-structural protein 2 (Nsp2) of a PRRSV strain isolated in China (FJ 175688.1), retrieved from NCBI. Its composition, physical and chemical characteristics, transmembrane domains, secondary structure, glycosylation sites, phosphorylation sites, and B-cell epitopes were predicted. [Result] Nsp2 contained 7 transmembrane regions, 4 glycosylation sites, and 76 phosphorylation sites. Random coil was the most abundant secondary-structure element (57.64%). Comprehensive analysis showed that Nsp2 had a large number of epitopes. [Conclusion] The bioinformatic analysis of Nsp2 could provide a theoretical basis for PRRSV vaccine design.

  2. Consolidating metabolite identifiers to enable contextual and multi-platform metabolomics data analysis

    Directory of Open Access Journals (Sweden)

    Saito Kazuki

    2010-04-01

    Full Text Available Abstract Background Analysis of data from high-throughput experiments depends on the availability of well-structured data that describe the assayed biomolecules. Procedures for obtaining and organizing such meta-data on genes, transcripts and proteins have been streamlined in many data analysis packages, but are still lacking for metabolites. Chemical identifiers are notoriously incoherent, encompassing a wide range of different referencing schemes with varying scope and coverage. Online chemical databases use multiple types of identifiers in parallel but lack a common primary key for reliable database consolidation. Connecting identifiers of analytes found in experimental data with the identifiers of their parent metabolites in public databases can therefore be very laborious. Results Here we present a strategy and a software tool for integrating metabolite identifiers from local reference libraries and public databases that do not depend on a single common primary identifier. The program constructs groups of interconnected identifiers of analytes and metabolites to obtain a local metabolite-centric SQLite database. The created database can be used to map in-house identifiers and synonyms to external resources such as the KEGG database. New identifiers can be imported and directly integrated with existing data. Queries can be performed in a flexible way, both from the command line and from the statistical programming environment R, to obtain data set tailored identifier mappings. Conclusions Efficient cross-referencing of metabolite identifiers is a key technology for metabolomics data analysis. We provide a practical and flexible solution to this task and an open-source program, the metabolite masking tool (MetMask), available at http://metmask.sourceforge.net, that implements our ideas.
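
    The idea of building groups of interconnected identifiers without a single primary key can be sketched as connected components over identifier pairs. The following is a minimal sketch under stated assumptions, not MetMask's implementation or API: the function name, the pair-based input format, and the example identifiers are all hypothetical.

```python
from collections import defaultdict


def group_identifiers(links):
    """Group identifiers that are connected, directly or indirectly.

    links: iterable of (identifier_a, identifier_b) pairs, e.g. an in-house
    name paired with a database accession (hypothetical input format).
    Returns a sorted list of sorted identifier groups.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for ident in parent:
        groups[find(ident)].add(ident)
    return sorted(sorted(group) for group in groups.values())


# Placeholder identifiers, invented for illustration only.
example_links = [
    ("local:A1", "db1:X100"),
    ("db1:X100", "db2:Y7"),
    ("local:B2", "db1:X200"),
]
example_groups = group_identifiers(example_links)
```

    Here "local:A1" and "db2:Y7" end up in the same group although no pair links them directly, which is the property that makes a common primary key unnecessary.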

  3. Evolution of web services in bioinformatics

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Leunissen, J.A.M.

    2005-01-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformatic

  4. A bioinformatics approach to marker development

    NARCIS (Netherlands)

    Tang, J.

    2008-01-01

    The thesis focuses on two bioinformatics research topics: the development of tools for an efficient and reliable identification of single nucleotide polymorphisms (SNPs) and polymorphic simple sequence repeats (SSRs) from expressed sequence tags (ESTs) (Chapters 2, 3 and 4), and the subsequent imple

  5. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  6. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    NARCIS (Netherlands)

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of pr
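
    As a software illustration of the alignment task defined in this abstract, the sketch below computes a global alignment score by dynamic programming in the style of Needleman-Wunsch. It is not taken from the thesis and says nothing about its hardware designs; the function name and the scoring values (match, mismatch, linear gap penalty) are arbitrary assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, so memory is proportional to len(b).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

    With these scores, aligning "ACGT" against "AGT" gives 1: three matches and one gap.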

  7. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable. PMID:26473029

  8. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  9. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  10. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  11. Proteomic Analysis to Identify Tightly-Bound Cell Wall Protein in Rice Calli

    OpenAIRE

    Cho, Won Kyong; Hyun, Tae Kyung; Kumar, Dhinesh; Rim, Yeonggil; Chen, Xiong Yan; Jo, Yeonhwa; Kim, Suwha; Lee, Keun Woo; Park, Zee-Yong; Lucas, William J.; Kim, Jae-Yean

    2015-01-01

    Rice is a model plant widely used for basic and applied research programs. Plant cell wall proteins play key roles in a broad range of biological processes. However, knowledge of the rice cell wall proteome is still rudimentary. In the present study, the tightly-bound cell wall proteome of cultured rice callus cells was characterized using sequential extraction protocols together with mass spectrometry and bioinformatics methods, leading to the identification of 1568 candidate proteins. Ba...

  12. Biochemical, Transcriptional, and Bioinformatic Analysis of Lipid Droplets from Seeds of Date Palm (Phoenix dactylifera L.) and Their Use as Potent Sequestration Agents against the Toxic Pollutant, 2,3,7,8-Tetrachlorinated Dibenzo-p-Dioxin.

    Science.gov (United States)

    Hanano, Abdulsamie; Almousally, Ibrahem; Shaban, Mouhnad; Rahman, Farzana; Blee, Elizabeth; Murphy, Denis J

    2016-01-01

    Contamination of aquatic environments with dioxins, the most toxic group of persistent organic pollutants (POPs), is a major ecological issue. Dioxins are highly lipophilic and bioaccumulate in fatty tissues of marine organisms used for seafood where they constitute a potential risk for human health. Lipid droplets (LDs) purified from date palm, Phoenix dactylifera, seeds were characterized and their capacity to extract dioxins from aquatic systems was assessed. The bioaffinity of date palm LDs toward 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), the most toxic congener of dioxins, was determined. Fractionated LDs were spheroidal with mean diameters of 2.5 µm, enclosing an oil-rich core of 392.5 mg/mL. Isolated LDs did not aggregate and/or coalesce unless placed in acidic media and were strongly associated with three major groups of polypeptides of relative mass 32-37, 20-24, and 16-18 kDa. These masses correspond to the LD-associated proteins, oleosins, caleosins, and steroleosins, respectively. Efficient partitioning of TCDD into LDs occurred with a coefficient of log K_LB/w,TCDD = 7.528 ± 0.024; it was optimal at neutral pH and was dependent on the presence of the oil-rich core, but was independent of the presence of LD-associated proteins. Bioinformatic analysis of the date palm genome revealed nine oleosin-like, five caleosin-like, and five steroleosin-like sequences, with predicted structures having putative lipid-binding domains that match their LD stabilizing roles and use as bio-based encapsulation systems. Transcriptomic analysis of date palm seedlings exposed to TCDD showed strong up-regulation of several caleosin and steroleosin genes, consistent with increased LD formation. The results suggest that the plant LDs could be used in ecological remediation strategies to remove POPs from aquatic environments. Recent reports suggest that several fungal and algal species also use LDs to sequester both external and internally derived hydrophobic toxins, which

  14. Biochemical, Transcriptional, and Bioinformatic Analysis of Lipid Droplets from Seeds of Date Palm (Phoenix dactylifera L.) and Their Use as Potent Sequestration Agents against the Toxic Pollutant, 2,3,7,8-Tetrachlorinated Dibenzo-p-Dioxin

    Science.gov (United States)

    Hanano, Abdulsamie; Almousally, Ibrahem; Shaban, Mouhnad; Rahman, Farzana; Blee, Elizabeth; Murphy, Denis J.

    2016-01-01

    Contamination of aquatic environments with dioxins, the most toxic group of persistent organic pollutants (POPs), is a major ecological issue. Dioxins are highly lipophilic and bioaccumulate in fatty tissues of marine organisms used for seafood where they constitute a potential risk for human health. Lipid droplets (LDs) purified from date palm, Phoenix dactylifera, seeds were characterized and their capacity to extract dioxins from aquatic systems was assessed. The bioaffinity of date palm LDs toward 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), the most toxic congener of dioxins, was determined. Fractionated LDs were spheroidal with mean diameters of 2.5 µm, enclosing an oil-rich core of 392.5 mg/mL. Isolated LDs did not aggregate and/or coalesce unless placed in acidic media and were strongly associated with three major groups of polypeptides of relative mass 32–37, 20–24, and 16–18 kDa. These masses correspond to the LD-associated proteins, oleosins, caleosins, and steroleosins, respectively. Efficient partitioning of TCDD into LDs occurred with a coefficient of log K_LB/w,TCDD = 7.528 ± 0.024; it was optimal at neutral pH and was dependent on the presence of the oil-rich core, but was independent of the presence of LD-associated proteins. Bioinformatic analysis of the date palm genome revealed nine oleosin-like, five caleosin-like, and five steroleosin-like sequences, with predicted structures having putative lipid-binding domains that match their LD stabilizing roles and use as bio-based encapsulation systems. Transcriptomic analysis of date palm seedlings exposed to TCDD showed strong up-regulation of several caleosin and steroleosin genes, consistent with increased LD formation. The results suggest that the plant LDs could be used in ecological remediation strategies to remove POPs from aquatic environments. Recent reports suggest that several fungal and algal species also use LDs to sequester both external and internally derived hydrophobic toxins

  15. Sequence Determination and Bioinformatics Analysis of ESTs from Chinese Gloydius shedaoensis shedaoensis Venom Gland

    Institute of Scientific and Technical Information of China (English)

    郭春梅; 孙明忠; 郑体花; 任一鑫; 刘淑清

    2012-01-01

    We previously constructed a cDNA library from the venom gland (GSSG) of the Chinese snake Gloydius shedaoensis shedaoensis (GSS). In the present work, 216 single clones were randomly picked from the positive recombinants of the GSSG-cDNA library and subjected to single-pass sequencing of expressed sequence tags (ESTs) from the 5' end, yielding 211 high-quality ESTs. Sequence alignment showed that 84 ESTs could be annotated as genes of known function, 29 as genes of unknown function, and the remaining 98 as novel genes, accounting for 39.8%, 13.7% and 46.5% of the 211 ESTs, respectively. The partial EST sequences of GSSG obtained here provide a basis for cloning and expressing the genes of active GSS protein components and for studying their functions.

  16. Bioinformatic Analysis and Subcellular Localization of Solanum lycopersicum ARF2

    Institute of Scientific and Technical Information of China (English)

    冯媛媛; 侯佩; 李颖楠; 刘永胜

    2012-01-01

    Auxin response factors (ARFs) are important transcription factors in the auxin signal transduction pathway. To elucidate the function of tomato (Solanum lycopersicum) ARF2, we isolated the SlARF2 gene, analyzed its molecular features, and observed the subcellular localization of ARF2 in transgenic tomato plants. Physicochemical properties and molecular features of ARF2 were predicted by bioinformatic approaches including physical and chemical property analysis, hydrophobicity analysis, domain analysis, phylogenetic tree analysis and subcellular localization analysis. The full-length SlARF2 gene was amplified by RT-PCR from tomato fruit cDNA, and a binary expression vector, pBA-ARF2-YFP, fusing ARF2 to the yellow fluorescent protein (YFP) coding sequence was constructed. The recombinant vector was introduced into wild-type tomato by Agrobacterium-mediated transformation; T1 transgenic seeds were germinated, and root tips were examined by fluorescence microscopy to observe the distribution of the fusion protein in living cells. Bioinformatic analysis showed that SlARF2 is a soluble protein rich in Ser, Leu, Gly and Pro that carries the typical domains of the ARF family; its amino acid sequence shares 70.08%, 66.94% and 60.87% homology with those of grape, cassava and Arabidopsis, respectively. Restriction digestion and sequencing confirmed that the pBA-ARF2-YFP fusion expression vector was constructed successfully, and PCR analysis showed that the fusion protein was expressed in the transgenic plants. Fluorescence microscopy showed that ARF2 localizes to the nucleus, indicating that the transcription factor SlARF2 is nuclear and plays an important role in tomato fruit development and ripening.

  17. Comparative QTL mapping of resistance to sugarcane mosaic virus in maize based on bioinformatics

    Institute of Scientific and Technical Information of China (English)

    Xiangling LÜ; Xinhai LI; Chuanxiao XIE; Zhuanfang HAO; Hailian JI; Liyu SHI; Shihuang ZHANG

    2008-01-01

    The development of genomics and bioinformatics offers new tools for comparative gene mapping. In this paper, an integrated QTL map for sugarcane mosaic virus (SCMV) resistance in maize was constructed by compiling a total of 81 QTL loci available, using the Genetic Map IBM2 2005 Neighbors as reference. These 81 QTL loci were scattered on 7 chromosomes of maize, and most of them were clustered on chromosomes 3 and 6. By using the method of meta-analysis, we identified one "consensus QTL" on chromosome 3 covering a genetic distance of 6.44 cM, and two on chromosome 6 covering genetic distances of 16 cM and 27.48 cM, respectively. Four positional candidate resistance genes were identified within the "consensus QTL" on chromosome 3 via the strategy of comparative genomics. These results suggest that combining meta-analysis within a species with sequence homology comparison in a related model plant is an efficient approach to identify the major QTL and its candidate gene(s) for the target traits. The results of this study provide useful information for identifying and cloning the major gene(s) conferring resistance to SCMV in maize.

  18. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.;

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...

  19. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  20. Prediction and Bioinformatics Analysis of Human Gene Expression Profiling Regulated by Amifostine

    Institute of Scientific and Technical Information of China (English)

    杨波; 脱朝伟; 蔡力力; 迟小华; 卢学春; 张峰; 脱帅; 朱宏丽; 刘丽宏; 严江伟

    2011-01-01

    The objective of this study was to perform a bioinformatics analysis of the gene expression profile regulated by amifostine and to predict its novel potential biological functions, so as to provide a direction and study methods for further exploring the pharmacological actions of amifostine. "Amifostine" was used as a keyword to search free internet-based gene expression databases, including GEO, the Affymetrix gene chip database, GenBank, SAGE, GeneCard, InterPro, ProtoNet, UniProt and BLOCKS, and the retrieved amifostine-regulated gene expression profiling data were subjected to validity testing, differential expression analysis, functional clustering and gene annotation. Only one amifostine-regulated gene expression profiling data set was retrieved, from the GEO database (accession: GSE3212). After validity testing and differential expression analysis, a significant difference (p < 0.01) was found in only 2.14% of the whole genome (460/192000). Gene annotation showed that 139 of the 460 genes were known genes, of which 77 were up-regulated and 62 were down-regulated. Of the 139 genes, 13 were newly expressed following amifostine treatment of K562 cells, whereas expression of 5 genes was completely inhibited. Functional clustering divided the 139 genes into 11 categories, with biological functions involving hematopoietic and immunologic regulation, apoptosis and the cell cycle. It is concluded that bioinformatics methods can be applied to the analysis of gene expression profiles regulated by amifostine. Amifostine has a regulatory effect on the human gene expression profile, mainly in biological processes including hematopoiesis, immunologic regulation, apoptosis and the cell cycle. The effect of amifostine on human gene expression needs to be verified further under experimental conditions.

  1. Comparative analysis of Salmonella genomes identifies a metabolic network for escalating growth in the inflamed gut.

    Science.gov (United States)

    Nuccio, Sean-Paul; Bäumler, Andreas J

    2014-03-18

    The Salmonella genus comprises a group of pathogens associated with illnesses ranging from gastroenteritis to typhoid fever. We performed an in silico analysis of comparatively reannotated Salmonella genomes to identify genomic signatures indicative of disease potential. By removing numerous annotation inconsistencies and inaccuracies, the process of reannotation identified a network of 469 genes involved in central anaerobic metabolism, which was intact in genomes of gastrointestinal pathogens but degrading in genomes of extraintestinal pathogens. This large network contained pathways that enable gastrointestinal pathogens to utilize inflammation-derived nutrients as well as many of the biochemical reactions used for the enrichment and biochemical discrimination of Salmonella serovars. Thus, comparative genome analysis identifies a metabolic network that provides clues about the strategies for nutrient acquisition and utilization that are characteristic of gastrointestinal pathogens. IMPORTANCE While some Salmonella serovars cause infections that remain localized to the gut, others disseminate throughout the body. Here, we compared Salmonella genomes to identify characteristics that distinguish gastrointestinal from extraintestinal pathogens. We identified a large metabolic network that is functional in gastrointestinal pathogens but decaying in extraintestinal pathogens. While taxonomists have used traits from this network empirically for many decades for the enrichment and biochemical discrimination of Salmonella serovars, our findings suggest that it is part of a "business plan" for growth in the inflamed gastrointestinal tract. By identifying a large metabolic network characteristic of Salmonella serovars associated with gastroenteritis, our in silico analysis provides a blueprint for potential strategies to utilize inflammation-derived nutrients and edge out competing gut microbes.

  2. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics.

    Science.gov (United States)

    He, Yongqun

    2012-01-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  3. Bioinformatics Mining and Modeling Methods for the Identification of Disease Mechanisms in Neurodegenerative Disorders

    Directory of Open Access Journals (Sweden)

    Martin Hofmann-Apitius

    2015-12-01

    Since the decoding of the Human Genome, techniques from bioinformatics, statistics, and machine learning have been instrumental in uncovering patterns in increasing amounts and types of different data produced by technical profiling technologies applied to clinical samples, animal models, and cellular systems. Yet, progress on unravelling the biological mechanisms that causally drive diseases has been limited, in part due to the inherent complexity of biological systems. Whereas we have witnessed progress in the areas of cancer, cardiovascular and metabolic diseases, the area of neurodegenerative diseases has proved to be very challenging. This is in part because the aetiology of neurodegenerative diseases such as Alzheimer's disease or Parkinson's disease is unknown, rendering it very difficult to discern early causal events. Here we describe a panel of bioinformatics and modeling approaches that have recently been developed to identify candidate mechanisms of neurodegenerative diseases based on publicly available data and knowledge. We identify two complementary strategies: data mining techniques that use genetic data as a starting point, to be further enriched with other data types, or, alternatively, encoding prior knowledge about disease mechanisms in a model-based framework supporting reasoning and enrichment analysis. Our review illustrates the challenges entailed in integrating heterogeneous, multiscale and multimodal information in the area of neurology in general and neurodegeneration in particular. We conclude that progress would be accelerated by increasing efforts on performing systematic collection of multiple data types over time from each individual suffering from neurodegenerative disease. The work presented here has been driven by project AETIONOMY; a project funded in the course of the Innovative Medicines Initiative (IMI; which is a public-private partnership of the European Federation of Pharmaceutical Industry Associations

  4. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yongqun He

    2012-02-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of ten classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  5. Bioinformatic analysis of Musca domestica cecropin-human lysozyme fusion protein

    Institute of Scientific and Technical Information of China (English)

    卢雪梅; 金小宝; 朱家勇; 黄演婷

    2012-01-01

    Objective To analyze the amino acid sequence deduced from the Musca domestica cecropin-human lysozyme (Mdc-HLY) fusion gene by bioinformatics methods. Methods The physicochemical properties, hydrophobicity/hydrophilicity, signal peptide, conserved domains and protein secondary structure of Mdc-HLY were predicted with ProtParam Tool, CDD, ProtScale, SOPMA and other software. Results Mdc-HLY is a cationic molecule composed of 187 amino acids, with a predicted molecular weight of 20 131.7, a theoretical isoelectric point (pI) of 9.69 and the molecular formula C862H1375N277O260S11. The fusion protein contains the conserved domains of both the cecropin family and the lysozyme family. The instability index classified the protein as stable, and the half-life prediction indicated that it is suitable for expression by genetic engineering. The secondary structure of Mdc-HLY consists mainly of α-helix, β-sheet, β-turn and random coil. Conclusion These results lay a foundation for prokaryotic expression of the Mdc-HLY fusion gene and for studying the biological function of the expression product.

  6. Cloning and bioinformatics analysis of transferrin gene of Musca domestica (Housefly)

    Institute of Scientific and Technical Information of China (English)

    龚晓林; 张洁; 李显航; 刘红美

    2012-01-01

    To study the function of the transferrin gene of Musca domestica (housefly), its full-length cDNA sequence was obtained and the encoded protein sequence was analyzed bioinformatically. Online analysis programs and related software tools were used to analyze the open reading frame (ORF) of the transferrin gene and the physicochemical properties and domains of the encoded protein, and to predict its spatial structure and function. The transferrin protein consists of 622 amino acids, with a molecular weight of 70.58 kDa and a theoretical isoelectric point of 5.33. The protein is stable, has a transmembrane region and a signal peptide, and belongs to the TR-FER conserved domain family. It is predicted to localize to the nucleus, and its secondary structure is dominated by random coils. The ProtFun result indicated that the protein has enzymatic activity.

  7. Cloning and Bioinformatics Analysis of zwf Gene in Streptomyces rimosus

    Institute of Scientific and Technical Information of China (English)

    逄春梅; 张萍; 石彦鹏; 王小娟

    2015-01-01

    The full-length cDNA sequence of the zwf gene was cloned from Streptomyces rimosus strain K-16 using RT-PCR, nested PCR, 3′-RACE and 5′-RACE. Sequencing showed that the full-length zwf cDNA, assembled with Vector NTI 11.0, is 1,679 bp long, carries a very short poly(A) tail, and contains a 1,347 bp open reading frame (ORF) encoding a protein of 449 amino acid residues. Blast2GO analysis showed that the gene encodes glucose-6-phosphate dehydrogenase, and KEGG pathway annotation identified this enzyme as a key enzyme in glutathione metabolism and the pentose phosphate pathway.

  8. SNPs Detection and Bioinformatics Analysis on 5' Regulatory Region of the Porcine ATGL Gene

    Institute of Scientific and Technical Information of China (English)

    华绪川; 张立凡; 蒋晓玲; 翟继鹏; 徐宁迎; 张金枝

    2011-01-01

    As a key enzyme in the initial step of triglyceride hydrolysis, adipose triglyceride lipase (ATGL) plays a critical role in the mobilization of fat stored in adipose tissue, mainly catalyzing the hydrolysis of triglycerides to diglycerides. In this study, a 1.2 kb fragment of the 5' regulatory region of the porcine ATGL gene was screened for SNPs and analyzed bioinformatically in five pig breeds: Jinhua, Chalu black, Duroc, Large Yorkshire and Pietrain. Two completely linked mutations, g-845G→C and g-854T→C, were found in the region. Sequence analysis indicated that the region may contain a promoter and that both mutations could create or abolish potential transcription factor binding sites. The g-845G→C locus was genotyped in the five breeds by PCR-RFLP, and a chi-square test showed that the distribution of the three genotypes differed highly significantly among breeds (P<0.01), suggesting that the g-845G→C mutation in the 5' regulatory region of ATGL may contribute to the differences in fat traits among pig breeds.

  9. Identifying Innovative Interventions to Promote Healthy Eating Using Consumption-Oriented Food Supply Chain Analysis

    OpenAIRE

    Hawkes, Corinna, ed.

    2009-01-01

    The mapping and analysis of supply chains is a technique increasingly used to address problems in the food system. Yet such supply chain management has not yet been applied as a means of encouraging healthier diets. Moreover, most policies recommended to promote healthy eating focus on the consumer end of the chain. This article proposes a consumption-oriented food supply chain analysis to identify the changes needed in the food supply chain to create a healthier food environment, measured in...

  10. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Bioinformatics researchers are increasingly confronted with the analysis of ultra large-scale data sets, a problem that will only grow in coming years. Recent developments in open source software, namely the Hadoop project and associated software, provide a foundation for scaling to petabyte-scale data warehouses on Linux clusters and for fault-tolerant, parallelized analysis of such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
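
    The MapReduce programming style mentioned in this abstract can be illustrated with a minimal, single-process sketch. It is not Hadoop code and is not taken from the record; the function names (map_kmers, shuffle, reduce_counts, count_kmers), the toy reads, and the choice of k-mer counting as the task are all assumptions made for illustration. In a real Hadoop job the framework, not the user, would perform the shuffle and distribute the map and reduce calls across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-length window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for a single k-mer."""
    return key, sum(values)


def count_kmers(reads, k=3):
    """Run map, shuffle and reduce in sequence on a small in-memory list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return dict(reduce_counts(key, vals) for key, vals in shuffle(mapped).items())


# Hypothetical toy reads, standing in for next-generation sequencing data.
reads = ["ACGTAC", "GTACGT"]
counts = count_kmers(reads)
print(counts)
```

    Because each map call sees only one read and each reduce call sees only one key, both steps can in principle run in parallel on separate machines, which is the property the abstract attributes to Hadoop-based analysis.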

  11. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High-throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. Having the human sequence and personalized digital images in hand has also created tremendous demand for powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will have a profound effect on how biomedical research will be conducted toward the improvement of human health and the prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd), in collaboration with the International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine through today's efforts in promoting research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics. Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  12. Identifying Effective Spelling Interventions Using a Brief Experimental Analysis and Extended Analysis

    Science.gov (United States)

    McCurdy, Merilee; Clure, Lynne F.; Bleck, Amanda A.; Schmitz, Stephanie L.

    2016-01-01

    Spelling is an important skill that is crucial to effective written communication. In this study, brief experimental analysis procedures were used to examine spelling instruction strategies (e.g., whole word correction; word study strategy; positive practice; and cover, copy, and compare) for four students. In addition, an extended analysis was…

  13. Identifying Skill Requirements for GIS Positions: A Content Analysis of Job Advertisements

    Science.gov (United States)

    Hong, Jung Eun

    2016-01-01

    This study identifies the skill requirements for geographic information system (GIS) positions, including GIS analysts, programmers/developers/engineers, specialists, and technicians, through a content analysis of 946 GIS job advertisements from 2007-2014. The results indicated that GIS job applicants need to possess high levels of GIS analysis…

  14. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis

    NARCIS (Netherlands)

    B.F. Voight (Benjamin); L.J. Scott (Laura); V. Steinthorsdottir (Valgerdur); A.D. Morris (Andrew); C. Dina (Christian); R.P. Welch (Ryan); E. Zeggini (Eleftheria); C. Huth (Cornelia); Y.S. Aulchenko (Yurii); G. Thorleifsson (Gudmar); L.J. McCulloch (Laura); T. Ferreira (Teresa); H. Grallert (Harald); N. Amin (Najaf); G. Wu (Guanming); C.J. Willer (Cristen); S. Raychaudhuri (Soumya); S.A. McCarroll (Steven); C. Langenberg (Claudia); O.M. Hofmann (Oliver); J. Dupuis (Josée); L. Qi (Lu); A.V. Segrè (Ayellet); M. van Hoek (Mandy); P. Navarro (Pau); K.G. Ardlie (Kristin); B. Balkau (Beverley); R. Benediktsson (Rafn); A.J. Bennett (Amanda); R. Blagieva (Roza); E. Boerwinkle (Eric); L.L. Bonnycastle (Lori); K.B. Boström (Kristina Bengtsson); B. Bravenboer (Bert); S. Bumpstead (Suzannah); N.P. Burtt (Noël); G. Charpentier (Guillaume); P.S. Chines (Peter); M. Cornelis (Marilyn); D.J. Couper (David); G. Crawford (Gabe); A.S.F. Doney (Alex); K.S. Elliott (Katherine); M.R. Erdos (Michael); C.S. Fox (Caroline); C.S. Franklin (Christopher); M. Ganser (Martha); C. Gieger (Christian); N. Grarup (Niels); T. Green (Todd); S. Griffin (Simon); C.J. Groves (Christopher); C. Guiducci (Candace); S. Hadjadj (Samy); N. Hassanali (Neelam); C. Herder (Christian); B. Isomaa (Bo); A.U. Jackson (Anne); P.R.V. Johnson (Paul); T. Jørgensen (Torben); W.H.L. Kao (Wen); N. Klopp (Norman); A. Kong (Augustine); P. Kraft (Peter); J. Kuusisto (Johanna); T. Lauritzen (Torsten); M. Li (Man); A. Lieverse (Aloysius); C.M. Lindgren (Cecilia); V. Lyssenko (Valeriya); M. Marre (Michel); T. Meitinger (Thomas); K. Midthjell (Kristian); M.A. Morken (Mario); N. Narisu (Narisu); P. Nilsson (Peter); K.R. Owen (Katharine); F. Payne (Felicity); J.R.B. Perry (John); A.K. Petersen; C. Platou (Carl); C. Proença (Christine); I. Prokopenko (Inga); W. Rathmann (Wolfgang); N.W. Rayner (Nigel William); N.R. Robertson (Neil); G. Rocheleau (Ghislain); M. Roden (Michael); M.J. Sampson (Michael); R. Saxena (Richa); B.M. 
Shields (Beverley); P. Shrader (Peter); G. Sigurdsson (Gunnar); T. Sparsø (Thomas); K. Strassburger (Klaus); H.M. Stringham (Heather); Q. Sun (Qi); A.J. Swift (Amy); B. Thorand (Barbara); J. Tichet (Jean); T. Tuomi (Tiinamaija); R.M. van Dam (Rob); T.W. van Haeften (Timon); T.W. van Herpt (Thijs); J.V. van Vliet-Ostaptchouk (Jana); G.B. Walters (Bragi); M.N. Weedon (Michael); C. Wijmenga (Cisca); J.C.M. Witteman (Jacqueline); R.N. Bergman (Richard); S. Cauchi (Stephane); F.S. Collins (Francis); A.L. Gloyn (Anna); U. Gyllensten (Ulf); T. Hansen (Torben); W.A. Hide (Winston); G.A. Hitman (Graham); A. Hofman (Albert); D. Hunter (David); K. Hveem (Kristian); M. Laakso (Markku); K.L. Mohlke (Karen); C.N.A. Palmer (Colin); P.P. Pramstaller (Peter Paul); I. Rudan (Igor); E.J.G. Sijbrands (Eric); L.D. Stein (Lincoln); J. Tuomilehto (Jaakko); A.G. Uitterlinden (André); M. Walker (Mark); N.J. Wareham (Nick); G.R. Abecasis (Gonçalo); B.O. Boehm (Bernhard); H. Campbell (Harry); M.J. Daly (Mark); A.T. Hattersley (Andrew); F.B. Hu (Frank); J.B. Meigs (James); J.S. Pankow (James); O. Pedersen (Oluf); H.E. Wichmann (Erich); I. Barroso (Inês); J.C. Florez (Jose); T.M. Frayling (Timothy); L. Groop (Leif); R. Sladek (Rob); U. Thorsteinsdottir (Unnur); J.F. Wilson (James); T. Illig (Thomas); P. Froguel (Philippe); P. Tikka-Kleemola (Päivi); J-A. Zwart (John-Anker); D. Altshuler (David); M. Boehnke (Michael); M.I. McCarthy (Mark); R.M. Watanabe (Richard)

    2010-01-01

    By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals w

  15. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis

    NARCIS (Netherlands)

    Voight, Benjamin F.; Scott, Laura J.; Steinthorsdottir, Valgerdur; Morris, Andrew P.; Dina, Christian; Welch, Ryan P.; Zeggini, Eleftheria; Huth, Cornelia; Aulchenko, Yurii S.; Thorleifsson, Gudmar; McCulloch, Laura J.; Ferreira, Teresa; Grallert, Harald; Amin, Najaf; Wu, Guanming; Willer, Cristen J.; Raychaudhuri, Soumya; McCarroll, Steve A.; Langenberg, Claudia; Hofmann, Oliver M.; Dupuis, Josee; Qi, Lu; Segre, Ayellet V.; van Hoek, Mandy; Navarro, Pau; Ardlie, Kristin; Balkau, Beverley; Benediktsson, Rafn; Bennett, Amanda J.; Blagieva, Roza; Boerwinkle, Eric; Bonnycastle, Lori L.; Bostrom, Kristina Bengtsson; Bravenboer, Bert; Bumpstead, Suzannah; Burtt, Noisel P.; Charpentier, Guillaume; Chines, Peter S.; Cornelis, Marilyn; Couper, David J.; Crawford, Gabe; Doney, Alex S. F.; Elliott, Katherine S.; Elliott, Amanda L.; Erdos, Michael R.; Fox, Caroline S.; Franklin, Christopher S.; Ganser, Martha; Gieger, Christian; Grarup, Niels; Green, Todd; Griffin, Simon; Groves, Christopher J.; Guiducci, Candace; Hadjadj, Samy; Hassanali, Neelam; Herder, Christian; Isomaa, Bo; Jackson, Anne U.; Johnson, Paul R. V.; Jorgensen, Torben; Kao, Wen H. L.; Klopp, Norman; Kong, Augustine; Kraft, Peter; Kuusisto, Johanna; Lauritzen, Torsten; Li, Man; Lieverse, Aloysius; Lindgren, Cecilia M.; Lyssenko, Valeriya; Marre, Michel; Meitinger, Thomas; Midthjell, Kristian; Morken, Mario A.; Narisu, Narisu; Nilsson, Peter; Owen, Katharine R.; Payne, Felicity; Perry, John R. B.; Petersen, Ann-Kristin; Platou, Carl; Proenca, Christine; Prokopenko, Inga; Rathmann, Wolfgang; Rayner, N. William; Robertson, Neil R.; Rocheleau, Ghislain; Roden, Michael; Sampson, Michael J.; Saxena, Richa; Shields, Beverley M.; Shrader, Peter; Sigurdsson, Gunnar; Sparso, Thomas; Strassburger, Klaus; Stringham, Heather M.; Sun, Qi; Swift, Amy J.; Thorand, Barbara; Tichet, Jean; Tuomi, Tiinamaija; van Dam, Rob M.; van Haeften, Timon W.; van Herpt, Thijs; van Vliet-Ostaptchouk, Jana V.; Walters, G. 
Bragi; Weedon, Michael N.; Wijmenga, Cisca; Witteman, Jacqueline; Bergman, Richard N.; Cauchi, Stephane; Collins, Francis S.; Gloyn, Anna L.; Gyllensten, Ulf; Hansen, Torben; Hide, Winston A.; Hitman, Graham A.; Hofman, Albert; Hunter, David J.; Hveem, Kristian; Laakso, Markku; Mohlke, Karen L.; Morris, Andrew D.; Palmer, Colin N. A.; Pramstaller, Peter P.; Rudan, Igor; Sijbrands, Eric; Stein, Lincoln D.; Tuomilehto, Jaakko; Uitterlinden, Andre; Walker, Mark; Wareham, Nicholas J.; Watanabe, Richard M.; Abecasis, Goncalo R.; Boehm, Bernhard O.; Campbell, Harry; Daly, Mark J.; Hattersley, Andrew T.; Hu, Frank B.; Meigs, James B.; Pankow, James S.; Pedersen, Oluf; Wichmann, H-Erich; Barroso, Ines; Florez, Jose C.; Frayling, Timothy M.; Groop, Leif; Sladek, Rob; Thorsteinsdottir, Unnur; Wilson, James F.; Illig, Thomas; Froguel, Philippe; van Duijn, Cornelia M.; Stefansson, Kari; Altshuler, David; Boehnke, Michael; McCarthy, Mark I.

    2010-01-01

    By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combined

  16. Genome-wide association study meta-analysis