WorldWideScience

Sample records for bioinformatics analysis identifies

  1. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter; Nielsen, Sofie V; Lindorff-Larsen, Kresten;

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology of...

  2. Analysis of regulatory protease sequences identified through bioinformatic data mining of the Schistosoma mansoni genome

    Directory of Open Access Journals (Sweden)

    Minchella Dennis J

    2009-10-01

    Abstract. Background: New chemotherapeutic agents against Schistosoma mansoni, an etiological agent of human schistosomiasis, are a priority due to emerging drug resistance and the inability of current drug treatments to prevent reinfection. Proteases have been under scrutiny as targets of immunological or chemotherapeutic anti-Schistosoma agents because of their vital role in many stages of the parasitic life cycle. Function has been established for only a handful of identified S. mansoni proteases, and the vast majority of these are the digestive proteases; very few of the conserved classes of regulatory proteases have been identified from Schistosoma species, despite their vital role in numerous cellular processes. To that end, we identified protease protein-coding genes from the S. mansoni genome project and EST library. Results: We identified 255 protease sequences from five catalytic classes using predicted proteins of the S. mansoni genome. The vast majority of these show significant similarity to proteins in KEGG and the Conserved Domain Database. Proteases include calpains, caspases, cytosolic and mitochondrial signal peptidases, proteases that interact with ubiquitin and ubiquitin-like molecules, and proteases that perform regulated intramembrane proteolysis. Comparative analysis of classes of important regulatory proteases finds conserved active site domains and, where appropriate, signal peptides and transmembrane helices. Phylogenetic analysis provides support for inferring functional divergence among regulatory aspartic, cysteine, and serine proteases. Conclusion: Numerous proteases are identified for the first time in S. mansoni. We characterized important regulatory proteases and focused analysis on these proteases to complement the growing knowledge base of digestive proteases. This work provides a foundation for expanding knowledge of proteases in Schistosoma species and examining their diverse function and potential as targets

  3. Analysis of regulatory protease sequences identified through bioinformatic data mining of the Schistosoma mansoni genome

    OpenAIRE

    Minchella Dennis J; Mayfield Chris; Bos David H

    2009-01-01

    Abstract. Background: New chemotherapeutic agents against Schistosoma mansoni, an etiological agent of human schistosomiasis, are a priority due to emerging drug resistance and the inability of current drug treatments to prevent reinfection. Proteases have been under scrutiny as targets of immunological or chemotherapeutic anti-Schistosoma agents because of their vital role in many stages of the parasitic life cycle. Function has been established for only a handful of identified S. mansoni ...

  4. Hypothetical granulin-like molecule from Fasciola hepatica identified by bioinformatics analysis.

    Science.gov (United States)

    Machicado, Claudia; Marcos, Luis A; Zimic, Mirko

    2016-01-01

    Fasciola hepatica is considered an emergent human pathogen, causing liver fibrosis or cirrhosis, conditions that are known to be direct causes of cancer. Some parasites, such as Opisthorchis viverrini, a relative of F. hepatica, have been categorized by the WHO as carcinogenic agents. Although these two parasites are from the same class (Trematoda), the role of F. hepatica in carcinogenesis is unclear. We hypothesized that F. hepatica might share some features with O. viverrini and be responsible for inducing proliferation of host cells. We analyzed the recently released genome of F. hepatica, looking for a gene coding for a granulin-like growth factor, a protein secreted by O. viverrini (Ov-GRN-1) that is a potent stimulator of host cell proliferation. Using computational biology tools, we identified a granulin-like molecule in F. hepatica, here termed FhGLM, which has a high level of sequence identity to Ov-GRN-1 and human progranulin. We found evidence of an upstream promoter compatible with the expression of FhGLM. The FhGLM architecture was found to have five granulin domains, one of which, domain 3, was homologous to Ov-GRN-1 and human GRNC. The structure of FhGLM granulin domain 3 was found to have the overall fold of its homologue, human GRNC. Our findings show the presence of a homologue of a potent modulator of cell growth in F. hepatica that might have, like other granulins, a proliferative action on host cells during fascioliasis. Future experimental assays to demonstrate the presence of FhGLM in F. hepatica are needed to confirm our hypothesis. PMID:27386259

  5. Putative lipoproteins identified by bioinformatic genome analysis of Leifsonia xyli ssp. xyli, the causative agent of sugarcane ratoon stunting disease.

    Science.gov (United States)

    Sutcliffe, Iain C; Hutchings, Matthew I

    2007-01-01

    SUMMARY Leifsonia xyli ssp. xyli is the causative agent of ratoon stunting disease, a major cause of economic loss in sugarcane crops. Understanding of the biology of this pathogen has been hampered by its fastidious growth characteristics in vitro. However, the recent release of a genome sequence for this organism has allowed significant novel insights. Further to this, we have performed a bioinformatic analysis of the lipoproteins encoded in the L. xyli genome. These analyses suggest that lipoproteins represent c. 2.0% of the L. xyli predicted proteome. Functional analyses suggest that lipoproteins make an important contribution to the physiology of the pathogen and may influence its ability to cause disease in planta. PMID:20507484

  6. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Bernd Timmermann

    BACKGROUND: Colorectal cancer (CRC) is, with approximately 1 million cases, the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope of improving early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large amount of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent unaffected normal colonic tissue from microsatellite stable (MSS) and microsatellite instable (MSI) colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. According to predictions based on MSS and MSI pathomechanisms, we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS, and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the rate of most significant mutations to 359 for MSI and 45 for MSS CRCs with predicted altered protein functions. In both CRCs, MSI and MSS, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A, BMPR1A, a gene where so far germline mutations are associated with juvenile polyposis syndrome, and show that the mutations functionally impair the protein function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.

  7. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    ABSTRACT: The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can be anything between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology and efficient genome mining, has been emerging as the long-awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives. The absence of food-specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery. In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, thereby achieving a higher success rate.

  8. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged as a...

  9. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    Energy Technology Data Exchange (ETDEWEB)

    Mefford, Megan E., E-mail: megan_mefford@hms.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Kunstman, Kevin, E-mail: kunstman@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Wolinsky, Steven M., E-mail: s-wolinsky@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Gabuzda, Dana, E-mail: dana_gabuzda@dfci.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Department of Neurology (Microbiology and Immunobiology), Harvard Medical School, Boston, MA (United States)

    2015-07-15

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.

  10. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    International Nuclear Information System (INIS)

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions

  11. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream to the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci estimated the time of the most recent common ancestor of human/civet SARS-related coronavirus to be 1999-2002, with an estimated substitution rate of 4 × 10⁻⁴ to 2 × 10⁻² substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG-suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.
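
The G + C content quoted in this record is a simple base-composition statistic. As a minimal sketch (the function name `gc_content` and the toy fragment are illustrative assumptions, not taken from the cited article), it could be computed as follows:

```python
def gc_content(seq: str) -> float:
    """Return the G + C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the result.
    counted = [base for base in seq if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Made-up RNA fragment for illustration only; not a real coronavirus sequence.
    toy_fragment = "AUGGCUAAUUUGCAUAAGCUUAA"
    print(f"{gc_content(toy_fragment):.1f}% G + C")
```

Applied to a complete genome sequence, the same function would give the kind of genome-wide percentage the abstract reports.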

  12. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    OpenAIRE

    Nora Khaldi

    2012-01-01

    ABSTRACT: The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can be anything between 5 and 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational b...

  13. KDE Bioscience: platform for bioinformatics analysis workflows.

    Science.gov (United States)

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards, combined with unfriendly interfaces, makes it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a Java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or third-party viewers are built in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research. PMID:16260186

  14. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  15. Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

    Directory of Open Access Journals (Sweden)

    Luoping Zhang

    2012-07-01

    We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other.
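
The Tanimoto coefficient used in this record for structural similarity is the ratio of shared to total fingerprint features of two molecules. The sketch below assumes each fingerprint is represented as the set of its "on" bit positions; the bit positions shown are invented for illustration and are not PubChem data.

```python
from typing import AbstractSet


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of 'on' fingerprint bits."""
    union = a | b
    if not union:
        # Convention chosen here (an assumption): two empty fingerprints score 0.
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical fingerprints, each given as the positions of its set bits.
    chemical_x = {1, 4, 9, 17, 23}
    chemical_y = {1, 4, 9, 30}
    print(f"Tanimoto similarity: {tanimoto(chemical_x, chemical_y):.2f}")
```

A value of 1 means identical fingerprints and 0 means no shared features; the clustering comparison described in the abstract would rest on such pairwise values.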

  16. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    Next generation sequencing (NGS) has revolutionized the field of genomics and its wide range of applications has resulted in the genome-wide analysis of hundreds of species and the development of thousands of computational tools. This thesis represents my work on NGS analysis of four species, Lotus...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts: genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... agricultural and biological importance. Its capacity to form symbiotic relationships with rhizobia and mycorrhizal fungi has fascinated researchers for years. Lotus has a small genome of approximately 470 Mb and a short life cycle of 2 to 3 months, which has made Lotus a model legume plant for many molecular...

  17. Bioinformatics Analysis of Estrogen-Responsive Genes.

    Science.gov (United States)

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125
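
The chapter itself describes methods that need no programming, but the logic of combining the two data types can be sketched briefly. The example below assumes that a gene is called a direct target when it is both estrogen-responsive in the transcriptomic data and bound by the receptor in the ChIP-seq data, and indirect when it is responsive but not bound; the function name and the placeholder gene labels are hypothetical.

```python
from typing import Dict, Iterable, List


def classify_targets(responsive_genes: Iterable[str],
                     bound_genes: Iterable[str]) -> Dict[str, List[str]]:
    """Split genes into direct, indirect and bound-only groups using set operations."""
    responsive = set(responsive_genes)  # differentially expressed after estrogen treatment
    bound = set(bound_genes)            # genes with a nearby receptor ChIP-seq peak
    return {
        "direct": sorted(responsive & bound),      # responsive and bound
        "indirect": sorted(responsive - bound),    # responsive but not bound
        "bound_only": sorted(bound - responsive),  # bound without an expression change
    }


if __name__ == "__main__":
    # Placeholder gene labels, not real results.
    groups = classify_targets(["GENE_A", "GENE_B", "GENE_C"], ["GENE_B", "GENE_C", "GENE_D"])
    for label, genes in groups.items():
        print(label, genes)
```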

  18. Bioinformatics analysis of metastasis-related proteins in hepatocellular carcinoma

    Institute of Scientific and Technical Information of China (English)

    Pei-Ming Song; Yang Zhang; Yu-Fei He; Hui-Min Bao; Jian-Hua Luo; Yin-Kun Liu; Peng-Yuan Yang; Xian Chen

    2008-01-01

    AIM: To analyze the metastasis-related proteins in hepatocellular carcinoma (HCC) and discover the biomarker candidates for diagnosis and therapeutic intervention of HCC metastasis with bioinformatics tools. METHODS: Metastasis-related proteins were determined by stable isotope labeling and MS analysis and analyzed with bioinformatics resources, including Phobius, Kyoto Encyclopedia of Genes and Genomes (KEGG), Online Mendelian Inheritance in Man (OMIM) and Human Protein Reference Database (HPRD). RESULTS: All the metastasis-related proteins were linked to 83 pathways in KEGG, including MAPK and p53 signal pathways. Protein-protein interaction network showed that all the metastasis-related proteins were categorized into 19 function groups, including cell cycle, apoptosis and signal transduction. OMIM analysis linked these proteins to 186 OMIM entries. CONCLUSION: Metastasis-related proteins provide HCC cells with biological advantages in cell proliferation, migration and angiogenesis, and facilitate metastasis of HCC cells. The bird's eye view can reveal a global characteristic of metastasis-related proteins and many differentially expressed proteins can be identified as candidates for diagnosis and treatment of HCC.

  19. [Expression of bioinformatically identified genes in skin of psoriasis patients].

    Science.gov (United States)

    2013-10-01

    Gene expression analysis for EPHA2 (EPH receptor A2), EPHB2 (EPH receptor B2), S100A9 (S100 calcium binding protein A9), PBEF (nicotinamide phosphoribosyltransferase), LILRB2 (leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 2), PLAUR (plasminogen activator, urokinase receptor), LTB (lymphotoxin beta (TNF superfamily, member 3)), and WNT5A (wingless-type MMTV integration site family, member 5A) was conducted using real-time polymerase chain reaction in psoriasis-affected versus visually intact skin specimens from 18 patients. The expression of the nine examined genes was upregulated in psoriasis-affected skin compared with visually intact skin specimens. The highest expression was observed for the S100A9, S100A8, PBEF, WNT5A, and EPHB2 genes. S100A9 and S100A8 gene expression in psoriasis-affected skin was 100-fold higher than in visually intact skin, while PBEF, WNT5A, and EPHB2 gene expression was upregulated more than five-fold. We suggest that the high expression of these genes might be associated with the state of the pathological process in psoriasis. Moreover, the transcriptional activity of these genes might serve as a molecular indicator of the efficacy of treatment in psoriasis. PMID:25508677

  20. Bioinformatics analysis of disordered proteins in prokaryotes

    Directory of Open Access Journals (Sweden)

    Malkov Saša N

    2011-03-01

    Abstract. Background: A significant number of proteins have been shown to be intrinsically disordered, meaning that they lack a fixed 3D structure or contain regions that do not possess a well-defined 3D structure. It has also been proven that a protein's disorder content is related to its function. We have performed an exhaustive analysis and comparison of the disorder content of proteins from prokaryotic organisms (i.e., superkingdoms Archaea and Bacteria) with respect to the functional categories they belong to, i.e., Clusters of Orthologous Groups of proteins (COGs) and groups of COGs: Cellular processes (Cp), Information storage and processing (Isp), Metabolism (Me) and Poorly characterized (Pc). We also analyzed the disorder content of proteins with respect to various genomic, metabolic and ecological characteristics of the organism they belong to. We used correlations and association rule mining in order to identify the most confident associations between specific modalities of the characteristics considered and disorder content. Results: Bacteria are shown to have a somewhat higher level of protein disorder than archaea, except for proteins in the Me functional group. It is demonstrated that the Isp and Cp functional groups in particular (L-repair function and N-cell motility and secretion COGs of proteins in specific) possess the highest disorder content, while Me proteins, in general, possess the lowest. Disorder fractions have been confirmed to have the lowest level for the so-called order-promoting amino acids and the highest level for the so-called disorder promoters. For each pair of organism characteristics, specific modalities are identified with the maximum disorder proteins in the corresponding organisms, e.g., high genome size-high GC content organisms, facultative anaerobic-low GC content organisms, aerobic-high genome size organisms, etc. Maximum disorder in archaea is observed for high GC content-low genome size organisms, high GC content

  1. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics

    DEFF Research Database (Denmark)

    Adachi, Jun; Kumar, Chanchal; Zhang, Yanling;

    2007-01-01

    , mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized...

  2. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  3. Whale song analyses using bioinformatics sequence analysis approaches

    Science.gov (United States)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools (DNA/protein sequence alignment and alignment-free methods) are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure, and the standardized Euclidean distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.]
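
The distance metrics named in this record can be sketched for themes encoded as strings of unit symbols. The example below is a simplified illustration under stated assumptions: the scoring parameters (match 2, mismatch -1, linear gap -1) and the word length k = 2 are arbitrary choices, the Smith-Waterman function returns only the single best local score and does not implement the "repeated procedure" mentioned in the abstract, and the Euclidean distance is the plain rather than the standardized form. The theme strings are invented.

```python
from collections import Counter
from math import acos, sqrt


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two symbol sequences (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def word_counts(seq, k=2):
    """Count overlapping words (k-mers) of length k in a sequence."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def euclidean_distance(a, b, k=2):
    """Plain Euclidean distance between word-count vectors (not standardized)."""
    ca, cb = word_counts(a, k), word_counts(b, k)
    return sqrt(sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb)))


def angle_distance(a, b, k=2):
    """Angle in radians between the word-count vectors of two sequences."""
    ca, cb = word_counts(a, k), word_counts(b, k)
    dot = sum(ca[w] * cb[w] for w in set(ca) & set(cb))
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    if norm == 0:
        raise ValueError("both sequences must contain at least one word of length k")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    return acos(max(-1.0, min(1.0, dot / norm)))


if __name__ == "__main__":
    # Invented themes; each letter stands for one discrete song unit.
    theme_1, theme_2 = "ABABCCD", "ABABCDD"
    print("Smith-Waterman score:", smith_waterman_score(theme_1, theme_2))
    print("Euclidean distance:", euclidean_distance(theme_1, theme_2))
    print("Angle (radians):", angle_distance(theme_1, theme_2))
```

A matrix of such pairwise values over all themes is the kind of input a hierarchical clustering step would use to draw the dendrograms the abstract describes.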

  4. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data. PMID:25863787

  5. CAPweb: a bioinformatics CGH array Analysis Platform

    OpenAIRE

    Liva, Stéphane; Hupé, Philippe; Neuvial, Pierre; Brito, Isabel; Viara, Eric; La Rosa, Philippe; Barillot, Emmanuel

    2006-01-01

    Assessing variations in DNA copy number is crucial for understanding constitutional or somatic diseases, particularly cancers. The recently developed array-CGH (comparative genomic hybridization) technology allows this to be investigated at the genomic level. We report the availability of a web tool for analysing array-CGH data. CAPweb (CGH array Analysis Platform on the Web) is intended as a user-friendly tool enabling biologists to completely analyse CGH arrays from the raw data to the visu...

  6. Biochip microsystem for bioinformatics recognition and analysis

    Science.gov (United States)

    Lue, Jaw-Chyng (Inventor); Fang, Wai-Chi (Inventor)

    2011-01-01

    A system with applications in pattern recognition, or classification, of DNA assay samples. Because DNA reference and sample material in wells of an assay may be caused to fluoresce depending upon dye added to the material, the resulting light may be imaged onto an embodiment comprising an array of photodetectors and an adaptive neural network, with applications to DNA analysis. Other embodiments are described and claimed.

  7. Bioinformatics Analysis of Zinc Transporter from Baoding Alfalfa

    Institute of Scientific and Technical Information of China (English)

    Haibo WANG; Junyun GUO

    2012-01-01

    [Objective] This study aimed to perform a bioinformatics analysis of the zinc transporter (ZnT) from Baoding alfalfa. [Method] Based on the amino acid sequence, the physical and chemical properties, hydrophilicity/hydrophobicity, and secondary structure of ZnT from Baoding alfalfa were predicted using a series of bioinformatics tools. The transmembrane domains were predicted using different online tools. [Result] ZnT is a hydrophobic protein containing 408 amino acids with a theoretical pI of 5.94, and it has 7 potential transmembrane hydrophobic regions. In the secondary structure, α-helix (Hh) accounted for 48.04%, extended strand (Ee) for 9.56%, and random coil (Cc) for 42.40%, which accords with the characteristics of a transmembrane protein. [Conclusion] mZnT is a member of the CDF family, responsible for transporting Zn^2+ out of the cell membrane to reduce the concentration and toxicity of Zn^2+.

  8. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Directory of Open Access Journals (Sweden)

    Xiao Liu

    Full Text Available Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  9. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    Science.gov (United States)

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
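
    The two sequence signatures described in this abstract (Ser followed by three to five Pro, and Tyr-X-Tyr) can be searched for with regular expressions. The sketch below is an illustration only: the function name and the returned dictionary are hypothetical, and the study's actual identification and classification criteria are not given in the abstract.

```python
import re

# Ser followed by three to five Pro, as described in the abstract.
SP_REPEAT = re.compile(r"SP{3,5}")
# Tyr-X-Tyr, where X is any residue; the lookahead lets overlapping
# occurrences such as Y-X-Y-X-Y be counted twice.
YXY_MOTIF = re.compile(r"(?=Y.Y)")


def ext_motif_counts(protein):
    """Count candidate extensin motifs in a one-letter amino acid sequence."""
    seq = protein.upper()
    return {
        "sp_repeats": len(SP_REPEAT.findall(seq)),
        "yxy_motifs": len(YXY_MOTIF.findall(seq)),
    }
```

    Counts like these could serve as a first-pass filter over a proteome; deciding how many repeats qualify a protein as an EXT would need thresholds that the abstract does not state.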

  10. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    Directory of Open Access Journals (Sweden)

    Deodutta Roy

    2015-10-01

    Full Text Available We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure of environmental chemicals with estrogenic activity; epidemiologic association between endocrine disrupting chemical (EDC) and health effects, such as breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected classes of environmental chemicals, polychlorinated biphenyls (PCBs), bisphenols (BPs), and phthalates. Meta-analysis showed that PCBs exposure, not Bisphenol A (BPA) and phthalates, increased the summary odds ratio for breast cancer and endometriosis. Bioinformatics analysis of gene-EDC interactions and disease associations identified several hundred genes that were altered by exposure to PCBs, phthalate or BPA. EDCs-modified genes in breast neoplasms and endometriosis are part of steroid hormone signaling and inflammation pathways. All three EDCs (PCB 153, phthalates, and BPA) influenced five common genes (CYP19A1, EGFR, ESR2, FOS, and IGF1) in breast cancer as well as in endometriosis. These genes are environmentally and estrogen responsive, altered in human breast and uterine tumors and endometriosis lesions, and part of Mitogen Activated Protein Kinase (MAPK) signaling pathways in cancer. Our findings suggest that breast cancer and endometriosis share some common environmental and molecular risk factors.

  11. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous

  12. Proof of concept: A bioinformatic and serological screening method for identifying new peptide antigens for Chlamydia trachomatis related sequelae in women☆

    OpenAIRE

    Stansfield, Scott H.; Patel, Pooja; Debattista, Joseph; Armitage, Charles W.; Cunningham, Kelly; Timms, Peter; Allan, John; Mittal, Aruna; Huston, Wilhelmina M.

    2013-01-01

    This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof of concept approach which could be used to develop an epitope-based serological diagnostic for C. trachomatis related infertility in women. A bioinformatics analysis was conducted examining several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants ...

  13. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs) are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA) has come increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV) held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  14. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. PMID:26358759

  15. Detecting evolution of bioinformatics with a content and co-authorship analysis.

    Science.gov (United States)

    Song, Min; Yang, Christopher C; Tang, Xuning

    2013-12-01

    Bioinformatics is an interdisciplinary research field that applies advanced computational techniques to biological data. Bibliometrics analysis has recently been adopted to understand the knowledge structure of a research field by citation pattern. In this paper, we explore the knowledge structure of Bioinformatics from the perspective of a core open access Bioinformatics journal, BMC Bioinformatics, with trend analysis, the content and co-authorship network similarity, and principal component analysis. Publications in four core journals including Bioinformatics - Oxford Journal and four conferences in Bioinformatics were harvested from DBLP. After converting publications into TF-IDF term vectors, we calculate the content similarity, and we also calculate the social network similarity based on the co-authorship network by utilizing the overlap measure between two co-authorship networks. Key terms are extracted and analyzed with PCA, and visualization of the co-authorship network is conducted. The experimental results show that Bioinformatics is fast-growing, dynamic and diversified. The content analysis shows that there is an increasing overlap among Bioinformatics journals in terms of topics and more research groups participate in researching Bioinformatics according to the co-authorship network similarity. PMID:23710427
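
    The two similarity computations named in this abstract can be sketched in plain Python. Several details are assumptions rather than the paper's stated method: the TF-IDF weighting variant, cosine similarity as the content measure, and the overlap coefficient |A ∩ B| / min(|A|, |B|) as the "overlap measure". All function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse {term: weight} TF-IDF vectors.

    Uses relative term frequency times log(N / document frequency); this is
    one common variant, chosen here for illustration.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is zero)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def overlap(a, b):
    """Overlap coefficient between two collections, e.g. co-author edge sets."""
    a, b = set(a), set(b)
    return len(a & b) / min(len(a), len(b)) if a and b else 0.0
```

    For co-authorship networks, each set passed to `overlap` could hold author pairs (edges) or author names (nodes); which of the two the study used is not stated in the abstract.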

  16. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    OpenAIRE

    Alva, V.; Nam, S.; Söding, J.; Lupas, A.

    2016-01-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been av...

  17. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    OpenAIRE

    Deodutta Roy; Marisa Morgan; Changwon Yoo; Alok Deoraj; Sandhya Roy; Vijay Kumar Yadav; Mohannad Garoub; Hamza Assaggaf; Mayur Doke

    2015-01-01

    We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure of environmental chemicals with estrogenic activity; epidemiologic association between endocrine disrupting chemical (EDC) and health effects, such as, breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected class of environmental chemicals, polychlorinated biphenyls (PC...

  18. Proof of concept: A bioinformatic and serological screening method for identifying new peptide antigens for Chlamydia trachomatis related sequelae in women.

    Science.gov (United States)

    Stansfield, Scott H; Patel, Pooja; Debattista, Joseph; Armitage, Charles W; Cunningham, Kelly; Timms, Peter; Allan, John; Mittal, Aruna; Huston, Wilhelmina M

    2013-01-01

    This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof of concept approach which could be used to develop an epitope-based serological diagnostic for C. trachomatis related infertility in women. A bioinformatics analysis was conducted examining several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants (all female) were categorized into the following cohorts based on their infection and gynecological history; acute (single treated infection with C. trachomatis), multiple (more than one C. trachomatis infection, all treated), sequelae (PID or tubal infertility with a history of C. trachomatis infection), and infertile (no history of C. trachomatis infection and no detected tubal damage). The bioinformatics strategy identified several promising epitopes. Participants who reacted positively in the peptide 11 ELISA were found to have an increased likelihood of being in the sequelae cohort compared to the infertile cohort with an odds ratio of 16.3 (95% c.i. 1.65-160), with 95% specificity and 46% sensitivity (0.19-0.74). The peptide 11 ELISA has the potential to be further developed as a screening tool for use during the early IVF work up and provides proof of concept that there may be further peptide antigens which could be identified using bioinformatics and screening approaches. PMID:24600556

  19. Proof of concept: A bioinformatic and serological screening method for identifying new peptide antigens for Chlamydia trachomatis related sequelae in women☆

    Science.gov (United States)

    Stansfield, Scott H.; Patel, Pooja; Debattista, Joseph; Armitage, Charles W.; Cunningham, Kelly; Timms, Peter; Allan, John; Mittal, Aruna; Huston, Wilhelmina M.

    2013-01-01

    This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof of concept approach which could be used to develop an epitope-based serological diagnostic for C. trachomatis related infertility in women. A bioinformatics analysis was conducted examining several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants (all female) were categorized into the following cohorts based on their infection and gynecological history; acute (single treated infection with C. trachomatis), multiple (more than one C. trachomatis infection, all treated), sequelae (PID or tubal infertility with a history of C. trachomatis infection), and infertile (no history of C. trachomatis infection and no detected tubal damage). The bioinformatics strategy identified several promising epitopes. Participants who reacted positively in the peptide 11 ELISA were found to have an increased likelihood of being in the sequelae cohort compared to the infertile cohort with an odds ratio of 16.3 (95% c.i. 1.65–160), with 95% specificity and 46% sensitivity (0.19–0.74). The peptide 11 ELISA has the potential to be further developed as a screening tool for use during the early IVF work up and provides proof of concept that there may be further peptide antigens which could be identified using bioinformatics and screening approaches. PMID:24600556
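
    The summary statistics reported in this abstract (odds ratio with a 95% confidence interval, sensitivity, specificity) all derive from a 2x2 table of assay result against cohort. The sketch below shows one standard way to compute them; the function name is hypothetical, the Woolf (log) method for the interval is an assumption since the abstract does not say how its interval was obtained, and the counts in the usage example are invented for illustration and are not the study's data.

```python
import math


def diagnostic_summary(tp, fn, fp, tn, z=1.96):
    """Summarize a 2x2 diagnostic table.

    tp/fn: assay-positive/negative participants in the condition cohort.
    fp/tn: assay-positive/negative participants in the comparison cohort.
    All four counts must be non-zero for the log-based interval.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fp * fn)
    # Woolf's method: standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "odds_ratio": odds_ratio,
        "ci": (ci_low, ci_high),
    }


# Hypothetical counts, for illustration only.
summary = diagnostic_summary(tp=8, fn=12, fp=2, tn=18)
```

    With small cell counts the interval becomes very wide, which is consistent with the broad interval quoted in the abstract.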

  20. A complementary bioinformatics approach to identify potential plant cell wall glycosyltransferase-encoding genes.

    Science.gov (United States)

    Egelund, Jack; Skjøt, Michael; Geshi, Naomi; Ulvskov, Peter; Petersen, Bent Larsen

    2004-09-01

    Plant cell wall (CW) synthesizing enzymes can be divided into the glycan (i.e. cellulose and callose) synthases, which are multimembrane spanning proteins located at the plasma membrane, and the glycosyltransferases (GTs), which are Golgi localized single membrane spanning proteins, believed to participate in the synthesis of hemicellulose, pectin, mannans, and various glycoproteins. At the Carbohydrate-Active enZYmes (CAZy) database where e.g. glycoside hydrolases and GTs are classified into gene families primarily based on amino acid sequence similarities, 415 Arabidopsis GTs have been classified. Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes, all classified in the CAZy system, have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics approach was adopted. First, the entire Arabidopsis proteome was run through the Transmembrane Hidden Markov Model 2.0 server and proteins containing one or, more rarely, two transmembrane domains within the N-terminal 150 amino acids were collected. Second, these sequences were submitted to the SUPERFAMILY prediction server, and sequences that were predicted to belong to the superfamilies NDP-sugartransferase, UDP-glycosyltransferase/glycogen-phosphorylase, carbohydrate-binding domain, Gal-binding domain, or Rossmann fold were collected, yielding a total of 191 sequences. Fifty-two accessions already classified in CAZy were discarded. The resulting 139 sequences were then analyzed using the Three-Dimensional-Position-Specific Scoring Matrix and mGenTHREADER servers, and 27 sequences with similarity to either the GT-A or the GT-B fold were obtained. Proof of concept of the present approach has to some extent been provided by our recent demonstration that two members of this pool of 27 non-CAZy-classified putative GTs are xylosyltransferases involved in synthesis of pectin

  1. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  2. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Directory of Open Access Journals (Sweden)

    Michael Römer

    Full Text Available Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  3. Somatic populations of PGT135-137 HIV-1-neutralizing antibodies identified by 454 pyrosequencing and bioinformatics

    Directory of Open Access Journals (Sweden)

    Jiang Zhu

    2012-09-01

    Full Text Available Select HIV-1-infected individuals develop sera capable of neutralizing diverse viral strains. The molecular basis of this neutralization is currently being deciphered by the isolation of HIV-1-neutralizing antibodies. In one infected donor, three neutralizing antibodies, PGT135-137, were identified by assessment of neutralization from individually sorted B cells and found to recognize an epitope containing an N-linked glycan at residue 332 on HIV-1 gp120. Here we use deep sequencing and bioinformatics methods to interrogate the B cell record of this donor to gain a more complete understanding of the humoral immune response. PGT135-137-gene family-specific primers were used to amplify heavy and light chain-variable domain sequences. 454 pyrosequencing produced 141,298 heavy-chain sequences of IGHV4-39 origin and 87,229 light-chain sequences of IGKV3-15 origin. A number of heavy and light chain sequences of ~90% identity to PGT137, several to PGT136, and none of high identity to PGT135 were identified. After expansion of these sequences to include close phylogenetic relatives, a total of 202 heavy-chain sequences and 72 light-chain sequences were identified. These sequences were clustered into populations of 95% identity comprising 15 for heavy chain and 10 for light chain, and a select sequence from each population was synthesized and reconstituted with a PGT137-partner chain. Reconstituted antibodies showed varied neutralization phenotypes for HIV-1 clade A and D isolates. Sequence diversity of the antibody population represented by these tested sequences was notably higher than observed with a 454 pyrosequencing-control analysis on 10 antibodies of defined sequence, suggesting that this diversity results primarily from somatic maturation. 
Our results thus provide an example of how pathogens like HIV-1 are opposed by a varied humoral immune response, derived from intrinsic mechanisms of antibody development, and embodied by somatic populations
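
    The step of grouping sequences into populations at 95% identity can be illustrated with a simple greedy clustering. This is a sketch under stated assumptions, not the procedure used in the study: sequences are taken to be pre-aligned and of equal length, identity is the fraction of matching positions, each cluster is represented by its first member, and the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold=0.95):
    """Assign each sequence to the first cluster whose representative it matches.

    The representative is the first sequence placed in a cluster, so the
    result depends on input order.
    """
    clusters = []
    for seq in seqs:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            # No existing representative is close enough: start a new cluster.
            clusters.append([seq])
    return clusters
```

    One representative per cluster could then be selected for synthesis, analogous to the select sequence per population described in the abstract.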

  4. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    Science.gov (United States)

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. PMID:25183247

  5. Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis and beyond.

    Science.gov (United States)

    Lee, Sungmin; Min, Hyeyoung; Yoon, Sungroh

    2016-07-01

    A wide variety of large-scale data have been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses because of factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but time-consuming bioinformatics programs. For an informative review, we perform in-depth profiling and analysis of 23 key bioinformatics programs using multiple types of devices. Based on the insight we obtain from this research, we further discuss issues related to designing and optimizing bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover traditional and emerging areas of importance, such as alignment, assembly, mapping, expression analysis, variant calling and metagenomics. We explain how acceleration by parallelization can be combined with SSDs for improved performance and also how using SSDs can expedite important bioinformatics pipelines, such as variant calling by the Genome Analysis Toolkit and transcriptome analysis using RNA sequencing. We hope that this review can provide useful directions and tips to accompany future bioinformatics algorithm design procedures that properly consider new generations of powerful storage devices. PMID:26330577

  6. Immunoproteomic and bioinformatic approaches to identify secreted Leishmania amazonensis, L. braziliensis, and L. infantum proteins with specific reactivity using canine serum.

    Science.gov (United States)

    Lima, B S S; Fialho, L C; Pires, S F; Tafuri, W L; Andrade, H M

    2016-06-15

    Leishmania spp have a wide range of hosts, and each host can harbor several Leishmania species. Dogs, for example, are frequently infected by Leishmania infantum, where they constitute its main reservoir, but they also serve as hosts for L. braziliensis and L. amazonensis. Serological tests for antibody detection are valuable tools for diagnosis of L. infantum infection due to the high levels of antibodies induced, unlike what is observed in L. amazonensis and L. braziliensis infections. Likewise, serology-based antigen-detection can be useful as an approach to diagnose any Leishmania species infection using different corporal fluid samples. Immunogenic and secreted proteins constitute powerful targets for diagnostic methods in antigen detection. As such, we performed immunoproteomic (2-DE, western blot and mass spectrometry) and bioinformatic screening to search for reactive and secreted proteins from L. amazonensis, L. braziliensis, and L. infantum. Twenty-eight non-redundant proteins were identified, among which, six were reactive only in L. amazonensis extracts, 10 in L. braziliensis extracts, and seven in L. infantum extracts. After bioinformatic analysis, seven proteins were predicted to be secreted, two of which were reactive only in L. amazonensis extracts (52kDa PDI and the glucose-regulated protein 78), one in L. braziliensis extracts (pyruvate dehydrogenase E1 beta subunit) and three in L. infantum extracts (two conserved hypothetical proteins and elongation factor 1-beta). We propose that proteins can be suitable targets for diagnostic methods based on antigen detection. PMID:27198787

  7. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more recently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV), HAdV-C2 and HAdV-C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910) within the hexon gene, of which HAdV-C2 and HAdV-C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  8. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    Science.gov (United States)

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  9. Bioinformatic Analysis of Putative Gene Products Encoded in SARS-HCoV Genome

    Institute of Scientific and Technical Information of China (English)

    赵心刚; 韩敬东; 宁元亨; 孟安明; 陈晔光

    2003-01-01

    The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus named SARS-HCoV. Using bioinformatic methods, we have performed a detailed domain search. In addition to the viral structural proteins, we have found that several putative polypeptides share sequence similarity to known domains or proteins. This study may provide a basis for future studies on the infection and replication process of this notorious virus.

  10. Proteomic and bioinformatic analysis of epithelial tight junction reveals an unexpected cluster of synaptic molecules

    Directory of Open Access Journals (Sweden)

    Tang Vivian W

    2006-12-01

    Full Text Available Abstract Background Zonula occludens, also known as the tight junction, is a specialized cell-cell interaction characterized by membrane "kisses" between epithelial cells. A cytoplasmic plaque of ~100 nm corresponding to a meshwork of densely packed proteins underlies the tight junction membrane domain. Due to its enormous size and difficulties in obtaining a biochemically pure fraction, the molecular composition of the tight junction remains largely unknown. Results A novel biochemical purification protocol has been developed to isolate tight junction protein complexes from cultured human epithelial cells. After identification of proteins by mass spectroscopy and fingerprint analysis, candidate proteins are scored and assessed individually. A simple algorithm has been devised to incorporate transmembrane domains and protein modification sites for scoring membrane proteins. Using this new scoring system, a total of 912 proteins have been identified. These 912 hits are analyzed using a bioinformatics approach to bin the hits in 4 categories: configuration, molecular function, cellular function, and specialized process. Prominent clusters of proteins related to the cytoskeleton, cell adhesion, and vesicular traffic have been identified. Weaker clusters of proteins associated with cell growth, cell migration, translation, and transcription are also found. However, the strongest clusters belong to synaptic proteins and signaling molecules. Localization studies of key components of synaptic transmission have confirmed the presence of both presynaptic and postsynaptic proteins at the tight junction domain. To correlate proteomics data with structure, the tight junction has been examined using electron microscopy. This has revealed many novel structures including end-on cytoskeletal attachments, vesicles fusing/budding at the tight junction membrane domain, secreted substances encased between the tight junction kisses, endocytosis of tight junction

  11. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Frank Guzman

    Full Text Available BACKGROUND: microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. RESULTS: Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. CONCLUSIONS: This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  12. Bioinformatics analysis of differentially expressed proteins in prostate cancer based on proteomics data

    Directory of Open Access Journals (Sweden)

    Chen C

    2016-03-01

    Full Text Available Chen Chen,1 Li-Guo Zhang,1 Jian Liu,1 Hui Han,1 Ning Chen,1 An-Liang Yao,1 Shao-San Kang,1 Wei-Xing Gao,1 Hong Shen,2 Long-Jun Zhang,1 Ya-Peng Li,1 Feng-Hong Cao,1 Zhi-Guo Li3 1Department of Urology, North China University of Science and Technology Affiliated Hospital, 2Department of Modern Technology and Education Center, 3Department of Medical Research Center, International Science and Technology Cooperation Base of Geriatric Medicine, North China University of Science and Technology, Tangshan, People’s Republic of China Abstract: We mined the literature for proteomics data to examine the occurrence and metastasis of prostate cancer (PCa) through a bioinformatics analysis. We divided the differentially expressed proteins (DEPs) into two groups: the group consisting of PCa and benign tissues (P&b) and the group presenting both high and low PCa metastatic tendencies (H&L). In the P&b group, we found 320 DEPs, 20 of which were reported more than three times, and DES was the most commonly reported. Among these DEPs, the expression levels of FGG, GSN, SERPINC1, TPM1, and TUBB4B have not yet been correlated with PCa. In the H&L group, we identified 353 DEPs, 13 of which were reported more than three times. Among these DEPs, MDH2 and MYH9 have not yet been correlated with PCa metastasis. We further confirmed that DES was differentially expressed between 30 cancer and 30 benign tissues. In addition, DEPs associated with protein transport, regulation of actin cytoskeleton, and the extracellular matrix (ECM)–receptor interaction pathway were prevalent in the H&L group and have not yet been studied in detail in this context. Proteins related to homeostasis, the wound-healing response, focal adhesions, and the complement and coagulation pathways were overrepresented in both groups. Our findings suggest that the repeatedly reported DEPs in the two groups may function as potential biomarkers for detecting PCa and predicting its aggressiveness. Furthermore
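    The tallying described in the abstract above (finding DEPs that were "reported more than three times" across mined studies) might look like the following sketch. The data layout (one collection of protein identifiers per study) and the placeholder identifiers are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def repeatedly_reported(studies, min_reports=4):
    """Return proteins listed in at least `min_reports` studies.

    `studies` is an iterable of collections of protein identifiers, one per
    study. "More than three times" corresponds to min_reports=4.
    """
    counts = Counter()
    for proteins in studies:
        counts.update(set(proteins))  # count each protein once per study
    return {p: n for p, n in counts.items() if n >= min_reports}


if __name__ == "__main__":
    # Placeholder identifiers, not real study data.
    toy_studies = [
        {"P1", "P2"},
        {"P1", "P3"},
        {"P1", "P2"},
        {"P1"},
        {"P2", "P3"},
    ]
    print(repeatedly_reported(toy_studies))
```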

  13. R/parallel - speeding up bioinformatics analysis with R

    NARCIS (Netherlands)

    Vera, Gonzalo; Jansen, Ritsert C.; Suppi, Remo L.

    2008-01-01

    Background: R is the preferred tool for statistical analysis of many bioinformaticians due in part to the increasing number of freely available analytical methods. Such methods can be quickly reused and adapted to each particular experiment. However, in experiments where large amounts of data are ge

  14. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology⋆

    OpenAIRE

    Fu, Wenjiang J; Stromberg, Arnold J; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2010-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral ro...

  15. Bioinformatics and biomarker discovery "Omic" data analysis for personalized medicine

    CERN Document Server

    Azuaje, Francisco

    2010-01-01

    This book is designed to introduce biologists, clinicians and computational researchers to fundamental data analysis principles, techniques and tools for supporting the discovery of biomarkers and the implementation of diagnostic/prognostic systems. The focus of the book is on how fundamental statistical and data mining approaches can support biomarker discovery and evaluation, emphasising applications based on different types of "omic" data. The book also discusses design factors, requirements and techniques for disease screening, diagnostic and prognostic applications. Readers are provided w

  16. A new bioinformatics analysis tools framework at EMBL–EBI

    OpenAIRE

    Goujon, Mickael; McWilliam, Hamish; Li, Weizhong; Valentin, Franck; Squizzato, Silvano; Paern, Juri; Lopez, Rodrigo

    2010-01-01

    The EMBL-EBI provides access to various mainstream sequence analysis applications. These include sequence similarity search services such as BLAST, FASTA, InterProScan and multiple sequence alignment tools such as ClustalW, T-Coffee and MUSCLE. Through the sequence similarity search services, the users can search mainstream sequence databases such as EMBL-Bank and UniProt, and more than 2000 completed genomes and proteomes. We present here a new framework aimed at both novice as well as exper...

  17. CRISPRTarget: bioinformatic prediction and analysis of crRNA targets.

    Science.gov (United States)

    Biswas, Ambarish; Gagnon, Joshua N; Brouns, Stan J J; Fineran, Peter C; Brown, Chris M

    2013-05-01

    The bacterial and archaeal CRISPR/Cas adaptive immune system targets specific protospacer nucleotide sequences in invading organisms. This requires base pairing between processed CRISPR RNA and the target protospacer. For type I and II CRISPR/Cas systems, protospacer adjacent motifs (PAM) are essential for target recognition, and for type III, mismatches in the flanking sequences are important in the antiviral response. In this study, we examine the properties of each class of CRISPR. We use this information to provide a tool (CRISPRTarget) that predicts the most likely targets of CRISPR RNAs (http://bioanalysis.otago.ac.nz/CRISPRTarget). This can be used to discover targets in newly sequenced genomic or metagenomic data. To test its utility, we discover features and targets of well-characterized Streptococcus thermophilus and Sulfolobus solfataricus type II and III CRISPR/Cas systems. Finally, in Pectobacterium species, we identify new CRISPR targets and propose a model of temperate phage exposure and subsequent inhibition by the type I CRISPR/Cas systems. PMID:23492433
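    As a rough illustration of the matching idea in the abstract above (a spacer pairing with a protospacer, with flanking sequence inspected for a PAM), the sketch below scans both strands of a DNA sequence for near-matches to a spacer and reports the neighbouring bases. It is a toy under stated assumptions, not the CRISPRTarget scoring scheme: the function names, the mismatch limit and the flank length are all hypothetical, and which flank carries the PAM depends on the CRISPR type and on the orientation convention used.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_protospacers(spacer, target, max_mismatches=2, flank=3):
    """Scan both strands of `target` for windows matching `spacer` with at
    most `max_mismatches` substitutions (no gaps).

    Each hit records the strand, the start offset within the scanned strand
    (for "-" this is an offset in the reverse complement), the mismatch
    count, and the bases flanking the window on either side.
    """
    spacer = spacer.upper()
    target = target.upper()
    n = len(spacer)
    hits = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        for i in range(len(seq) - n + 1):
            window = seq[i:i + n]
            mismatches = sum(a != b for a, b in zip(spacer, window))
            if mismatches <= max_mismatches:
                hits.append({
                    "strand": strand,
                    "start": i,
                    "mismatches": mismatches,
                    "flank5": seq[max(0, i - flank):i],
                    "flank3": seq[i + n:i + n + flank],
                })
    return hits


if __name__ == "__main__":
    for hit in find_protospacers("ACGTTGCAAT", "GGGACGTTGCAATCCT", max_mismatches=0):
        print(hit)
```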

  18. Dysregulation of TFDP1 and of the cell cycle pathway in high-grade glioblastoma multiforme: a bioinformatic analysis.

    Science.gov (United States)

    Lu, X; Lv, X D; Ren, Y H; Yang, W D; Li, Z B; Zhang, L; Bai, X F

    2016-01-01

    Despite extensive research, the prognosis of high-grade glioblastoma multiforme (GBM) has improved only slightly because of the limited response to standard treatments. Recent advances (discoveries of molecular biomarkers) provide new opportunities for the treatment of GBM. The aim of the present study was to identify diagnostic biomarkers of high-grade GBM. First, we combined 3 microarray expression datasets to screen them for genes differentially expressed in patients with high-grade GBM relative to healthy subjects. Next, the target network was constructed via the empirical Bayesian coexpression approach, and centrality analysis and a molecular complex detection (MCODE) algorithm were performed to explore hub genes and functional modules. Finally, a validation test was conducted to verify the bioinformatic results. A total of 277 differentially expressed genes were identified according to the criteria P < 0.05 and |log2(fold change)| ≥ 1.5. These genes were most significantly enriched in the cell cycle pathway. Centrality analysis uncovered 9 hub genes; among them, TFDP1 showed the highest degree of connectivity (43) and is a known participant in the cell cycle pathway; this finding pointed to the important role of TFDP1 in the progression of high-grade GBM. Experimental validation mostly supported the bioinformatic results. According to our study results, the gene TFDP1 and the cell cycle pathway are strongly associated with high-grade GBM; this result may provide new insights into the pathogenesis of GBM. PMID:27323154
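    Two steps named in the abstract above lend themselves to a short sketch: the selection criteria (P < 0.05 and |log2(fold change)| ≥ 1.5) and the degree of connectivity used to pick hub genes. The code assumes fold change is given as a plain expression ratio and that the network is an undirected edge list with each edge listed once; the function names and toy node labels are hypothetical, and the empirical Bayesian coexpression and MCODE steps are not reproduced here.

```python
import math


def is_differentially_expressed(p_value, fold_change, p_cut=0.05, lfc_cut=1.5):
    """Apply the stated criteria: P < p_cut and |log2(fold change)| >= lfc_cut."""
    return p_value < p_cut and abs(math.log2(fold_change)) >= lfc_cut


def degree_centrality(edges):
    """Count, for each node, how many edges of an undirected network touch it."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


if __name__ == "__main__":
    print(is_differentially_expressed(0.01, 4.0))   # log2(4) = 2
    print(is_differentially_expressed(0.01, 2.0))   # log2(2) = 1
    toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
    degree = degree_centrality(toy_edges)
    print(max(degree, key=degree.get), degree)
```

    Note that a threshold of 1.5 on the log2 scale corresponds to a fold change of about 2.83 in either direction.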

  19. Bioinformatics analysis of rabbit haemorrhagic disease virus genome

    Directory of Open Access Journals (Sweden)

    Liu Ji-xing

    2011-11-01

    Full Text Available Abstract Background Rabbit haemorrhagic disease virus (RHDV), the pathogen of rabbit haemorrhagic disease, causes a highly infectious and often fatal disease that affects only wild and domestic rabbits. Recent research has revealed that RHDV, as a member of the Caliciviridae, has some peculiarities in its genome, its reproduction and so on. Results In this report, we first analyzed its genome and two open reading frames (ORFs) from the aspect of codon usage bias. Our research indicated that mutation pressure rather than natural selection is the most important determinant in RHDV with high codon bias, and that the codon usage bias is nearly opposite between ORF1 and ORF2, which may be one of the factors regulating the expression of VP60 (encoded by ORF1) and VP10 (encoded by ORF2). Furthermore, negative selective constraints on the RHDV whole genome implied that VP10 played an important role in the RHDV lifecycle. Conclusions We conjectured that VP10 might be beneficial for the replication, release or both of the virus by inducing the apoptosis of infected cells initiated by RHDV. According to the results of the principal component analysis of RSCU for ORF2, we separated 30 RHDV into two genotypes for the first time, and the ENC values indicated that ORF1 and ORF2 were independent during the evolution of RHDV.
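    The RSCU (relative synonymous codon usage) values mentioned in the abstract above are conventionally computed as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The sketch below applies that definition using the standard genetic code; the function name is hypothetical, the input is assumed to be an in-frame coding sequence, and this is not the authors' pipeline. Single-codon amino acids (Met, Trp) trivially get a value of 1.0 here, whereas many analyses exclude them.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for an in-frame coding sequence.

    RSCU(codon) = count(codon) * n_synonymous / total count for that amino
    acid. Stop codons and amino acids absent from the sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(
        cds[i:i + 3] for i in range(0, usable, 3) if cds[i:i + 3] in CODON_TABLE
    )
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        if aa == "*":
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    for codon, value in sorted(rscu("ATGGCTGCTGCTGCC").items()):
        print(codon, value)
```

    A value above 1 indicates a codon used more often than expected under equal usage of its synonyms, and a value below 1 indicates the opposite.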

  20. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology

    Directory of Open Access Journals (Sweden)

    Jonghoon Kang

    2015-08-01

    Full Text Available New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  1. Emerging bioinformatics approaches for analysis of NGS-derived coding and non-coding RNAs in neurodegenerative diseases

    Directory of Open Access Journals (Sweden)

    Alessandro Guffanti

    2014-03-01

    Full Text Available Neurodegenerative diseases such as late-onset Alzheimer’s disease (LOAD) involve a genetically complex and still obscure ensemble of causative and risk factors accompanied by complex feedback responses. The advent of ‘high-throughput’ transcriptome investigation technologies such as deep sequencing is increasingly being combined with statistical and bioinformatics analysis methods complemented by Bayesian Networks or network and graph analyses. Together, such 'integrative' studies are beginning to identify co-regulated gene networks linked with biological pathways and potentially modulating disease predisposition, outcome and progression. Specifically, bioinformatics analyses of integrated microarray and genotyping data in cases and controls reveal changes in gene expression of both protein-coding and small and long regulatory RNAs; highlight relevant quantitative transcriptional differences between LOAD and non-demented control brains and demonstrate reconfiguration of functionally meaningful molecular interaction structures in LOAD. These may be measured as changes in connectivity in ‘hub nodes’ of relevant gene networks. We illustrate here the open analytical questions in the transcriptome investigation of neurodegenerative disease studies, proposing 'ad-hoc' strategies for the evaluation of differential gene expression and hints for a simple analysis of the ncRNA part of such datasets. We then survey the emerging role of long non-coding RNA in the healthy and diseased brain transcriptome and describe the main current methods for computational modeling of gene networks. We propose accessible modular and pathway-oriented methods and guidelines for bioinformatics investigations of whole transcriptome NGS datasets. We finally present methods and databases for functional interpretations of long ncRNAs and propose a simple heuristic approach to visualize and represent physical and functional interactions of the coding and non

  2. Bioinformatics Tools and Novel Challenges in Long Non-Coding RNAs (lncRNAs) Functional Analysis

    Directory of Open Access Journals (Sweden)

    Andrea Masotti

    2011-12-01

    Full Text Available The advent of next generation sequencing revealed that a fraction of transcribed RNAs (short and long RNAs) is non-coding. Long non-coding RNAs (lncRNAs) have a crucial role in regulating gene expression and in epigenetics (chromatin and histone remodeling). LncRNAs may have different roles, which need not be mutually exclusive: gene activators (signaling), repressors (decoy), cis and trans gene expression regulators (guides) and chromatin modifiers (scaffolds). LncRNAs are also implicated in a number of diseases. The huge amount of inhomogeneous data produced so far poses several bioinformatics challenges spanning from simple annotation to the more complex functional annotation. In this review, we report and discuss several freely available bioinformatics resources dealing with the study of lncRNAs. To our knowledge, this is the first review summarizing all the available bioinformatics resources on lncRNAs that have appeared in the literature since the completion of the human genome project. Therefore, the aim of this review is to provide a short guide for biologists and bioinformaticians looking for dedicated resources, public repositories and other tools for lncRNA functional analysis.

  3. Introduction to Bioinformatics

    OpenAIRE

    Thampi, Sabu M.

    2009-01-01

    Bioinformatics is a new discipline that addresses the need to manage and interpret the data that in the past decade was massively generated by genomic research. This discipline represents the convergence of genomics, biotechnology and information technology, and encompasses analysis and interpretation of data, modeling of biological phenomena, and development of algorithms and statistics. This article presents an introduction to bioinformatics

  4. Bio-informatics Research Progress in the Post-genome Era Based on the Quantitative Analysis of SCIE

    Institute of Scientific and Technical Information of China (English)

    Yongqin; ZHAN; Min; YU

    2013-01-01

    SCIE paper output can reflect the status quo and trend of discipline research and 7 038 scientific articles concerning bioinformatics are retrieved in SCIE database during the years between 2008 and 2012. Quantitative analysis of paper output and citation frequency are conducted according to nations, institutions, publications, research direction as well as hot articles, which provides assistance for bioinformatics researchers to understand the present situation of this subject, carry out cooperative studies and display scientific research achievements.

  5. Bioinformatics Identification of Modules of Transcription Factor Binding Sites in Alzheimer's Disease-Related Genes by In Silico Promoter Analysis and Microarrays

    Directory of Open Access Journals (Sweden)

    Regina Augustin

    2011-01-01

    Full Text Available The molecular mechanisms and genetic risk factors underlying Alzheimer's disease (AD) pathogenesis are only partly understood. To identify new factors, which may contribute to AD, different approaches are taken including proteomics, genetics, and functional genomics. Here, we used a bioinformatics approach and found that distinct AD-related genes share modules of transcription factor binding sites, suggesting a transcriptional coregulation. To detect additional coregulated genes, which may potentially contribute to AD, we established a new bioinformatics workflow with known multivariate methods like support vector machines, biclustering, and predicted transcription factor binding site modules by using in silico analysis and over 400 expression arrays from human and mouse. Two significant modules are composed of three transcription factor families: CTCF, SP1F, and EGRF/ZBPF, which are conserved between human and mouse APP promoter sequences. The specific combination of in silico promoter and multivariate analysis can identify regulation mechanisms of genes involved in multifactorial diseases.

  6. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  7. Bioinformatics Analysis for Coding SNPs of the HLA-DQA1 Gene Involved in Susceptibility to Cervical Cancer

    Institute of Scientific and Technical Information of China (English)

    Yanyun Li; Jun Xing; Linsheng Zhao; Yanni Li; Yuchuan Wang; Weiming Zhang

    2006-01-01

    OBJECTIVE To analyze coding SNPs of the HLA-DQA1 gene involved in susceptibility to cervical cancer by a bioinformatics approach, and to choose some SNPs that may have an association with cervical cancer. METHODS Using the SNPper tool, we extracted SNPs from a public database (dbSNP), exporting them in FASTA format suitable for subsequent use. Then we used PARSESNP as a tool for the analysis of the cSNPs. RESULTS Among the cSNPs of the HLA-DQA1 gene, we found that rs9272693 and rs9272703 are missense mutations, which convert a codon for one amino acid into a codon for a different amino acid. We chose a PSSM Difference >10 as the lower threshold for the scores of changes predicted to be deleterious. CONCLUSION We used a bioinformatics approach for cSNP analysis of the HLA-DQA1 gene. This method can select the variants in a conserved region, and give a PSSM Difference score. But the results need to be verified in cervical cancer patients and a control population.

  8. SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis.

    Science.gov (United States)

    Nasir, Waqas; Toledo, Alejandro Gomez; Noborn, Fredrik; Nilsson, Jonas; Wang, Mingxun; Bandeira, Nuno; Larson, Göran

    2016-08-01

    Glycoproteomics has rapidly become an independent analytical platform bridging the fields of glycomics and proteomics to address site-specific protein glycosylation and its impact in biology. Current glycopeptide characterization relies on time-consuming manual interpretations and demands high levels of personal expertise. Efficient data interpretation constitutes one of the major challenges to be overcome before true high-throughput glycopeptide analysis can be achieved. The development of new glyco-related bioinformatics tools is thus of crucial importance to fulfill this goal. Here we present SweetNET: a data-oriented bioinformatics workflow for efficient analysis of hundreds of thousands of glycopeptide MS/MS-spectra. We have analyzed MS data sets from two separate glycopeptide enrichment protocols targeting sialylated glycopeptides and chondroitin sulfate linkage region glycopeptides, respectively. Molecular networking was performed to organize the glycopeptide MS/MS data based on spectral similarities. The combination of spectral clustering, oxonium ion intensity profiles, and precursor ion m/z shift distributions provided typical signatures for the initial assignment of different N-, O- and CS-glycopeptide classes and their respective glycoforms. These signatures were further used to guide database searches leading to the identification and validation of a large number of glycopeptide variants including novel deoxyhexose (fucose) modifications in the linkage region of chondroitin sulfate proteoglycans. PMID:27399812

  9. Identification of new serum markers of pathological states by bioinformatic tools for the analysis of serum proteomics expression profiles

    International Nuclear Information System (INIS)

    We have developed new bioinformatic tools and strategies, aimed at the identification and characterization of proteins as markers of pathological states, for the analysis of data derived from protein expression profiles obtained by mass spectrometry techniques, for the study of structural and functional properties of the proteins, and for the analysis of data from omics approaches

  10. Identification and Characterization of miRNAs in Chondrus crispus by High-Throughput Sequencing and Bioinformatics Analysis.

    Science.gov (United States)

    Gao, Fan; Nan, FangRu; Song, Wei; Feng, Jia; Lv, JunPing; Xie, ShuLian

    2016-01-01

    Chondrus crispus, an economically and medicinally important red alga, is a medicinally active substance and important for anti-tumor research. In this study, 117 C. crispus miRNAs (108 conserved and 9 novel) were identified from 2,416,181 small-RNA reads using high-throughput sequencing and bioinformatics methods. According to the BLAST search against the miRBase database, these miRNAs belonged to 110 miRNA families. Sequence alignment combined with homology searching revealed both the conservation and diversity of predicted potential miRNA families in different plant species. Four and 19 randomly selected miRNAs were validated by northern blotting and stem-loop quantitative real-time reverse transcription polymerase chain reaction detection, respectively. The validation rates (75% and 94.7%) demonstrated that most of the identified miRNAs could be credible. A total of 160 potential target genes were predicted and functionally annotated by Gene Ontology analysis and Kyoto Encyclopedia of Genes and Genomes analysis. We also analyzed the interrelationship of miRNAs, miRNA-target genes and target genes in C. crispus by constructing a Cytoscape network. The 117 miRNAs identified in our study should supply large quantities of information that will be important for red algae small RNA research. PMID:27193824
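
    The two validation rates quoted above can be reproduced with simple arithmetic. The sketch below assumes that 3 of the 4 miRNAs tested by northern blotting and 18 of the 19 tested by stem-loop qRT-PCR were confirmed; these counts are not stated in the abstract and are inferred from the reported percentages. The function name is hypothetical.

```python
def validation_rate(confirmed: int, tested: int) -> float:
    """Return the percentage of tested candidates that were confirmed."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * confirmed / tested


# Assumed counts (inferred from the reported rates, not given in the abstract).
northern = validation_rate(3, 4)   # northern blotting
qpcr = validation_rate(18, 19)     # stem-loop qRT-PCR

print(f"northern blotting: {northern:.1f}%")
print(f"stem-loop qRT-PCR: {qpcr:.1f}%")
```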

  11. Bioinformatic analysis of human nuclear receptor nr5a2 (hb1f) genomic sequence

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    We have previously cloned the cDNA of the human nuclear receptor nr5a2 (hb1f) gene and obtained its whole genomic sequence. In this work we carried out an in-depth bioinformatic analysis of the genomic sequence of the nr5a2 (hb1f) gene. Sequence comparison and prediction algorithms indicated that there might be additional coding regions in the 210 kb genomic sequence besides the known exons, especially in the two largest introns. Comparison of the structures of nr5a loci in different species revealed distinguishable conservation and apparent gene duplication during evolution. The remarkable conservation among the promoters of the zebrafish, mouse and human nr5a2 genes suggested that they would be regulated by the same transcription factors.

  12. Bioinformatics analysis of breast cancer bone metastasis related gene-CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze the breast cancer bone metastasis related gene CXCR4. Methods: This research screened breast cancer bone metastasis related genes by high-throughput gene chip. Results: The expression of 396 genes was found to differ, including 165 up-regulated and 231 down-regulated genes. The expression of the chemokine receptor CXCR4 was obviously up-regulated in tissue with breast cancer bone metastasis. The difference compared with tissue without bone metastasis was significant, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a certain basis for the occurrence and diagnosis of breast cancer bone metastasis, target gene therapy and evaluation of prognosis.

  13. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    Science.gov (United States)

    Ju, Feng; Zhang, Tong

    2015-11-01

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation. PMID:26451629

  14. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    Science.gov (United States)

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of Panax notoginseng saponin (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between the blank group and the model group, and between the model group and the XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and analyzed in depth. Results: Two series including 22 samples were obtained. 870 DEGs were identified between the blank group and the model group, and 1189 DEGs were identified between the model group and the XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353
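
    The step of screening out genes shared by the two DEG sets amounts to a set intersection. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name is hypothetical and the gene identifiers are toy placeholders rather than data from the study.

```python
def overlapping_genes(degs_a, degs_b):
    """Return the sorted gene identifiers present in both DEG lists."""
    return sorted(set(degs_a) & set(degs_b))


# Toy placeholder identifiers (hypothetical, not taken from the study).
blank_vs_model = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
model_vs_xst = ["GENE_D", "GENE_E", "GENE_B", "GENE_F"]

shared = overlapping_genes(blank_vs_model, model_vs_xst)
print(shared)
```

    In practice the two lists would come from the differential expression results for each comparison, and identifiers would need to be normalized to one naming scheme before intersecting.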

  15. Identification of novel highly expressed genes in pancreatic ductal adenocarcinomas through a bioinformatics analysis of expressed sequence tags.

    Science.gov (United States)

    Cao, Dengfeng; Hustinx, Steven R; Sui, Guoping; Bala, P; Sato, Norihiro; Martin, Sean; Maitra, Anirban; Murphy, Kathleen M; Cameron, John L; Yeo, Charles J; Kern, Scott E; Goggins, Michael; Pandey, Akhilesh; Hruban, Ralph H

    2004-11-01

    In most microarray experiments, a significant fraction of the differentially expressed mRNAs identified correspond to expressed sequence tags (ESTs) and are generally discarded from further analyses. We used careful bioinformatics analyses to characterize those ESTs that were found to be highly overexpressed in a series of pancreatic adenocarcinomas. cDNA was prepared from 60 non-neoplastic samples (normal pancreas [n = 20], normal colon [n = 10], or normal duodenal mucosa [n = 30]) and from 64 pancreatic cancers (resected cancers [n = 50] or cancer cell lines [n = 14]) and hybridized to the complete Affymetrix Human Genome U133 GeneChip(R) set (arrays U133A and B) for simultaneous analysis of 45,000 fragments corresponding to 33,000 known genes and 6,000 ESTs. The GeneExpress(R) software system Fold Change Analysis Tool was used and 60 ESTs were identified that were expressed at levels at least 3-fold greater in the pancreatic cancers as compared to normal tissues. Searches against the human genomic sequence and comparative genomic analysis of human and mouse genomes were carried out using basic local alignment search tools (BLAST), BLASTN, and BLASTX, for identifying protein coding genes corresponding to the ESTs. Subsequently, in order to pick the most relevant candidate genes for a more detailed analysis, we looked for domains/motifs in the open reading frames using SMART and Pfam programs. We were able to definitively map 43 of the 60 ESTs to known or novel genes, and 15 of the ESTs could be localized in close proximity to a gene in the human genome although we were unable to establish that the EST was indeed derived from those genes. The differential expression of a subset of genes was confirmed at the protein level by immunohistochemical labeling of tissue microarrays (inhibin beta A [INHBA] and CD29) and/or at the transcript level by RT-PCR (INHBA, AKAP12, ELK3, FOXQ1, EIF5A2, and EFNA5). We conclude that bioinformatics tools can be used to characterize

  16. Flow cytometry bioinformatics.

    Directory of Open Access Journals (Sweden)

    Kieran O'Neill

    Full Text Available Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods.
Open standards, data

  17. Small envelope protein E of SARS: cloning, expression, purification, CD determination, and bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    SHEN Xu; XUE Jian-Hua; YU Chang-Ying; LUO Hai-Bin; QIN Lei; YU Xiao-Jing; CHEN Jing; CHEN Li-Li; XIONG Bin; YUE Li-Duo; CAI Jian-Hua; SHEN Jian-Hua; LUO Xiao-Min; CHEN Kai-Xian; SHI Tie-Liu; LI Yi-Xue; HU Geng-Xi; JIANG Hua-Liang

    2003-01-01

    AIM: To obtain a pure sample of the SARS small envelope E protein (SARS E protein), study its properties and analyze its possible functions. METHODS: The plasmid of SARS E protein was constructed by polymerase chain reaction (PCR), and the protein was expressed in an E. coli strain. The secondary structure of the protein was determined by circular dichroism (CD). The possible functions of this protein were annotated by bioinformatics methods, and a possible three-dimensional model was constructed by molecular modeling. RESULTS: A pure sample of SARS E protein was obtained. The secondary structure derived from CD determination is similar to that from secondary structure prediction. Bioinformatics analysis indicated that the key residues of SARS E protein are highly conserved compared with the E proteins of other coronaviruses. In particular, the primary amino acid sequence of SARS E protein is much more similar to that of murine hepatitis virus (MHV) and other mammalian coronaviruses. The transmembrane (TM) segment of the SARS E protein is more conserved than other regions of the protein. CONCLUSION: The successful expression of the SARS E protein is a good starting point for investigating the structure and functions of this protein and of the SARS coronavirus itself. The SARS E protein may fold in aqueous solution in a similar way as it does in a membrane-water mixed environment. It is possible that β-sheet I of the SARS E protein interacts with the membrane surface via hydrogen bonding; this β-sheet may uncoil to a random structure in aqueous solution.

  18. Secretome Analysis of Lipid-Induced Insulin Resistance in Skeletal Muscle Cells by a Combined Experimental and Bioinformatics Workflow

    DEFF Research Database (Denmark)

    Deshmukh, Atul S; Cox, Juergen; Jensen, Lars Juhl;

    2015-01-01

    , in principle, allows an unbiased and comprehensive analysis of cellular secretomes; however, the distinction of bona fide secreted proteins from proteins released upon lysis of a small fraction of dying cells remains challenging. Here we applied highly sensitive MS and streamlined bioinformatics to...

  19. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.
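
    Enrichment analyses of the kind listed above are commonly scored with a one-sided hypergeometric test. The abstract does not say which statistic BATMAN-TCM uses, so the sketch below is an illustration of the general technique under that assumption, not a description of the tool; the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `total` genes, of which `annotated` carry the term of interest."""
    upper = min(annotated, selected)
    denom = comb(total, selected)
    numer = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return numer / denom


# Toy numbers (hypothetical): 10 genes in the background, 3 annotated to a
# pathway, 3 predicted targets, all 3 of which fall in the pathway.
p_value = hypergeom_enrichment_p(total=10, annotated=3, selected=3, overlap=3)
print(p_value)
```

    A real analysis would repeat this for every pathway or term and correct the resulting p-values for multiple testing.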

  20. Bioinformatics analysis and expression of a novel protein ROP48 in Toxoplasma gondii.

    Science.gov (United States)

    Zhou, Jian; Wang, Lin; Zhou, Aihua; Lu, Gang; Li, Qihang; Wang, Zhilin; Zhu, Meiyan; Zhou, Huaiyu; Cong, Hua; He, Shenyi

    2016-06-01

    Toxoplasma gondii is an obligate intracellular apicomplexan parasite that can infect warm-blooded animals and humans all over the world. In past years, ROP family genes encoding particular proteins of T. gondii have made a great contribution to toxoplasmosis. In this study, we used multiple bioinformatics approaches to predict the physical and chemical characteristics, transmembrane domain, epitopes, and topological structure of the rhoptry protein 48 (ROP48). The results indicated that the ROP48 protein was mainly located in the membrane and had several positive linear B-cell epitopes and Th-cell epitopes, which suggested that ROP48 is a potential DNA vaccine candidate against toxoplasmosis. The PCR product amplified from the ROP48 cDNA was then inserted into a pEASY-T1 vector to build a recombinant cloning plasmid. After sequencing, ROP48 was subcloned into the eukaryotic expression plasmid pEGFP-C1 to obtain pEGFP-C1-ROP48 (pROP48). After identification by PCR and restriction enzyme digestion, the recombinant plasmid pROP48 was transfected into HEK 293-T cells and identified by RT-PCR. The results showed that the eukaryotic expression plasmid pROP48 was constructed and successfully transfected into HEK 293-T cells. Western blotting showed that the expressed proteins can be recognized by anti-STAg mouse sera. PMID:27078655

  1. Bioinformatic methods in protein characterization

    OpenAIRE

    Kallberg, Yvonne

    2002-01-01

    Bioinformatics is an emerging interdisciplinary research field in which mathematics, computer science and biology meet. In this thesis, bioinformatic methods for analysis of functional and structural properties among proteins will be presented. I have developed and applied bioinformatic methods on the enzyme superfamily of short-chain dehydrogenases/reductases (SDRs), coenzyme-binding enzymes of the Rossmann fold type, and amyloid-forming proteins and peptides. The basis...

  2. String Mining in Bioinformatics

    Science.gov (United States)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying biological sequences (DNA, RNA, and proteins) at the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or more sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining textbooks nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
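
    The two tasks described above, detecting repeated segments within one sequence and spotting common segments between sequences, can be illustrated with fixed-length substrings (k-mers). This is a deliberately naive sketch with hypothetical function names and toy sequences; practical string-mining tools typically rely on index structures such as suffix trees or suffix arrays rather than the hash tables used here.

```python
from collections import defaultdict


def repeated_kmers(seq, k):
    """Map each k-mer that occurs more than once in seq to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def common_kmers(seq_a, seq_b, k):
    """Return the sorted k-mers that occur in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return sorted(kmers_a & kmers_b)


# Toy DNA strings (hypothetical).
repeats = repeated_kmers("ACGTACGTTT", 4)   # intra-molecular similarity
shared = common_kmers("ACGTAC", "TTACGT", 4)  # inter-molecular similarity

print(repeats)
print(shared)
```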

  3. Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Ju-Suk Nam

    2014-03-01

    Full Text Available Wingless-type (Wnt) signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα). Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the family members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers.

  4. Cloning, expression and bioinformatics analysis of ATP sulfurylase from Acidithiobacillus ferrooxidans ATCC 23270 in Escherichia coli.

    Science.gov (United States)

    Jaramillo, Michael L; Abanto, Michel; Quispe, Ruth L; Calderón, Julio; Del Valle, Luís J; Talledo, Miguel; Ramírez, Pablo

    2012-01-01

    Molecular studies of enzymes involved in sulfite oxidation in Acidithiobacillus ferrooxidans have not yet been developed, especially for the ATP sulfurylase (ATPS) of these acidophilic thiobacilli, which are important in biomining. This enzyme synthesizes ATP and sulfate from adenosine phosphosulfate (APS) and pyrophosphate (PPi), the final stage of sulfite oxidation by these organisms in order to obtain energy. The atpS gene (1674 bp) encoding the ATPS from Acidithiobacillus ferrooxidans ATCC 23270 was amplified using PCR, cloned in the pET101-TOPO plasmid, sequenced and expressed in Escherichia coli, yielding a 63.5 kDa recombinant ATPS protein according to SDS-PAGE analysis. The bioinformatics and phylogenetic analyses determined that the ATPS from A. ferrooxidans presents ATP sulfurylase (ATS) and APS kinase (ASK) domains similar to the ATPS of Aquifex aeolicus, probably of a more ancestral origin. Enzyme activity towards ATP formation was determined by quantification of ATP formed from E. coli cell extracts, using a bioluminescence assay based on light emission by the luciferase enzyme. Our results demonstrate that the recombinant ATP sulfurylase from A. ferrooxidans presents enzymatic activity for the formation of ATP and sulfate, and is possibly a bifunctional enzyme due to its high homology to the ASK domain from A. aeolicus and true kinases. PMID:23055613

  5. Integration and bioinformatics analysis of DNA-methylated genes associated with drug resistance in ovarian cancer

    Science.gov (United States)

    YAN, BINGBING; YIN, FUQIANG; WANG, QI; ZHANG, WEI; LI, LI

    2016-01-01

    The main obstacle to the successful treatment of ovarian cancer is the development of drug resistance to combined chemotherapy. Among all the factors associated with drug resistance, DNA methylation apparently plays a critical role. In this study, we performed an integrative analysis of the 26 DNA-methylated genes associated with drug resistance in ovarian cancer, and the genes were further evaluated by comprehensive bioinformatics analysis including gene/protein interaction, biological process enrichment and annotation. The results from the protein interaction analyses revealed that at least 20 of these 26 methylated genes are present in the protein interaction network, indicating that they interact with each other, have a correlation in function, and may participate as a whole in the regulation of ovarian cancer drug resistance. There is a direct interaction between the phosphatase and tensin homolog (PTEN) gene and at least half of the other genes, indicating that PTEN may possess core regulatory functions among these genes. Biological process enrichment and annotation demonstrated that most of these methylated genes were significantly associated with apoptosis, which is possibly an essential way for these genes to be involved in the regulation of multidrug resistance in ovarian cancer. In addition, a comprehensive analysis of clinical factors revealed that the methylation level of genes that are associated with the regulation of drug resistance in ovarian cancer was significantly correlated with the prognosis of ovarian cancer. Overall, this study preliminarily explains the potential correlation between the genes with DNA methylation and drug resistance in ovarian cancer. This finding has significance for our understanding of the regulation of resistant ovarian cancer by methylated genes, the treatment of ovarian cancer, and improvement of the prognosis of ovarian cancer. PMID:27347118

  6. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    Science.gov (United States)

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a big deal, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are always lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi. PMID:27507959

  8. Bioinformatic analysis of pathogenic missense mutations of activin receptor like kinase 1 ectodomain.

    Directory of Open Access Journals (Sweden)

    Claudia Scotti

    Full Text Available Activin A receptor, type II-like kinase 1 (also called ALK1), is a serine-threonine kinase predominantly expressed on the endothelial cell surface. Mutations in its ACVRL1 encoding gene (12q11-14) cause type 2 Hereditary Haemorrhagic Telangiectasia (HHT2), an autosomal dominant multisystem vascular dysplasia. The study of the structural effects of mutations is crucial to understand their pathogenic mechanism. However, while an X-ray structure of the ALK1 intracellular domain has recently become available (PDB ID: 3MY0), structure determination of the ALK1 ectodomain (ALK1(EC)) has been elusive so far. We here describe the building of a homology model for ALK1(EC), followed by an extensive bioinformatic analysis, based on a set of 38 methods, of the effect of missense mutations at the sequence and structural level. The potential interaction mode of ALK1(EC) with its ligand BMP9 was then predicted by combining modelling and docking data. The calculated model of ALK1(EC) allowed mapping and a preliminary characterization of HHT2 associated mutations. Major structural changes and loss of stability of the protein were predicted for several mutations, while others were found to interfere mainly with binding to BMP9 or other interactors, like Endoglin (CD105), whose encoding ENG gene (9q34) mutations are known to cause type 1 HHT. This study gives a preliminary insight into the potential structure of ALK1(EC) and into the structural effects of HHT2 associated mutations, which can be useful to predict the potential effect of each single mutation, to devise new biological experiments and to interpret the biological significance of new mutations, private mutations, or non-synonymous polymorphisms.

  9. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  10. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.;

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  11. BIOINFORMATICS AND BIOSYNTHESIS ANALYSIS OF CELLULOSE SYNTHASE OPERON IN ZYMOMONAS MOBILIS ZM4

    Directory of Open Access Journals (Sweden)

    Sheik Abdul Kader Sheik Asraf, K. Narayanan Rajnish, and Paramasamy Gunasekaran

    2011-03-01

    confirmed by the Acetic-Nitric (Updegraff) cellulose assay. The bioinformatics and biosynthetic analyses confirm the biosynthesis of cellulose in Z. mobilis.

  12. Identification and bioinformatics analysis of lactate dehydrogenase genes from Echinococcus granulosus

    Institute of Scientific and Technical Information of China (English)

    Gang Lu; Yajun Lu; Lihua Li; Lixian Wu; Zhigang Fan; Dazhong Shi; Hu Wang; Xiumin Han

    2010-01-01

    Objective: To identify the full-length cDNA sequence of lactate dehydrogenase (LDH) from adult Echinococcus granulosus (E. granulosus) and to predict the structure and function of its encoded protein using bioinformatics methods. Methods: With the help of NCBI, EMBI, Expasy and other online sites, the open reading frame (ORF), conserved domain, physical and chemical parameters, signal peptide, epitopes, and topological structure of the protein sequence were predicted and a homology tertiary structure model was created; Vector NTI software was used for sequence alignment, phylogenetic tree construction and tertiary structure prediction. Results: The target sequence was 1,233 bp in length, with a largest ORF of 996 bp encoding a 331-amino-acid protein with a typical L-LDH conserved domain. It was confirmed as the full-length cDNA of LDH from E. granulosus and named EgLDH (GenBank accession number: HM748917). The predicted molecular weight and isoelectric point of the deduced protein were 35,516.2 Da and 6.32, respectively. Compared with LDHs from Taenia solium, Taenia saginata asiatica, Spirometra erinaceieuropaei, Schistosoma japonicum, Clonorchis sinensis and human, it showed similarity of 86%, 85%, 55%, 58%, 58% and 53%, respectively. EgLDH contained 3 putative transmembrane regions and 4 major epitopes (54aa-59aa, 81aa-87aa, 97aa-102aa, 307aa-313aa); the latter were significantly different from the corresponding regions of human LDH. In addition, some NAD and substrate binding sites were located on epitopes 54aa-59aa and 97aa-102aa, respectively. Tertiary structure prediction showed that 3 key catalytic residues, 105R, 165D and 192H, formed a catalytic center near the epitope 97aa-102aa, and most NAD and substrate binding sites were located around the center. Conclusions: The full-length cDNA sequence of EgLDH was identified. It encodes a putative transmembrane protein which might be an ideal target molecule for vaccines and drugs.
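
    The reported ORF length and protein length are mutually consistent if the 996 bp ORF is counted with its stop codon: 996 / 3 = 332 codons, of which 331 encode amino acids. The sketch below checks this arithmetic; the assumption that the ORF length includes the stop codon is not stated in the abstract, and the function name is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF is in frame (length divisible by 3). If includes_stop is
    True, the final codon is treated as a stop codon and not counted.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Assumption: the 996 bp ORF length includes the stop codon.
residues = orf_protein_length(996)
print(residues)
```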

  13. AbMiner: A bioinformatic resource on available monoclonal antibodies and corresponding gene identifiers for genomic, proteomic, and immunologic studies

    Directory of Open Access Journals (Sweden)

    Shankavaram Uma

    2006-04-01

    Full Text Available Abstract Background Monoclonal antibodies are used extensively throughout the biomedical sciences for detection of antigens, either in vitro or in vivo. We, for example, have used them for quantitation of proteins on "reverse-phase" protein lysate arrays. For those studies, we quality-controlled > 600 available monoclonal antibodies and also needed to develop precise information on the genes that encode their antigens. Translation among the various protein and gene identifier types proved non-trivial because of one-to-many and many-to-one relationships. To organize the antibody, protein, and gene information, we initially developed a relational database in Filemaker for our own use. When it became apparent that the information would be useful to many other researchers faced with the need to choose or characterize antibodies, we developed it further as AbMiner, a fully relational web-based database under MySQL, programmed in Java. Description AbMiner is a user-friendly, web-based relational database of information on > 600 commercially available antibodies that we validated by Western blot for protein microarray studies. It includes many types of information on the antibody, the immunogen, the vendor, the antigen, and the antigen's gene. Multiple gene and protein identifier types provide links to corresponding entries in a variety of other public databases, including resources for phosphorylation-specific antibodies. AbMiner also includes our quality-control data against a pool of 60 diverse cancer cell types (the NCI-60) and also protein expression levels for the NCI-60 cells measured using our high-density "reverse-phase" protein lysate microarrays for a selection of the listed antibodies. Some other available database resources give information on antibody specificity for one or a couple of cell types. In contrast, the data in AbMiner indicate specificity with respect to the antigens in a pool of 60 diverse cell types from nine different

  14. New bioinformatic tools for analysis of nucleotide modifications in eukaryotic rRNA

    OpenAIRE

    Piekna-Przybylska, Dorota; Decatur, Wayne A.; Fournier, Maurille J.

    2007-01-01

    This report presents a valuable new bioinformatics package for research on rRNA nucleotide modifications in the ribosome, especially those created by small nucleolar RNA:protein complexes (snoRNPs). The interactive service, which is not available elsewhere, enables a user to visualize the positions of pseudouridines, 2′-O-methylations, and base methylations in three-dimensional space in the ribosome and also in linear and secondary structure formats of ribosomal RNA. Our tools provide additio...

  15. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    OpenAIRE

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and con...

  16. Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis.

    OpenAIRE

    Janssens, Frizo; Glänzel, Wolfgang; De Moor, Bart

    2007-01-01

    To unravel the concept structure and dynamics of the bioinformatics field, we analyze a set of 7401 publications from the Web of Science and MEDLINE databases, publication years 1981–2004. For delineating this complex, interdisciplinary field, a novel bibliometric retrieval strategy is used. Given that the performance of unsupervised clustering and classification of scientific publications is significantly improved by deeply merging textual contents with the structure of the citation graph, w...

  17. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    Science.gov (United States)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to localize transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible function of the chosen motifs has been described.

  18. Bioinformatic Analysis for the Validation of Novel Biomarkers for Cancer Diagnosis and Drug Sensitivity

    OpenAIRE

    Lockwood, Laura Anne Rebecca

    2015-01-01

    Background: The genetic control of tumour progression presents the opportunity for bioinformatics and gene expression data to be used as a basis for tumour grading. The development of a genetic signature based on microarray data allows for the development of personalised chemotherapeutic regimes. Method: ONCOMINE was utilised to create a genetic signature for ovarian serous adenocarcinoma and to compare the expression of genes between normal ovarian and cancerous cells. Ingenuity Pathways...

  19. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    DEFF Research Database (Denmark)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-01-01

    -friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface...... which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options...... such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality...

  20. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  1. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  2. Hypothetical granulin-like molecule from Fasciola hepatica identified by bioinformatics analysis

    OpenAIRE

    Machicado, Claudia; Marcos, Luis A.; Zimic, Mirko

    2016-01-01

    Fasciola hepatica is considered an emergent human pathogen, causing liver fibrosis or cirrhosis, conditions that are known to be direct causes of cancer. Some parasites have been categorized by WHO as carcinogenic agents such as Opisthorchis viverrini, a relative of F. hepatica. Although these two parasites are from the same class (Trematoda), the role of F. hepatica in carcinogenesis is unclear. We hypothesized that F. hepatica might share some features with O. viverrini and to be responsibl...

  3. In the Spotlight: Bioinformatics

    Science.gov (United States)

    Wang, May Dongmei

    2016-01-01

    During 2012, next-generation sequencing (NGS) attracted great attention in the biomedical research community, especially for personalized medicine. Also, third-generation sequencing became available. Therefore, state-of-the-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. NGS is a high-throughput nucleic acid sequencing technology with wide dynamic range and single-base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparison to more established technologies such as microarrays. NGS technology has been predicted to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  4. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics.

    Science.gov (United States)

    Veneman, Wouter J; de Sonneville, Jan; van der Kolk, Kees-Jan; Ordas, Anita; Al-Ars, Zaid; Meijer, Annemarie H; Spaink, Herman P

    2015-03-01

    We present a RNA deep sequencing (RNAseq) analysis of a comparison of the transcriptome responses to infection of zebrafish larvae with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our developed GeneTiles software can improve RNAseq analysis approaches by more confidently identifying a large set of markers upon infection with these bacteria. For analysis of RNAseq data currently, software programs such as Bowtie2 and Samtools are indispensable. However, these programs that are designed for a LINUX environment require some dedicated programming skills and have no options for visualisation of the resulting mapped sequence reads. Especially with large data sets, this makes the analysis time consuming and difficult for non-expert users. We have applied the GeneTiles software to the analysis of previously published and newly obtained RNAseq datasets of our zebrafish infection model, and we have shown the applicability of this approach also to published RNAseq datasets of other organisms by comparing our data with a published mammalian infection study. In addition, we have implemented the DEXSeq module in the GeneTiles software to identify genes, such as glucagon A, that are differentially spliced under infection conditions. In the analysis of our RNAseq data, this has led to the possibility to improve the size of data sets that could be efficiently compared without using problem-dedicated programs, leading to a quick identification of marker sets. Therefore, this approach will also be highly useful for transcriptome analyses of other organisms for which well-characterised genomes are available. PMID:25503064

  5. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
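
    The "huge number of possible trees" mentioned in this record can be made concrete with the standard double-factorial counts for fully resolved (binary) labeled trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies for n OTUs. The function names below are illustrative and do not come from the paper.

```python
# Illustrative sketch: counting binary tree topologies for n labeled OTUs.

def odd_double_factorial(k):
    """Product k * (k-2) * (k-4) * ... * 1 for odd k >= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n):
    """Number of unrooted binary topologies for n >= 3 labeled OTUs: (2n-5)!!"""
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    return odd_double_factorial(2 * n - 5)


def count_rooted_trees(n):
    """Number of rooted binary topologies for n >= 2 labeled OTUs: (2n-3)!!"""
    if n < 2:
        raise ValueError("need at least 2 OTUs")
    return odd_double_factorial(2 * n - 3)


for n in (4, 10, 20):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    Even 10 OTUs already admit 2,027,025 unrooted topologies, which is why exhaustive search is replaced by heuristics for all but the smallest samples.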

  6. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  7. Clustering Techniques in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Muhammad Ali Masood

    2015-01-01

    Full Text Available Dealing with data means grouping information into a set of categories, either to learn new artifacts or to understand new domains. For this purpose researchers have always looked for hidden patterns in data that can be defined and compared with other known notions based on the similarity or dissimilarity of their attributes according to well-defined rules. Data mining, with its tools of data classification and data clustering, is one of the most powerful techniques for handling data in a way that helps researchers identify the required information. As a step toward addressing this challenge, experts have used clustering techniques as a means of exploring hidden structure and patterns in underlying data. Having improved the stability, robustness and accuracy of unsupervised data classification in many fields, including pattern recognition, machine learning, information retrieval, image analysis and bioinformatics, clustering has proven itself a reliable tool. To identify the clusters in a dataset, algorithms are used to partition the dataset into several groups based on the similarity within a group. There is no single preferred clustering algorithm; various algorithms are used depending on the domain of the data that constitutes a cluster and the level of efficiency required. Clustering techniques are categorized according to different approaches. This paper is a survey of a few of the many clustering techniques in data mining. Five of the most common clustering techniques are discussed: K-medoids, K-means, Fuzzy C-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Self-Organizing Map (SOM) clustering.
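
    As an illustration of one of the surveyed techniques, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not code from the surveyed paper; the deterministic initialization from the first k points and the toy data are assumptions made only to keep the example reproducible.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Minimal Lloyd's algorithm.

    Assumption for reproducibility: the first k points seed the centroids
    (real implementations usually use random or k-means++ seeding).
    Returns (labels, centroids).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Toy data (made up): two well-separated groups.
data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centroids = kmeans(data, k=2)
print(labels)
print(centroids)
```

    K-medoids follows the same assign-and-update loop but restricts each cluster center to be an actual data point, which makes it less sensitive to outliers.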

  8. Identification of key pathways and genes in colorectal cancer using bioinformatics analysis.

    Science.gov (United States)

    Liang, Bin; Li, Chunning; Zhao, Jianying

    2016-10-01

    Colorectal cancer (CRC) is the most common malignant tumor of digestive system. The aim of this study was to identify gene signatures during CRC and uncover their potential mechanisms. The gene expression profiles of GSE21815 were downloaded from GEO database. The GSE21815 dataset contained 141 samples, including 132 CRC and 9 normal colon epitheliums. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses were performed, and protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed by Cytoscape software. In total, 3500 DEGs were identified in CRC, including 1370 up-regulated genes and 2130 down-regulated genes. GO analysis results showed that up-regulated DEGs were significantly enriched in biological processes (BP), including cell cycle, cell division, and cell proliferation; the down-regulated DEGs were significantly enriched in biological processes, including immune response, intracellular signaling cascade and defense response. KEGG pathway analysis showed the up-regulated DEGs were enriched in cell cycle and DNA replication, while the down-regulated DEGs were enriched in drug metabolism, metabolism of xenobiotics by cytochrome P450, and retinol metabolism pathways. The top 10 hub genes, GNG2, AGT, SAA1, ADCY5, LPAR1, NMU, IL8, CXCL12, GNAI1, and CCR2 were identified from the PPI network, and sub-networks revealed these genes were involved in significant pathways, including G protein-coupled receptors signaling pathway, gastrin-CREB signaling pathway via PKC and MAPK, and extracellular matrix organization. In conclusion, the present study indicated that the identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of CRC, and might be used as molecular targets and diagnostic biomarkers for the treatment of CRC. PMID:27581154

  9. Bioinformatics analysis of the early inflammatory response in a rat thermal injury model

    Directory of Open Access Journals (Sweden)

    Berthiaume Francois

    2007-01-01

    Full Text Available Abstract Background Thermal injury is among the most severe forms of trauma and its effects are both local and systemic. Response to thermal injury includes cellular protection mechanisms, inflammation, hypermetabolism, prolonged catabolism, organ dysfunction and immuno-suppression. It has been hypothesized that gene expression patterns in the liver will change with severe burns, thus reflecting the role the liver plays in the response to burn injury. Characterizing the molecular fingerprint (i.e., expression profile) of the inflammatory response resulting from burns may help elucidate the activated mechanisms and suggest new therapeutic intervention. In this paper we propose a novel integrated framework for analyzing time-series transcriptional data, with emphasis on the burn-induced response within the context of the rat animal model. Our analysis robustly identifies critical expression motifs, indicative of the dynamic evolution of the inflammatory response, and we further propose a putative reconstruction of the associated transcription factor activities. Results Implementation of our algorithm on data obtained from an animal (rat) burn injury study identified 281 genes corresponding to 4 unique profiles. Enrichment evaluation upon both gene ontologies and transcription factors verifies the inflammation-specific character of the selections and the rationalization of the burn-induced inflammatory response. Conducting the transcription network reconstruction and analysis, we have identified transcription factors, including AHR, Octamer Binding Proteins, Kruppel-like Factors, and cell cycle regulators, as being highly important to an organism's response to burn injury. These transcription factors are notable due to their roles in pathways that play a part in the gross physiological response to burn, such as changes in the immune response and inflammation. Conclusion Our results indicate that our novel selection/classification algorithm has been

  10. Bioinformatics analysis of differentially expressed pathways related to the metastatic characteristics of osteosarcoma.

    Science.gov (United States)

    Sun, Wei; Ma, Xiaojun; Shen, Jiakang; Yin, Fei; Wang, Chongren; Cai, Zhengdong

    2016-08-01

    In this study, gene expression data of osteosarcoma (OSA) were analyzed to identify metastasis-related biological pathways. Four gene expression data sets (GSE21257, GSE9508, GSE49003 and GSE66673) were downloaded from Gene Expression Omnibus (GEO). An analysis of differentially expressed genes (DEGs) was performed using the Significance Analysis of Microarray (SAM) method. Gene expression levels were converted into pathway scores by the Functional Analysis of Individual Microarray Expression (FAIME) algorithm, and the differentially expressed pathways (DEPs) were then identified by a t-test. The ability of the DEPs to distinguish and predict metastatic and non-metastatic OSA was further confirmed using the principal component analysis (PCA) method and 3 gene expression data sets (GSE9508, GSE49003 and GSE66673) based on the support vector machines (SVM) model. A total of 616 downregulated and 681 upregulated genes were identified in the data set GSE21257. The DEGs could not be used to distinguish metastatic OSA from non-metastatic OSA, as shown by PCA. Thus, an analysis of DEPs was further performed, resulting in 14 DEPs, such as NRAS signaling, Toll-like receptor (TLR) signaling, matrix metalloproteinase (MMP) regulation of cytokines and tumor necrosis factor receptor-associated factor (TRAF)-mediated interferon regulatory factor 7 (IRF7) activation. Cluster analysis indicated that these pathways could be used to distinguish metastatic OSA from non-metastatic OSA. The prediction accuracy was 91, 66.7 and 87.5% for the data sets GSE9508, GSE49003 and GSE66673, respectively. The results of PCA further validated that the DEPs could be used to distinguish metastatic OSA from non-metastatic OSA. On the whole, several DEPs were identified in metastatic OSA compared with non-metastatic OSA. Further studies on these pathways and relevant genes may help to enhance our understanding of the molecular mechanisms underlying metastasis
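
    The step of comparing per-sample pathway scores between two groups with a t-test can be sketched as follows. The record does not say which t-test variant was used, so Welch's unequal-variance statistic is assumed here; the function name and the scores are made up for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (an assumed variant)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical pathway scores for one pathway (not from the study).
metastatic = [2.0, 4.0, 6.0]
non_metastatic = [1.0, 2.0, 3.0]
print(welch_t(metastatic, non_metastatic))
```

    A pathway would be called differentially expressed when the p-value derived from this statistic falls below the chosen threshold, typically after correction for multiple testing across pathways.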

  11. The Cinnamyl Alcohol Dehydrogenase Gene Family in Melon (Cucumis melo L.): Bioinformatic Analysis and Expression Patterns

    OpenAIRE

    Jin, Yazhong; Zhang, Chong; Liu, Wei; Qi, Hongyan; Chen, Hao; Cao, Songxiao

    2014-01-01

    Cinnamyl alcohol dehydrogenase (CAD) is a key enzyme in lignin biosynthesis. However, little was known about CADs in melon. Five CAD-like genes were identified in the melon genome, namely CmCAD1 to CmCAD5. Signal peptide analysis and protein localization prediction showed that none of the CmCADs contains a typical signal peptide and that the CmCAD proteins may be located in the cytoplasm. Multiple alignments implied that some motifs may be responsible for the high specificity of these CAD proteins, and may be ...

  12. Novel C16orf57 mutations in patients with Poikiloderma with Neutropenia: bioinformatic analysis of the protein and predicted effects of all reported mutations

    Directory of Open Access Journals (Sweden)

    Colombo Elisa A

    2012-01-01

    Full Text Available Abstract Background Poikiloderma with Neutropenia (PN) is a rare autosomal recessive genodermatosis caused by C16orf57 mutations. To date 17 mutations have been identified in 31 PN patients. Results We characterize six PN patients, expanding the clinical phenotype of the syndrome and the mutational repertoire of the gene. We detect the two novel C16orf57 mutations, c.232C>T and c.265+2T>G, as well as the already reported c.179delC, c.531delA and c.693+1G>T mutations. cDNA analysis evidences the presence of aberrant transcripts, and bioinformatic prediction of C16orf57 protein structure gauges the mutations' effects on the folded protein chain. Computational analysis of the C16orf57 protein shows two conserved H-X-S/T-X tetrapeptide motifs marking the active site of a two-fold pseudosymmetric structure recalling the 2H phosphoesterase superfamily. Based on this model C16orf57 is likely a 2H-active site enzyme functioning in RNA processing, as a presumptive RNA ligase. According to bioinformatic prediction, all known C16orf57 mutations, including the novel mutations herein described, impair the protein structure by either removing one or both tetrapeptide motifs or by destroying the symmetry of the native folding. Finally, we analyse the geographical distribution of the recurrent mutations that depicts clusters featuring a founder effect. Conclusions In cohorts of patients clinically affected by genodermatoses with overlapping symptoms, the molecular screening of C16orf57 gene seems the proper way to address the correct diagnosis of PN, enabling the syndrome-specific oncosurveillance. The bioinformatic prediction of the C16orf57 protein structure denotes a very basic enzymatic function consistent with a housekeeping function. Detection of aberrant transcripts, also in cells from PN patients carrying early truncated mutations, suggests they might be translatable. Tissue-specific sensitivity to the lack of functionally correct protein accounts for the

  13. Virus Pathogen Database and Analysis Resource (ViPR): A Comprehensive Bioinformatics Database and Analysis Resource for the Coronavirus Research Community

    OpenAIRE

    Yun Zhang; Klem, Edward B.; Wei Jen; Richard H. Scheuermann; Larsen, Christopher N.; Sam Zaremba; Sanjeev Kumar; Pickett, Brett E; Greer, Douglas S.; Zhiping Gu; Guangyu Sun; Liwei Zhou; Lucy Stewart

    2012-01-01

    Several viruses within the Coronaviridae family have been categorized as either emerging or re-emerging human pathogens, with Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) being the most well known. The NIAID-sponsored Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org) supports bioinformatics workflows for a broad range of human virus pathogens and other related viruses, including the entire Coronaviridae family. ViPR provides access to sequence records, gene and...

  14. Bioinformatics analysis of human prohibitin

    Institute of Scientific and Technical Information of China (English)

    陈晨; 赵小峰

    2015-01-01

    Objective To perform a bioinformatics analysis predicting the structure and function of human prohibitin 1 (PHB1), to lay the foundation for its functional research and application. Methods Bioinformatics tools were used to predict the chromosome location, transmembrane region, spatial structure, physical and chemical properties and functional regions of PHB1. Results The bioinformatic analysis revealed that PHB1 is composed of 272 amino acids, of which alanine is the most abundant; the theoretical isoelectric point was 5.57, and the molecular formula was C1331H2154N370O400S2, with a relative molecular mass of 29,804.1. The PHB1 protein is a non-transmembrane hydrophobic protein composed mainly of alpha-helices. Conclusion Human PHB1 is a member of a cellular membrane protein superfamily, performs the corresponding biological functions, and also participates in the occurrence and development of many human diseases.
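
    The molecular formula and relative molecular mass reported in this record can be cross-checked with a short calculation. The sketch below is not from the paper; it uses rounded standard atomic weights (an assumption) and a hypothetical helper that parses simple formulas without parentheses.

```python
import re

# Rounded standard atomic weights (assumed values for this check).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def formula_mass(formula):
    """Sum atomic masses for a simple formula such as 'C6H12O6' (no parentheses)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


# Formula reported for PHB1 in the abstract above.
print(round(formula_mass("C1331H2154N370O400S2"), 1))
```

    With these atomic weights the formula gives roughly 29,804.2, which agrees with the reported relative molecular mass of 29,804.1 to within rounding.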

  15. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications.

    Science.gov (United States)

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure-Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron-Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  16. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Directory of Open Access Journals (Sweden)

    Lucas Antón Pastur-Romay

    2016-08-01

    Full Text Available Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.

  17. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Directory of Open Access Journals (Sweden)

    Jinkyu Kim

    Full Text Available The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches to global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significance, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with real-world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.
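
    As an illustration of the first step described above, the following is a minimal sketch of a plug-in transfer entropy estimate between two discrete series. The lag of one step, the binary toy data and the function name `transfer_entropy` are assumptions made for this example; the abstract does not specify the estimator, the discretisation or the weights used in the aggregation, so none of those are reproduced here.

```python
from collections import Counter
from math import log2


def transfer_entropy(x, y):
    """Plug-in estimate of TE(X -> Y) in bits, with lag 1 (an assumption).

    TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) ),
    where y1 is the next value of y and (y0, x0) are the current values.
    """
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # counts of (y1, y0, x0)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))         # counts of (y0, x0)
    pairs_yy = Counter(zip(y[1:], y[:-1]))          # counts of (y1, y0)
    singles = Counter(y[:-1])                       # counts of y0
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y0, x0)]
        p_given_self = pairs_yy[(y1, y0)] / singles[y0]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


# Toy data (hypothetical): y simply repeats x with a delay of one step.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)  # information flowing from x to y
te_yx = transfer_entropy(y, x)  # information flowing from y to x
```

    In this toy case the past of x fully determines the next value of y, so the estimate for x to y is strictly positive. With real economic series the values would first have to be discretised, and a significance test (as the abstract mentions) would be needed before any aggregation.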

  18. Cloning and bioinformatic analysis of HSPC016 gene in dermal papilla cells

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Objective: To clone the full-length cDNA sequence of the HSPC016 gene, an aggregative growth related gene in dermal papilla cells (DPC), to analyze its characteristics and to predict its biological function. Methods: Rapid amplification of cDNA ends (RACE) technology was used to amplify the 5' and 3' sequences of HSPC016. The amplified fragments were TA-cloned, sequenced and spliced together to obtain the full-length cDNA. Its chromosome localization, domain and possible function were analyzed by bioinformatic methods. Results: Two isoforms, 400 bp and 493 bp, were obtained. The gene was mapped on chromosome 3q21.31, and was evolutionarily conserved. HSPC016, a 64 aa protein, belongs to the PD053992 protein family and its functional domain was homologous to the T2FA gene. Conclusion: HSPC016 may be related to transcriptional regulation and its protein product may act as a subunit of a transcriptional complex and play a role in DPC growth and differentiation through facilitating or suppressing other genes' transcription within the nucleus.

  19. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    Science.gov (United States)

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  20. Bioinformatic prediction, deep sequencing of microRNAs and expression analysis during phenotypic plasticity in the pea aphid, Acyrthosiphon pisum

    Directory of Open Access Journals (Sweden)

    Leterme Nathalie

    2010-05-01

    Full Text Available Abstract Background Post-transcriptional regulation in eukaryotes can be operated through microRNA (miRNA)-mediated gene silencing. MiRNAs are small (18-25 nucleotides) non-coding RNAs that play a crucial role in regulation of gene expression in eukaryotes. In insects, miRNAs have been shown to be involved in multiple mechanisms such as embryonic development, tissue differentiation, metamorphosis or circadian rhythm. Insect miRNAs have been identified in different species belonging to five orders: Coleoptera, Diptera, Hymenoptera, Lepidoptera and Orthoptera. Results We developed high throughput Solexa sequencing and bioinformatic analyses of the genome of the pea aphid Acyrthosiphon pisum in order to identify the first miRNAs from a hemipteran insect. By combining these methods we identified 149 miRNAs including 55 conserved and 94 new miRNAs. Moreover, we investigated the regulation of these miRNAs in different alternative morphs of the pea aphid by analysing the expression of miRNAs across the switch of reproduction mode. Pea aphid microRNA sequences have been posted to miRBase: http://microrna.sanger.ac.uk/sequences/ Conclusions Our study has identified candidates as putative regulators involved in reproductive polyphenism in aphids and opens new avenues for further functional analyses.

  1. Rapid cloning and bioinformatic analysis of spinach Y chromosome-specific EST sequences

    Indian Academy of Sciences (India)

    Chuan-Liang Deng; Wei-Li Zhang; Ying Cao; Shao-Jing Wang; Shu-Fen Li; Wu-Jun Gao; Long-Dou Lu

    2015-12-01

    The genome of a single spinach chromosome complement is about 1000 Mbp, and spinach is a model material for studying the molecular mechanisms of plant sex differentiation. Cytological study showed that the biggest spinach chromosome (chromosome 1) is the spinach sex chromosome. It had three alleles of sex-related , m and . Many researchers have tried to clone the sex-determining genes and to investigate the molecular mechanism of spinach sex differentiation. However, there are no reports of successful cloning of these genes. A new technology combining chromosome microdissection with hybridization-specific amplification (HSA) was adopted. The spinach Y chromosome degenerate oligonucleotide primed-PCR (DOP-PCR) products were hybridized with cDNA of the male spinach flowers in florescence. The female spinach genome was used as a blocker, and a cDNA library specifically expressed from the Y chromosome was constructed. Moreover, expressed sequence tag (EST) sequences in the cDNA library were cloned, sequenced and analysed bioinformatically. In total, 63 valid EST sequences were obtained in this study, with fragment sizes between 53 and 486 bp. BLASTn homologous alignment indicated that 12 EST sequences had homologous nucleic acid sequences; the rest were new sequences. BLASTx homologous alignment indicated that 16 EST sequences had homologous protein-encoding nucleic acid sequences. The spinach Y chromosome-specific EST sequences lay the foundation for cloning the functional genes specifically expressed from the spinach Y chromosome. Meanwhile, the technology system established in this research provides a reference for rapid cloning of sex chromosome-specific EST sequences in other organisms.

  2. Hepatocellular carcinoma associated microRNA expression signature: integrated bioinformatics analysis, experimental validation and clinical significance.

    Science.gov (United States)

    Shi, Ke-Qing; Lin, Zhuo; Chen, Xiang-Jian; Song, Mei; Wang, Yu-Qun; Cai, Yi-Jing; Yang, Nai-Bing; Zheng, Ming-Hua; Dong, Jin-Zhong; Zhang, Lei; Chen, Yong-Ping

    2015-09-22

    microRNA (miRNA) expression profiles varied greatly among current studies due to different technological platforms and small sample size. Systematic and integrative analysis of published datasets that compared the miRNA expression profiles between hepatocellular carcinoma (HCC) tissue and paired adjacent noncancerous liver tissue was performed to determine candidate HCC associated miRNAs. Moreover, we further validated the confirmed miRNAs in a clinical setting using qRT-PCR and The Cancer Genome Atlas (TCGA) dataset. A miRNA integrated-signature of 5 upregulated and 8 downregulated miRNAs was identified from 26 published datasets in HCC using the robust rank aggregation method. qRT-PCR demonstrated that miR-93-5p, miR-224-5p, miR-221-3p and miR-21-5p were increased, whereas the expression of miR-214-3p, miR-199a-3p, miR-195-5p, miR-150-5p and miR-145-5p was decreased in the HCC tissues, which was also validated on the TCGA dataset. A miRNA based score using a LASSO regression model provided a high accuracy for identifying HCC tissue (AUC = 0.982): HCC risk score = 0.180E_miR-221 + 0.0262E_miR-21 - 0.007E_miR-223 - 0.185E_miR-130a, where E_miR-n = log2 (expression of microRNA n). Furthermore, expression of 5 miRNAs (miR-222, miR-221, miR-21, miR-214 and miR-130a) correlated with pathological tumor grade. Cox regression analysis showed that miR-21 was related to 3-year survival (hazard ratio [HR]: 1.509, 95%CI: 1.079-2.112, P = 0.016) and 5-year survival (HR: 1.416, 95%CI: 1.057-1.897, P = 0.020). However, none of the deregulated miRNAs was related to microscopic vascular invasion. This study provides a basis for further clinical application of miRNAs in HCC. PMID:26231037
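
    The risk score quoted in the abstract is a linear combination of log2 expression values, which can be sketched as follows. The coefficients are copied from the abstract; the function name `hcc_risk_score` and the input values are hypothetical, and the abstract gives neither the expression units or normalisation nor a decision cut-off, so none is assumed here.

```python
import math

# Coefficients as quoted in the abstract for the LASSO-based score.
COEFFICIENTS = {
    "miR-221": 0.180,
    "miR-21": 0.0262,
    "miR-223": -0.007,
    "miR-130a": -0.185,
}


def hcc_risk_score(expression):
    """Return sum(coefficient * log2(expression)) over the four miRNAs.

    `expression` maps a miRNA name to a positive expression value; how
    that value is measured and normalised is not stated in the abstract.
    """
    return sum(
        coefficient * math.log2(expression[name])
        for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical expression values, chosen so the log2 terms are 3, 2, 1 and 0.
example = {"miR-221": 8.0, "miR-21": 4.0, "miR-223": 2.0, "miR-130a": 1.0}
score = hcc_risk_score(example)  # 0.180*3 + 0.0262*2 - 0.007*1 - 0.185*0
```

    The sketch only reproduces the arithmetic of the published formula; it says nothing about how a given score should be interpreted clinically.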

  3. An Introduction to Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    SHENG Qi-zheng; De Moor Bart

    2004-01-01

    As a newborn interdisciplinary field, bioinformatics is receiving increasing attention from biologists, computer scientists, statisticians, mathematicians and engineers. This paper briefly introduces the birth, importance, and extensive applications of bioinformatics in the different fields of biological research. A major challenge in bioinformatics - the unraveling of gene regulation - is discussed in detail.

  4. Bioinformatic analysis of cis-regulatory interactions between progesterone and estrogen receptors in breast cancer

    Directory of Open Access Journals (Sweden)

    Matloob Khushi

    2014-11-01

    Full Text Available Chromatin factors interact with each other in a cell and sequence-specific manner in order to regulate transcription, and a wealth of publicly available datasets exists describing the genomic locations of these interactions. Our recently published BiSA (Binding Sites Analyser) database contains transcription factor binding locations and epigenetic modifications collected from published studies and provides tools to analyse stored and imported data. Using BiSA we investigated the overlapping cis-regulatory role of estrogen receptor alpha (ERα) and progesterone receptor (PR) in the T-47D breast cancer cell line. We found that ERα binding sites overlap with a subset of PR binding sites. To investigate further, we re-analysed raw data to remove any biases introduced by the use of distinct tools in the original publications. We identified 22,152 PR and 18,560 ERα binding sites (<5% false discovery rate) with 4,358 overlapping regions among the two datasets. BiSA statistical analysis revealed a non-significant overall overlap correlation between the two factors, suggesting that ERα and PR are not partner factors and do not require each other for binding to occur. However, Monte Carlo simulation by Binary Interval Search (BITS), and Relevant Distance, Absolute Distance, Jaccard and Projection tests by Genometricorr, revealed a statistically significant spatial correlation of binding regions on chromosomes between the two factors. Motif analysis revealed that the shared binding regions were enriched with binding motifs for ERα, PR and a number of other transcription and pioneer factors. Some of these factors are known to co-locate with ERα and PR binding. Therefore the spatially close proximity of ERα binding sites to PR binding sites suggests that ERα and PR, in general, function independently at the molecular level, but that their activities converge on a specific subset of transcriptional targets.
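
    The two overlap quantities mentioned above (the number of overlapping regions and a Jaccard measure) can be illustrated with a small sketch. This is not the BiSA or Genometricorr implementation: the function names, the half-open (start, end) convention, the restriction to a single chromosome and the toy coordinates are all assumptions made for this example.

```python
def merge(intervals):
    """Merge overlapping or touching half-open intervals; return them sorted."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersection_bp(a, b):
    """Base pairs shared by two merged, sorted interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        low = max(a[i][0], b[j][0])
        high = min(a[i][1], b[j][1])
        if low < high:
            total += high - low
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(a, b):
    """Intersection length divided by union length, in base pairs."""
    a, b = merge(a), merge(b)
    shared = intersection_bp(a, b)
    union = sum(e - s for s, e in a) + sum(e - s for s, e in b) - shared
    return shared / union if union else 0.0


def count_overlapping(a, b):
    """Number of intervals in `a` that overlap at least one interval in `b`."""
    return sum(any(s < b_end and b_start < e for b_start, b_end in b) for s, e in a)


# Hypothetical binding sites on one chromosome.
er_sites = [(100, 200), (300, 400), (1000, 1100)]
pr_sites = [(150, 250), (390, 500), (2000, 2100)]

n_overlapping = count_overlapping(er_sites, pr_sites)
jaccard_index = jaccard(er_sites, pr_sites)
```

    Assessing whether such an overlap is statistically significant, as the abstract does with Monte Carlo simulation, would additionally require a null model (for example, randomly repositioned sites), which is outside the scope of this sketch.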

  5. Global secretome analysis identifies novel mediators of bone metastasis

    Institute of Scientific and Technical Information of China (English)

    Mario Andres Blanco; Gary LeRoy; Zia Khan; Maša Alečković; Barry M Zee; Benjamin A Garcia; Yibin Kang

    2012-01-01

    Bone is one of the most common sites of distant metastasis of solid tumors. Secreted proteins are known to influence pathological interactions between metastatic cancer cells and the bone stroma. To comprehensively profile secreted proteins associated with bone metastasis, we used quantitative and non-quantitative mass spectrometry to globally analyze the secretomes of nine cell lines of varying bone metastatic ability from multiple species and cancer types. By comparing the secretomes of parental cells and their bone metastatic derivatives, we identified the secreted proteins that were uniquely associated with bone metastasis in these cell lines. We then incorporated bioinformatic analyses of large clinical metastasis datasets to obtain a list of candidate novel bone metastasis proteins of several functional classes that were strongly associated with both clinical and experimental bone metastasis. Functional validation of selected proteins indicated that in vivo bone metastasis can be promoted by high expression of (1) the salivary cystatins CST1, CST2, and CST4; (2) the plasminogen activators PLAT and PLAU; or (3) the collagen functionality proteins PLOD2 and COL6A1. Overall, our study has uncovered several new secreted mediators of bone metastasis and therefore demonstrated that secretome analysis is a powerful method for identification of novel biomarkers and candidate therapeutic targets.

  6. Global secretome analysis identifies novel mediators of bone metastasis.

    Science.gov (United States)

    Blanco, Mario Andres; LeRoy, Gary; Khan, Zia; Alečković, Maša; Zee, Barry M; Garcia, Benjamin A; Kang, Yibin

    2012-09-01

    Bone is one of the most common sites of distant metastasis of solid tumors. Secreted proteins are known to influence pathological interactions between metastatic cancer cells and the bone stroma. To comprehensively profile secreted proteins associated with bone metastasis, we used quantitative and non-quantitative mass spectrometry to globally analyze the secretomes of nine cell lines of varying bone metastatic ability from multiple species and cancer types. By comparing the secretomes of parental cells and their bone metastatic derivatives, we identified the secreted proteins that were uniquely associated with bone metastasis in these cell lines. We then incorporated bioinformatic analyses of large clinical metastasis datasets to obtain a list of candidate novel bone metastasis proteins of several functional classes that were strongly associated with both clinical and experimental bone metastasis. Functional validation of selected proteins indicated that in vivo bone metastasis can be promoted by high expression of (1) the salivary cystatins CST1, CST2, and CST4; (2) the plasminogen activators PLAT and PLAU; or (3) the collagen functionality proteins PLOD2 and COL6A1. Overall, our study has uncovered several new secreted mediators of bone metastasis and therefore demonstrated that secretome analysis is a powerful method for identification of novel biomarkers and candidate therapeutic targets. PMID:22688892

  7. E2F, HSF2, and miR-26 in thyroid carcinoma: bioinformatic analysis of RNA-sequencing data.

    Science.gov (United States)

    Lu, J C; Zhang, Y P

    2016-01-01

    In this study, we examined the molecular mechanism of thyroid carcinoma (THCA) using bioinformatics. RNA-sequencing data of THCA (N = 498) and normal thyroid tissue (N = 59) were downloaded from The Cancer Genome Atlas. Next, gene expression levels were calculated using the TCC package and differentially expressed genes (DEGs) were identified using the edgeR package. A co-expression network was constructed using the EBcoexpress package and visualized by Cytoscape, and functional and pathway enrichment of DEGs in the co-expression network was analyzed with DAVID and KOBAS 2.0. Moreover, modules in the co-expression network were identified and annotated using MCODE and BiNGO plugins. Small-molecule drugs were analyzed using the cMAP database, and miRNAs and transcription factors regulating DEGs were identified by WebGestalt. A total of 254 up-regulated and 59 down-regulated DEGs were identified between THCA samples and controls. DEGs enriched in biological process terms were related to cell adhesion, death, and growth and negatively correlated with various small-molecule drugs. The co-expression network of the DEGs consisted of hub genes (ITGA3, TIMP1, KRT19, and SERPINA1) and one module (JUN, FOSB, and EGR1). Furthermore, 5 miRNAs and 5 transcription factors were identified, including E2F, HSF2, and miR-26. miR-26 may participate in THCA by targeting CITED1 and PLA2R1; E2F may participate in THCA by regulating ITGA3, TIMP1, KRT19, EGR1, and JUN; HSF2 may be involved in THCA development by regulating SERPINA1 and FOSB; and small-molecule drugs may have anti-THCA effects. Our results provide novel directions for mechanistic studies and drug design of THCA. PMID:26985959

  8. The Alcohol Dehydrogenase Gene Family in Melon (Cucumis melo L.): Bioinformatic Analysis and Expression Patterns

    Directory of Open Access Journals (Sweden)

    Yazhong Jin

    2016-05-01

    Full Text Available Alcohol dehydrogenases (ADH), encoded by a multigene family in plants, play a critical role in plant growth, development, adaptation, fruit ripening and aroma production. Thirteen ADH genes were identified in the melon genome, including 12 ADHs and one formaldehyde dehydrogenase (FDH), designated CmADH1-12 and CmFDH1, of which CmADH1 and CmADH2 have been isolated in Cantaloupe. The ADH genes shared a low identity with each other at the protein level and had different intron-exon structures at the nucleotide level. No typical signal peptides were found in any of the CmADHs, and CmADH proteins might locate in the cytoplasm. The phylogenetic tree revealed that the 13 ADH genes were divided into 3 groups, namely the long-, medium- and short-chain ADH subfamilies, and CmADH1,3-11, which belong to the medium-chain ADH subfamily, fell into 6 medium-chain ADH subgroups. CmADH12 may belong to the long-chain ADH subfamily, while CmFDH1 may be a Class III ADH and serve as an ancestral ADH in melon. Expression profiling revealed that CmADH1, CmADH2, CmADH10 and CmFDH1 were moderately or strongly expressed in different vegetative tissues and fruit at medium and late developmental stages, while CmADH8 and CmADH12 were highly expressed in fruit after 20 days. CmADH3 showed preferential expression in young tissues. CmADH4 only had slight expression in root. Promoter analysis revealed several motifs of CmADH genes involved in gene expression modulated by various hormones, and the response patterns of CmADH genes to ABA, IAA and ethylene were different. These CmADHs were divided into ethylene-sensitive and -insensitive groups, and the functions of CmADHs were discussed.

  9. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...

  10. Analysis of ultra-deep pyrosequencing and cloning based sequencing of the basic core promoter/precore/core region of hepatitis B virus using newly developed bioinformatics tools.

    Directory of Open Access Journals (Sweden)

    Mukhlid Yousif

    Full Text Available AIMS: The aims of this study were to develop bioinformatics tools to explore ultra-deep pyrosequencing (UDPS) data, to test these tools, and to use them to determine the optimum error threshold, and to compare results from UDPS and cloning based sequencing (CBS). METHODS: Four serum samples, infected with either genotype D or E, from HBeAg-positive and HBeAg-negative patients were randomly selected. UDPS and CBS were used to sequence the basic core promoter/precore region of HBV. Two online bioinformatics tools, the "Deep Threshold Tool" and the "Rosetta Tool" (http://hvdr.bioinf.wits.ac.za/tools/), were built to test and analyze the generated data. RESULTS: A total of 10952 reads were generated by UDPS on the 454 GS Junior platform. In the four samples, substitutions, detected at 0.5% threshold or above, were identified at 39 unique positions, 25 of which were non-synonymous mutations. Sample #2 (HBeAg-negative, genotype D) had substitutions in 26 positions, followed by sample #1 (HBeAg-negative, genotype E) in 12 positions, sample #3 (HBeAg-positive, genotype D) in 7 positions and sample #4 (HBeAg-positive, genotype E) in only four positions. The ratio of nucleotide substitutions between isolates from HBeAg-negative and HBeAg-positive patients was 3.5∶1. Compared to genotype E isolates, genotype D isolates showed greater variation in the X, basic core promoter/precore and core regions. Only 18 of the 39 positions identified by UDPS were detected by CBS, which detected 14 of the 25 non-synonymous mutations detected by UDPS. CONCLUSION: UDPS data should be approached with caution. Appropriate curation of read data is required prior to analysis, in order to clean the data and eliminate artefacts. CBS detected fewer than 50% of the substitutions detected by UDPS. Furthermore it is important that the appropriate consensus (reference) sequence is used in order to identify variants correctly.

  11. Analysis of Ultra-Deep Pyrosequencing and Cloning Based Sequencing of the Basic Core Promoter/Precore/Core Region of Hepatitis B Virus Using Newly Developed Bioinformatics Tools

    Science.gov (United States)

    Yousif, Mukhlid; Bell, Trevor G.; Mudawi, Hatim; Glebe, Dieter; Kramvis, Anna

    2014-01-01

    Aims The aims of this study were to develop bioinformatics tools to explore ultra-deep pyrosequencing (UDPS) data, to test these tools, and to use them to determine the optimum error threshold, and to compare results from UDPS and cloning based sequencing (CBS). Methods Four serum samples, infected with either genotype D or E, from HBeAg-positive and HBeAg-negative patients were randomly selected. UDPS and CBS were used to sequence the basic core promoter/precore region of HBV. Two online bioinformatics tools, the “Deep Threshold Tool” and the “Rosetta Tool” (http://hvdr.bioinf.wits.ac.za/tools/), were built to test and analyze the generated data. Results A total of 10952 reads were generated by UDPS on the 454 GS Junior platform. In the four samples, substitutions, detected at 0.5% threshold or above, were identified at 39 unique positions, 25 of which were non-synonymous mutations. Sample #2 (HBeAg-negative, genotype D) had substitutions in 26 positions, followed by sample #1 (HBeAg-negative, genotype E) in 12 positions, sample #3 (HBeAg-positive, genotype D) in 7 positions and sample #4 (HBeAg-positive, genotype E) in only four positions. The ratio of nucleotide substitutions between isolates from HBeAg-negative and HBeAg-positive patients was 3.5∶1. Compared to genotype E isolates, genotype D isolates showed greater variation in the X, basic core promoter/precore and core regions. Only 18 of the 39 positions identified by UDPS were detected by CBS, which detected 14 of the 25 non-synonymous mutations detected by UDPS. Conclusion UDPS data should be approached with caution. Appropriate curation of read data is required prior to analysis, in order to clean the data and eliminate artefacts. CBS detected fewer than 50% of the substitutions detected by UDPS. Furthermore it is important that the appropriate consensus (reference) sequence is used in order to identify variants correctly. PMID:24740330
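
    The idea of reporting substitutions only at or above a frequency threshold, as described in the abstract, can be sketched as follows. This is not the "Deep Threshold Tool" itself: the function name `call_substitutions`, the per-position count layout and the toy read counts and positions are assumptions made for illustration, with only the 0.5% default taken from the abstract.

```python
def call_substitutions(counts, reference, threshold=0.005):
    """Report non-reference bases whose read frequency reaches the threshold.

    counts:    {position: {base: number_of_reads}}
    reference: {position: reference_base}
    Returns a list of (position, base, frequency) tuples sorted by position.
    """
    calls = []
    for position, base_counts in sorted(counts.items()):
        depth = sum(base_counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        for base, reads in sorted(base_counts.items()):
            frequency = reads / depth
            if base != reference[position] and frequency >= threshold:
                calls.append((position, base, frequency))
    return calls


# Hypothetical per-position read counts and reference bases.
counts = {
    1762: {"A": 990, "T": 10},   # 1.0% T: at or above the 0.5% threshold
    1764: {"G": 998, "A": 2},    # 0.2% A: below the threshold
    1896: {"G": 600, "A": 400},  # 40% A: well above the threshold
}
reference = {1762: "A", 1764: "G", 1896: "G"}

calls = call_substitutions(counts, reference)
called_sites = [(position, base) for position, base, _ in calls]
```

    As the abstract's conclusion stresses, the outcome depends on which reference sequence is supplied, and read data would need to be curated before such a threshold is applied.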

  12. Cancer bioinformatics: detection of chromatin states, SNP-containing motifs, and functional enrichment modules

    Institute of Scientific and Technical Information of China (English)

    Xiaobo Zhou

    2013-01-01

    In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  13. Cancer bioinformatics: detection of chromatin states, SNP-containing motifs, and functional enrichment modules

    Directory of Open Access Journals (Sweden)

    Xiaobo Zhou

    2013-04-01

    Full Text Available In this editorial preface, I briefly review cancer bioinformatics and introduce the four articles in this special issue highlighting important applications of the field: detection of chromatin states; detection of SNP-containing motifs and association with transcription factor-binding sites; improvements in functional enrichment modules; and gene association studies on aging and cancer. We expect this issue to provide bioinformatics scientists, cancer biologists, and clinical doctors with a better understanding of how cancer bioinformatics can be used to identify candidate biomarkers and targets and to conduct functional analysis.

  14. Genomic and Bioinformatics Analysis of HAdV-4, a Human Adenovirus Causing Acute Respiratory Disease: Implications for Gene Therapy and Vaccine Vector Development

    OpenAIRE

    Purkayastha, Anjan; Ditty, Susan E.; Su, Jing; McGraw, John; Hadfield, Ted L.; Tibbetts, Clark; Seto, Donald

    2005-01-01

    Human adenovirus serotype 4 (HAdV-4) is a reemerging viral pathogenic agent implicated in epidemic outbreaks of acute respiratory disease (ARD). This report presents a genomic and bioinformatics analysis of the prototype 35,990-nucleotide genome (GenBank accession no. AY594253). Intriguingly, the genome analysis suggests a closer phylogenetic relationship with the chimpanzee adenoviruses (simian adenoviruses) rather than with other human adenoviruses, suggesting a recent origin of HAdV-4, and...

  15. Identifiability analysis in conceptual sewer modelling.

    Science.gov (United States)

    Kleidorfer, M; Leonhardt, G; Rauch, W

    2012-01-01

    For a sufficient calibration of an environmental model not only parameter sensitivity but also parameter identifiability is an important issue. In identifiability analysis it is possible to analyse whether changes in one parameter can be compensated by appropriate changes of the other ones within a given uncertainty range. Parameter identifiability is conditional to the information content of the calibration data and consequently conditional to a certain measurement layout (i.e. types of measurements, number and location of measurement sites, temporal resolution of measurements etc.). Hence the influence of number and location of measurement sites on the number of identifiable parameters can be investigated. In the present study identifiability analysis is applied to a conceptual model of a combined sewer system aiming to predict the combined sewer overflow emissions. Different measurement layouts are tested and it can be shown that only 13 of the most sensitive catchment areas (represented by the model parameter 'effective impervious area') can be identified when overflow measurements of the 20 highest overflows and the runoff to the waste water treatment plant are used for calibration. The main advantage of this method is very low computational costs as the number of required model runs equals the total number of model parameters. Hence, this method is a valuable tool when analysing large models with a long runtime and many parameters. PMID:22864432

  16. Bioinformatics and genomic medicine.

    Science.gov (United States)

    Kim, Ju Han

    2002-01-01

    Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational science. Clinical informatics has long developed methodologies to improve biomedical research and clinical care by integrating experimental and clinical information systems. The informatics revolution in both bioinformatics and clinical informatics will eventually change the current practice of medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, in much the same way that biochemistry did a generation ago. This paper describes how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics and proteomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine-learning algorithms are discussed. Use of integrative biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and the integrated management of biomolecular databases, are also discussed. PMID:12544491

  17. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics

    NARCIS (Netherlands)

    Veneman, W.J.; De Sonneville, J.; Van der Kolk, K.J.; Ordas, A.; Al-Ars, Z.; Meijer, A.H.; Spaink, M.P.

    2014-01-01

    We present a RNA deep sequencing (RNAseq) analysis of a comparison of the transcriptome responses to infection of zebrafish larvae with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our developed GeneTiles software can improve RNAseq analysis approaches by more confident

  18. A bioinformatics analysis of Lamin-A regulatory network: a perspective on epigenetic involvement in Hutchinson-Gilford progeria syndrome.

    Science.gov (United States)

    Arancio, Walter

    2012-04-01

    Hutchinson-Gilford progeria syndrome (HGPS) is a rare human genetic disease that leads to premature aging. HGPS is caused by a mutation in the Lamin-A (LMNA) gene that leads, in affected young individuals, to the accumulation of the progerin protein, usually present only in aging differentiated cells. Bioinformatics analyses of the network of interactions of the LMNA gene and transcripts are presented. The LMNA gene network has been analyzed using the BioGRID database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA (http://genemania.org/). The network of interaction of LMNA transcripts has been further analyzed following the competing endogenous RNA (ceRNA) hypothesis (RNA cross-talk via microRNAs [miRNAs]) and using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest particular relevance of epigenetic modifiers (via acetylase complexes and specifically HTATIP histone acetylase) and adenosine triphosphate (ATP)-dependent chromatin remodelers (via pBAF, BAF, and SWI/SNF complexes). PMID:22533413

  19. Ready to use bioinformatics analysis as a tool to predict immobilisation strategies for protein direct electron transfer (DET).

    Science.gov (United States)

    Cazelles, R; Lalaoui, N; Hartmann, T; Leimkühler, S; Wollenberger, U; Antonietti, M; Cosnier, S

    2016-11-15

    Direct electron transfer (DET) to proteins is of considerable interest for the development of biosensors and bioelectrocatalysts. While protein structure is mainly used as a method of attaching the protein to the electrode surface, we employed bioinformatics analysis to predict the suitable orientation of the enzymes to promote DET. Structure similarity and secondary structure prediction were combined underlying localized amino-acids able to direct one of the enzyme's electron relays toward the electrode surface by creating a suitable bioelectrocatalytic nanostructure. The electro-polymerization of pyrene pyrrole onto a fluorine-doped tin oxide (FTO) electrode allowed the targeted orientation of the formate dehydrogenase enzyme from Rhodobacter capsulatus (RcFDH) by means of hydrophobic interactions. Its electron relays were directed to the FTO surface, thus promoting DET. The reduction of nicotinamide adenine dinucleotide (NAD(+)) generating a maximum current density of 1 μA cm(-2) with 10 mM NAD(+) leads to a turnover number of 0.09 electron/s/mol RcFDH. This work represents a practical approach to evaluate electrode surface modification strategies in order to create valuable bioelectrocatalysts. PMID:27156017

  20. Visualising "Junk" DNA through Bioinformatics

    Science.gov (United States)

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  1. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  2. Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis

    OpenAIRE

    Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M. Margaret; Butte, Atul J; Manley, Geoffrey T.

    2010-01-01

    Introduction Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enabl...
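    The hierarchical cluster analysis named in this abstract can be illustrated with a minimal sketch. It is an assumption-laden toy, not the authors' method: the linkage rule (average linkage), the distance (Euclidean), the two variables, and the four data points are all hypothetical, and real physiological variables would normally be standardized before clustering.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance between all cross-cluster pairs of points."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical (heart rate, lactate) readings; not data from the study.
points = [(80, 1.0), (82, 1.1), (120, 4.0), (118, 4.2)]
print(agglomerate(points, 2))  # [[0, 1], [2, 3]]
```

    Recording the order of merges instead of stopping at a fixed cluster count would give the dendrogram that such analyses usually visualize.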

  3. Deep Learning in Bioinformatics

    OpenAIRE

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-01-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current res...

  4. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis

    Science.gov (United States)

    Alkhalili, Rawana N.; Bernfur, Katja; Dishisha, Tarek; Mamo, Gashaw; Schelin, Jenny; Canbäck, Björn; Emanuelsson, Cecilia; Hatti-Kaul, Rajni

    2016-01-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in 15–20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of antimicrobial enzymes, amidase and dd-carboxypeptidase. PMID:27548162

  5. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis.

    Science.gov (United States)

    Alkhalili, Rawana N; Bernfur, Katja; Dishisha, Tarek; Mamo, Gashaw; Schelin, Jenny; Canbäck, Björn; Emanuelsson, Cecilia; Hatti-Kaul, Rajni

    2016-01-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in 15-20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of antimicrobial enzymes, amidase, and dd-carboxypeptidase. PMID:27548162

  6. Identification and bioinformatics analysis of microRNAs from the sporophyte and gametophyte of Pyropia haitanensis

    Science.gov (United States)

    Huang, Aiyou; Wang, Guangce

    2016-05-01

    Pyropia haitanensis (T. J. Chang et B. F. Zheng) N. Kikuchi et M. Miyata (Porphyra haitanensis) is an economically important genus that is cultured widely in China. P. haitanensis is cultured on a larger scale than Pyropia yezoensis, making up an important part of the total production of cultivated Pyropia in China. However, the majority of molecular mechanisms underlying the physiological processes of P. haitanensis remain unknown. P. haitanensis could utilize inorganic carbon and the sporophytes of P. haitanensis might possess a PCK-type C4-like carbon-fixation pathway. To identify microRNAs and their probable roles in sporophyte and gametophyte development, we constructed and sequenced small RNA libraries from sporophytes and gametophytes of P. haitanensis. Five microRNAs were identified that shared no sequence homology with known microRNAs. Our results indicated that P. haitanensis might possess a complex sRNA processing system in which the novel microRNAs act as important regulators of the development of different generations of P. haitanensis.

  7. Identification and bioinformatics analysis of microRNAs from the sporophyte and gametophyte of Pyropia haitanensis

    Science.gov (United States)

    Huang, Aiyou; Wang, Guangce

    2015-09-01

    Pyropia haitanensis (T. J. Chang et B. F. Zheng) N. Kikuchi et M. Miyata (Porphyra haitanensis) is an economically important genus that is cultured widely in China. P. haitanensis is cultured on a larger scale than Pyropia yezoensis, making up an important part of the total production of cultivated Pyropia in China. However, the majority of molecular mechanisms underlying the physiological processes of P. haitanensis remain unknown. P. haitanensis could utilize inorganic carbon and the sporophytes of P. haitanensis might possess a PCK-type C4-like carbon-fixation pathway. To identify microRNAs and their probable roles in sporophyte and gametophyte development, we constructed and sequenced small RNA libraries from sporophytes and gametophytes of P. haitanensis. Five microRNAs were identified that shared no sequence homology with known microRNAs. Our results indicated that P. haitanensis might possess a complex sRNA processing system in which the novel microRNAs act as important regulators of the development of different generations of P. haitanensis.

  8. Cloning, identification, and bioinformatics analysis of a putative aquaporin TsAQP from Trichinella spiralis.

    Science.gov (United States)

    Cui, J M; Zhang, N Z; Li, W H; Yan, H B; Fu, B Q

    2015-01-01

    Vaccination as a preventative strategy against Trichinella spiralis infection is an ongoing effort, although no ideal vaccine candidates have been identified until now. Identification of more effective antigens that have a role in essential life stages of the parasite and that may be effective vaccine candidates is therefore of importance. In the present study, we identified a novel aquaporin gene (TsAQP) from T. spiralis, and the potential antigenicity of TsAQP was evaluated by epitope prediction. A total of 11 post-translational modification sites were predicted in the protein and fell into 4 categories: N-glycosylation; casein kinase II phosphorylation; protein kinase C phosphorylation; and N-myristoylation sites. TsAQP is a membrane intrinsic protein with high hydrophobicity; the main hydrophobic domains comprised up to 38.5% of the protein and were distributed at amino acid positions 21-43, 54-71, 83-91, 107-121, 163-174, 187-200, and 242-261. The protein consisted mainly of helices (39.58%) and loops (50%). The advanced structure of TsAQP was predicted using homology modeling, which showed that the protein was formed from 6 membrane-spanning domains connected by 5 loops. Based on these analyses, 6 potential B-cell epitopes and 4 potential T-cell epitopes were further predicted. These results suggest that TsAQP could be a promising antigen candidate for vaccination against T. spiralis. PMID:26505421
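    The hydrophobic-domain figures in this abstract can be cross-checked with a short calculation. This is a sketch under stated assumptions: the positions are read as 1-based and inclusive, and the protein length it derives is only what the reported 38.5% would imply, not a value given in the abstract.

```python
# Hydrophobic segments as listed in the abstract (assumed 1-based, inclusive).
SEGMENTS = [(21, 43), (54, 71), (83, 91), (107, 121), (163, 174), (187, 200), (242, 261)]


def covered_residues(segments):
    """Count residues covered by inclusive, non-overlapping segments."""
    return sum(end - start + 1 for start, end in segments)


def implied_length(covered, fraction):
    """Protein length implied if `covered` residues make up `fraction` of it."""
    return round(covered / fraction)


total = covered_residues(SEGMENTS)
print(total)                         # 111
print(implied_length(total, 0.385))  # 288 (an inference, not a reported value)
```

    Since the abstract says the domains comprise "up to" 38.5% of the protein, the derived length is best read as a rough consistency check rather than a measurement.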

  9. Identification and analysis of miRNAs and their targets in ginger using bioinformatics approach.

    Science.gov (United States)

    Singh, Noopur; Srivastava, Swati; Sharma, Ashok

    2016-01-10

    MicroRNAs (miRNAs) are a large family of endogenous small RNAs derived from the non-protein coding genes. miRNA regulates the gene expression at the post-transcriptional level and plays an important role in plant development. Zingiber officinale is an important medicinal plant having numerous therapeutic properties. Its bioactive compound gingerol and essential oil possess important pharmacological and physiological activities. In this study, we used a homology search based computational approach for identifying miRNAs in Z. officinale. A total of 16 potential miRNA families (miR167, miR407, miR414, miR5015, miR5021, miR5644, miR5645, miR5656, miR5658, miR5664, miR827, miR838, miR847, miR854, miR862 and miR864) were predicted in ginger. Phylogenetic and conserved analyses were performed for predicted miRNAs. Thirteen miRNA families were found to regulate 300 target transcripts and play an important role in cell signaling, reproduction, metabolic process and stress. To understand the miRNA mediated gene regulatory control and to validate miRNA target predictions, a biological network was also constructed. Gene ontology and pathway analyses were also done. miR5015 was observed to regulate the biosynthesis of gingerol by inhibiting phenylalanine ammonia lyase (PAL), a precursor enzyme in the biosynthesis of gingerol. Our results revealed that most of the predicted miRNAs were involved in the regulation of rhizome development. miR5021, miR854 and miR838 were identified to regulate the rhizome development and the essential oil biosynthesis in ginger. PMID:26392033

  10. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    Science.gov (United States)

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety. PMID:26551289

  11. BIOINFORMATICS AND BIOSYNTHESIS ANALYSIS OF CELLULOSE SYNTHASE OPERON IN ZYMOMONAS MOBILIS ZM4

    OpenAIRE

    Sheik Abdul Kader Sheik Asraf, K. Narayanan Rajnish, and Paramasamy Gunasekaran

    2011-01-01

    Biosynthesis of cellulose has been reported in many species of bacteria. The genes encoding cellulose biosynthetic enzymes of Z. mobilis have not been studied so far. Preliminary sequence analysis of the Z. mobilis ZM4 genome revealed the presence of a cellulose synthase operon comprised of Open Reading Frames (ORFs) ZMO01083 (bcsA), ZMO1084 (bcsB) and ZMO1085 (bcsC). The first gene of the operon bcsA encodes the cellulose synthase catalytic subunit BcsA. The second gene of the operon bcsB en...

  12. The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines

    Directory of Open Access Journals (Sweden)

    Tatjana P Kristensen

    2014-03-01

    Full Text Available The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural

  13. Bioinformatics analysis of organizational and expressional characterizations of the IFNs, IRFs and CRFBs in grass carp Ctenopharyngodon idella.

    Science.gov (United States)

    Liao, Zhiwei; Wan, Quanyuan; Su, Jianguo

    2016-08-01

    Interferons (IFNs) play crucial roles in the immune response of defense against viral infection and bacterial invasion. In the present study, we systematically identified and characterized the IFNs, their regulatory factors (Interferon Regulatory Factors, IRFs) and receptors (Cytokine Receptor Family B, CRFBs) in grass carp (Ctenopharyngodon idella). Grass carp IFNs can be classified into type I IFN (IFN-I) and type II IFN (IFN-II) like other teleosts. IFN-I consists of two groups with two (group I) or four (group II) cysteines in the mature peptide and can be further divided into three subgroups (IFN-a, -c and -d), containing four members: IFN1, IFN2, IFN3, IFN4 in grass carp. IFN-II contains two members, IFNγ2 with the similarity to mammalian IFNγ and a cyprinid specific IFNγ1 (IFNγ-rel) molecule. mRNA expression analyses of IFNs discovered that IFN1 and IFN-II were sustainably expressed in many tissues, while other IFN members were transiently expressed in specific tissues and time points. In the immune response, IFN transcriptions are primarily regulated through multiple IRFs after grass carp reovirus (GCRV) challenge. The IRF family possesses thirteen members in grass carp, which can be further divided into four subfamilies (IRF-1, -3, -4 and -5 subfamily), each of which plays different roles in the innate and adaptive immunity via various signaling pathways to interact with IFNs (mainly IFN-I). IFNs have to bind receptors (CRFBs) to perform their functions. CRFBs as IFN receptors contain six members in grass carp. The structure and expression characterizations of IFNs, IRFs and CRFBs were analyzed using bioinformatics tools. These results might provide basic data for the further functional research of the IFN system, and help to deepen understanding of fish immune mechanisms against virus infection. PMID:27012995

  14. Molecular mechanisms associated with breast cancer based on integrated gene expression profiling by bioinformatics analysis.

    Science.gov (United States)

    Wu, Di; Han, Bing; Guo, Liang; Fan, Zhimin

    2016-07-01

    In this study, we aimed to gain more insights into the underlying molecular mechanisms responsible for breast cancer (BC) progression. Three gene expression profiles of human BC were integrated and used to screen the differentially expressed genes (DEGs) between healthy breast samples and BC samples. A protein-protein interaction (PPI) network of DEGs was constructed by mapping DEGs into the Search Tool for the Retrieval of Interacting Genes (STRING) database; then the subnetworks of the PPI network were constructed with the MCODE plug-in, and DEGs in Subnetwork 1 were analysed based on the Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway database (http://www.genome.jp/kegg/). In addition, a co-expression network of DEGs was established using Cytoscape. In total, 931 DEGs were selected, including 340 up-regulated genes and 591 down-regulated genes. KEGG pathway analysis for DEGs in Subnetwork 1 showed that the pathogenesis of BC was associated with cell cycle, oocyte meiosis, progesterone-mediated oocyte maturation and p53 signalling pathways. Meanwhile, the most significant-related DEGs were found by co-expression network analysis of DEGs. In conclusion, CCNG1 might be involved in the progression of BC via inhibiting cell proliferation, and ADAMTS1 might play a crucial role in BC development through the regulation of angiogenesis. PMID:26804550
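    The screening step in this abstract, which splits DEGs into up-regulated and down-regulated sets, can be sketched as follows. The abstract does not state its selection criteria, so the log2 fold-change cutoff of 1.0 is an assumption, and the gene names and values are hypothetical; real pipelines typically also require an adjusted p-value below some threshold.

```python
def classify_degs(log2_fold_changes, threshold=1.0):
    """Split genes into up- and down-regulated lists by |log2 fold change|.

    `threshold` is an assumed cutoff, not one reported in the abstract.
    """
    up = sorted(g for g, lfc in log2_fold_changes.items() if lfc >= threshold)
    down = sorted(g for g, lfc in log2_fold_changes.items() if lfc <= -threshold)
    return up, down


# Hypothetical log2 fold changes (tumour vs. healthy); not data from the study.
toy = {"GENE_A": 2.3, "GENE_B": -1.7, "GENE_C": 0.2, "GENE_D": -3.1}
up, down = classify_degs(toy)
print(up)    # ['GENE_A']
print(down)  # ['GENE_B', 'GENE_D']
```

    The reported totals are consistent with this kind of partition: 340 up-regulated plus 591 down-regulated genes gives the 931 DEGs stated.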

  15. Bioinformatic analysis reveals high diversity of bacterial genes for laccase-like enzymes.

    Directory of Open Access Journals (Sweden)

    Luka Ausec

    Full Text Available Fungal laccases have been used in various fields ranging from processes in wood and paper industries to environmental applications. Although a few bacterial laccases have been characterized in recent years, prokaryotes have largely been neglected as a source of novel enzymes, in part due to the lack of knowledge about the diversity and distribution of laccases within Bacteria. In this work genes for laccase-like enzymes were searched for in over 2,200 complete and draft bacterial genomes and four metagenomic datasets, using the custom profile Hidden Markov Models for two- and three-domain laccases. More than 1,200 putative genes for laccase-like enzymes were retrieved from chromosomes and plasmids of diverse bacteria. In 76% of the genes, signal peptides were predicted, indicating that these bacterial laccases may be exported from the cytoplasm, which contrasts with the current belief. Moreover, several examples of putatively horizontally transferred bacterial laccase genes were described. Many metagenomic sequences encoding fragments of laccase-like enzymes could not be phylogenetically assigned, indicating considerable novelty. Laccase-like genes were also found in anaerobic bacteria, autotrophs and alkaliphiles, thus opening new hypotheses regarding their ecological functions. Bacteria identified as carrying laccase genes represent potential sources for future biotechnological applications.

  16. Comparative proteomic and bioinformatic analysis of Theileria luwenshuni and Theileria uilenbergi.

    Science.gov (United States)

    Zhang, Xiao; Li, Youquan; Chen, Ze; Liu, Zhijie; Ren, Qiaoyun; Yang, Jifei; Zhu, Xinquan; Guan, Guiquan; Liu, Aihong; Luo, Jianxun; Yin, Hong

    2016-07-01

    Theileria is an obligatory intraerythrocytic protozoan parasite that causes economic losses to the cattle, sheep and goats industry. However, very little information is available on the genomes, transcriptomes, and proteomes of the ovine parasites, Theileria luwenshuni and Theileria uilenbergi. Differences in protein expression between these species were investigated to better understand their biology. Parasites were digested with trypsin, and the resulting peptides labeled with isobaric tags for relative and absolute quantification, followed by LC-MS/MS. More than 670 proteins, classified into categories primarily related to cellular process (29.78%), metabolic process (28.80%), localization (5.22%) and biological regulation (5.00%), were identified. Seventy-one proteins were differentially expressed; T. luwenshuni had 39 proteins more highly expressed than in T. uilenbergi, whereas T. uilenbergi had 32 that were more highly expressed. Several proteins related to parasite virulence and invasion (cysteine proteinase, histone deacetylase, pyruvate kinase, small nuclear ribonucleoprotein and orotate phosphoribosyltransferase) were differentially expressed. Real-time quantitative PCR validated protein expression changes at the transcript level. This is the first report on protein expression for the two most economically important Theileria species in China, and our findings may provide novel opportunities for ovine and caprine theileriosis control. PMID:27018062

  17. Bioinformatic Analysis Reveals High Diversity of Bacterial Genes for Laccase-Like Enzymes

    Science.gov (United States)

    Ausec, Luka; Zakrzewski, Martha; Goesmann, Alexander; Schlüter, Andreas; Mandic-Mulec, Ines

    2011-01-01

    Fungal laccases have been used in various fields ranging from processes in wood and paper industries to environmental applications. Although a few bacterial laccases have been characterized in recent years, prokaryotes have largely been neglected as a source of novel enzymes, in part due to the lack of knowledge about the diversity and distribution of laccases within Bacteria. In this work genes for laccase-like enzymes were searched for in over 2,200 complete and draft bacterial genomes and four metagenomic datasets, using the custom profile Hidden Markov Models for two- and three-domain laccases. More than 1,200 putative genes for laccase-like enzymes were retrieved from chromosomes and plasmids of diverse bacteria. In 76% of the genes, signal peptides were predicted, indicating that these bacterial laccases may be exported from the cytoplasm, which contrasts with the current belief. Moreover, several examples of putatively horizontally transferred bacterial laccase genes were described. Many metagenomic sequences encoding fragments of laccase-like enzymes could not be phylogenetically assigned, indicating considerable novelty. Laccase-like genes were also found in anaerobic bacteria, autotrophs and alkaliphiles, thus opening new hypotheses regarding their ecological functions. Bacteria identified as carrying laccase genes represent potential sources for future biotechnological applications. PMID:22022440

  18. Bioinformatic analysis of the neprilysin (M13) family of peptidases reveals complex evolutionary and functional relationships

    Directory of Open Access Journals (Sweden)

    Pinney John W

    2008-01-01

    Full Text Available Abstract Background The neprilysin (M13) family of endopeptidases are zinc-metalloenzymes, the majority of which are type II integral membrane proteins. The best characterised of this family is neprilysin, which has important roles in inactivating signalling peptides involved in modulating neuronal activity, blood pressure and the immune system. Other family members include the endothelin converting enzymes (ECE-1 and ECE-2), which are responsible for the final step in the synthesis of potent vasoconstrictor endothelins. The ECEs, as well as neprilysin, are considered valuable therapeutic targets for treating cardiovascular disease. Other members of the M13 family have not been functionally characterised, but are also likely to have biological roles regulating peptide signalling. The recent sequencing of animal genomes has greatly increased the number of M13 family members in protein databases, information which can be used to reveal evolutionary relationships and to gain insight into conserved biological roles. Results The phylogenetic analysis successfully resolved vertebrate M13 peptidases into seven classes, one of which appears to be specific to mammals, and insect genes into five functional classes and a series of expansions, which may include inactive peptidases. Nematode genes primarily resolved into groups containing no other taxa, bar the two nematode genes associated with Drosophila DmeNEP1 and DmeNEP4. This analysis reconstructed only one relationship between chordate and invertebrate clusters, that of the ECE sub-group and the DmeNEP3 related genes. Analysis of amino acid utilisation in the active site of M13 peptidases reveals a basis for their biochemical properties. A relatively invariant S1' subsite gives the majority of M13 peptidases their strong preference for hydrophobic residues in the P1' position. The greater variation in the S2' subsite may be instrumental in determining the specificity of M13 peptidases for their substrates

  19. Bioinformatics analysis of breast cancer bone metastasis related gene CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze the breast cancer bone metastasis related gene CXCR4. Methods: This research screened breast cancer bone metastasis related genes by high-flux gene chip. Results: It was found that the expressions of 396 genes were different, including 165 up-regulations and 231 down-regulations. The expression of chemokine receptor CXCR4 was obviously upregulated in the tissue with breast cancer bone metastasis. Compared with the tissue without bone metastasis, there was a significant difference, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a certain basis for the occurrence and diagnosis of breast cancer bone metastasis, target gene therapy and evaluation of prognosis.

  20. A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1.

    Science.gov (United States)

    Reisman, Steven; Hatzopoulos, Thomas; Läufer, Konstantin; Thiruvathukal, George K; Putonti, Catherine

    2016-01-01

    As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences, revealing that spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated despite increased globalization. The approach developed here can easily be customized for any species of interest. PMID:26819543

  1. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  2. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  3. Flux Analysis of the Trypanosoma brucei Glycolysis Based on a Multiobjective-Criteria Bioinformatic Approach

    Directory of Open Access Journals (Sweden)

    Amine Ghozlane

    2012-01-01

    Full Text Available Trypanosoma brucei is a protozoan parasite of major interest in discovering new genes for drug targets. This parasite alternates its life cycle between the mammal host(s) (bloodstream form) and the insect vector (procyclic form), with two divergent glucose metabolisms amenable to in vitro culture. While the metabolic network of the bloodstream forms has been well characterized, the flux distribution between the different branches of the glucose metabolic network in the procyclic form has not been addressed so far. We present a computational analysis (called Metaboflux) that exploits the metabolic topology of the procyclic form, and allows the incorporation of multipurpose experimental data to increase the biological relevance of the model. The alternatives resulting from the structural complexity of networks are formulated as an optimization problem solved by a metaheuristic where experimental data are modeled in a multiobjective function. Our results show that the current metabolic model is in agreement with experimental data and confirms the observed high metabolic flexibility of glucose metabolism. In addition, Metaboflux offers a rational explanation for the high flexibility in the ratio between final products from glucose metabolism, that is, flux redistribution through the malic enzyme steps.

  4. Bioinformatic analysis of microRNA networks following the activation of the constitutive androstane receptor (CAR) in mouse liver.

    Science.gov (United States)

    Hao, Ruixin; Su, Shengzhong; Wan, Yinan; Shen, Frank; Niu, Ben; Coslo, Denise M; Albert, Istvan; Han, Xing; Omiecinski, Curtis J

    2016-09-01

    The constitutive androstane receptor (CAR; NR1I3) is a member of the nuclear receptor superfamily that functions as a xenosensor, serving to regulate xenobiotic detoxification, lipid homeostasis and energy metabolism. CAR activation is also a key contributor to the development of chemical hepatocarcinogenesis in mice. The underlying pathways affected by CAR in these processes are complex and not fully elucidated. MicroRNAs (miRNAs) have emerged as critical modulators of gene expression and appear to impact many cellular pathways, including those involved in chemical detoxification and liver tumor development. In this study, we used deep sequencing approaches with an Illumina HiSeq platform to differentially profile microRNA expression patterns in livers from wild type C57BL/6J mice following CAR activation with the mouse CAR-specific ligand activator, 1,4-bis-[2-(3,5,-dichloropyridyloxy)] benzene (TCPOBOP). Bioinformatic analyses and pathway evaluations were performed leading to the identification of 51 miRNAs whose expression levels were significantly altered by TCPOBOP treatment, including mmu-miR-802-5p and miR-485-3p. Ingenuity Pathway Analysis of the differentially expressed microRNAs revealed altered effector pathways, including those involved in liver cell growth and proliferation. A functional network among CAR targeted genes and the affected microRNAs was constructed to illustrate how CAR modulation of microRNA expression may potentially mediate its biological role in mouse hepatocyte proliferation. This article is part of a Special Issue entitled: Xenobiotic nuclear receptors: New Tricks for An Old Dog, edited by Dr. Wen Xie. PMID:27080131

  5. Bioinformatic evaluation of L-arginine catabolic pathways in 24 cyanobacteria and transcriptional analysis of genes encoding enzymes of L-arginine catabolism in the cyanobacterium Synechocystis sp. PCC 6803

    Directory of Open Access Journals (Sweden)

    Pistorius Elfriede K

    2007-11-01

    Full Text Available Abstract Background So far very limited knowledge exists on L-arginine catabolism in cyanobacteria, although six major L-arginine-degrading pathways have been described for prokaryotes. Thus, we have performed a bioinformatic analysis of possible L-arginine-degrading pathways in cyanobacteria. Further, we chose Synechocystis sp. PCC 6803 for a more detailed bioinformatic analysis and for validation of the bioinformatic predictions on L-arginine catabolism with a transcript analysis. Results We have evaluated 24 cyanobacterial genomes of freshwater or marine strains for the presence of putative L-arginine-degrading enzymes. We identified an L-arginine decarboxylase pathway in all 24 strains. In addition, cyanobacteria have one or two further pathways representing either an arginase pathway or L-arginine deiminase pathway or an L-arginine oxidase/dehydrogenase pathway. An L-arginine amidinotransferase pathway as a major L-arginine-degrading pathway is not likely but cannot be entirely excluded. A rather unusual finding was that the cyanobacterial L-arginine deiminases are substantially larger than the enzymes in non-photosynthetic bacteria and that they are membrane-bound. A more detailed bioinformatic analysis of Synechocystis sp. PCC 6803 revealed that three different L-arginine-degrading pathways may in principle be functional in this cyanobacterium. These are (i) an L-arginine decarboxylase pathway, (ii) an L-arginine deiminase pathway, and (iii) an L-arginine oxidase/dehydrogenase pathway. A transcript analysis of cells grown either with nitrate or L-arginine as sole N-source and with an illumination of 50 μmol photons m⁻² s⁻¹ showed that the transcripts for the first enzyme(s) of all three pathways were present, but that the transcript levels for the L-arginine deiminase and the L-arginine oxidase/dehydrogenase were substantially higher than that of the three isoenzymes of L-arginine decarboxylase. Conclusion The evaluation of 24

  6. Meta-Analysis of Placental Transcriptome Data Identifies a Novel Molecular Pathway Related to Preeclampsia.

    Science.gov (United States)

    van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B

    2015-01-01

    Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia. PMID:26171964
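
    The pooling step named in this abstract, combining per-study effect sizes with an inverse-variance random-effects model, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which between-study variance estimator was used, so the DerSimonian-Laird estimator is assumed here, and the function name and the example numbers are hypothetical.

```python
import math


def random_effects_meta(effects, variances):
    """Pool per-study effect sizes with an inverse-variance random-effects model.

    Assumption: the between-study variance tau^2 is estimated with the
    DerSimonian-Laird method; the source abstract does not name an estimator.
    Returns (pooled_effect, standard_error, tau2).
    """
    k = len(effects)
    if k != len(variances) or k < 2:
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # DerSimonian-Laird estimate of tau^2, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical log fold changes and variances for one gene in three studies.
    pooled, se, tau2 = random_effects_meta([0.8, 1.1, 0.4], [0.04, 0.09, 0.05])
    print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f}")
```

    In a gene-expression meta-analysis such a function would be applied once per gene, with one effect size and variance per microarray experiment.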

  7. Meta-Analysis of Placental Transcriptome Data Identifies a Novel Molecular Pathway Related to Preeclampsia.

    Directory of Open Access Journals (Sweden)

    Miranda van Uitert

    Full Text Available Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia.

  8. Bioinformatics for Genome Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported essentially equal numbers of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately the inferred tree, they were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the dendrogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders, and as a 16th option, allowed CLUSTALW to generate its own guide tree. When all 15 possible rooted guide trees were supplied, they expected that at least one of them should be as good as CLUSTAL's own guide tree, but most of the time they differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which affect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became increasingly bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold as they feel that their other proposed approaches will ultimately be better.
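
    The counts quoted in this abstract (three topologies relating four taxa, and 15 rooted guide trees) follow from standard tree-counting formulas: n labeled taxa admit (2n-5)!! unrooted and (2n-3)!! rooted binary topologies. The sketch below checks the rooted count by brute-force enumeration. The taxon labels A to D and all function names are illustrative assumptions, and nothing here reproduces CLUSTALW itself.

```python
from itertools import combinations


def rooted_trees(taxa):
    """Enumerate all rooted binary tree topologies on the given labeled taxa.

    A leaf is its label; an internal node is a frozenset of its two children,
    so mirror-image trees are represented identically.
    """
    taxa = tuple(taxa)
    if len(taxa) == 1:
        return [taxa[0]]
    first, rest = taxa[0], taxa[1:]
    trees = []
    # Split the taxa at the root; keeping `first` on the left side ensures
    # that each unordered split is generated exactly once.
    for size in range(len(rest)):
        for left_rest in combinations(rest, size):
            left = (first,) + left_rest
            right = tuple(t for t in rest if t not in left_rest)
            for left_tree in rooted_trees(left):
                for right_tree in rooted_trees(right):
                    trees.append(frozenset([left_tree, right_tree]))
    return trees


def odd_double_factorial(m):
    """Return m!! for odd m (product of odd integers up to m); 1 if m < 1."""
    result = 1
    for k in range(1, m + 1, 2):
        result *= k
    return result


def count_rooted(n):
    """Number of rooted binary topologies on n labeled taxa: (2n-3)!!."""
    return odd_double_factorial(2 * n - 3)


def count_unrooted(n):
    """Number of unrooted binary topologies on n labeled taxa: (2n-5)!!."""
    return odd_double_factorial(2 * n - 5)


if __name__ == "__main__":
    trees = rooted_trees("ABCD")
    print(len(trees), count_rooted(4), count_unrooted(4))
```

    Each enumerated tree corresponds to one candidate guide tree, which is why testing every alignment order for four taxa is feasible while the same exhaustive approach quickly becomes impractical for larger sets.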

  9. Effect of phosphatidylcholine on the level expression of plc genes of Aspergillus fumigatus by real time PCR method and investigation of these genes using bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Ali Dehghan-Noodeh

    2014-04-01

    Full Text Available Phospholipases are a group of enzymes that break down phosphatidylcholine (phospholipid) molecules, producing secondary products. These products have diverse roles in the cell, such as signal transduction and digestion in humans. In this research the effect of phosphatidylcholine on the expression of plc genes of A. fumigatus was studied. The plc genes of this fungus were also interrogated using bioinformatics studies. Real-time PCR was performed to study the expression of plc genes and these genes were interrogated using bioinformatics studies. There was more significant expression for all three plc genes when A. fumigatus was grown in the presence of phosphatidylcholine in the medium. The sequence of plc genes of A. fumigatus was also interrogated using bioinformatics analysis and their relationship with the other microorganisms was investigated. Real-time PCR revealed that afplc1, afplc2 and afplc3 were up-regulated in the presence of phosphatidylcholine. In this study we suggest either that the plc's of A. fumigatus were present in an ancestral genome and have become lost in some lineages, or that they have been acquired from other organisms by horizontal gene transfer. We also found that the plc's of this fungus appeared to be more closely related to the plant plc's than to the bacterial plc's.

  10. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    Directory of Open Access Journals (Sweden)

    Wang Chen

    2012-03-01

    Full Text Available Abstract Background MicroRNA (miRNA) is a class of functional non-coding small RNA 19-25 nucleotides in length, while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr., a wild grapevine species. Results A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library, followed by bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines was confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent with those from deep sequenced sRNA libraries of the combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNPs in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. Conclusions Deep sequencing of short RNAs from Amur grape flowers and berries identified 72

  11. Chemistry in Bioinformatics

    OpenAIRE

    Mitchell John; Murray-Rust Peter; Rzepa Henry

    2005-01-01

    Abstract Chemical information is now seen as critical for most areas of life sciences. But unlike Bioinformatics, where data is openly available and freely re-usable, most chemical information is closed and cannot be re-distributed without permission. This has led to a failure to adopt modern informatics and software techniques and therefore a paucity of chemistry in bioinformatics. New technology, however, offers the hope of making chemical data (compounds and properties) free during the auth...

  12. Integrating subpathway analysis to identify candidate agents for hepatocellular carcinoma.

    Science.gov (United States)

    Wang, Jiye; Li, Mi; Wang, Yun; Liu, Xiaoping

    2016-01-01

    Hepatocellular carcinoma (HCC) is the second most common cause of cancer-associated death worldwide, characterized by a high invasiveness and resistance to normal anticancer treatments. The need to develop new therapeutic agents for HCC is urgent. Here, we developed a bioinformatics method to identify potential novel drugs for HCC by integrating HCC-related and drug-affected subpathways. By using the RNA-seq data from the TCGA (The Cancer Genome Atlas) database, we first identified 1,763 differentially expressed genes between HCC and normal samples. Next, we identified 104 significant HCC-related subpathways. We also identified the subpathways associated with small molecular drugs in the CMap database. Finally, by integrating HCC-related and drug-affected subpathways, we identified 40 novel small molecular drugs capable of targeting these HCC-involved subpathways. In addition to previously reported agents (ie, calmidazolium), our method also identified potentially novel agents for targeting HCC. We experimentally verified that one of these novel agents, prenylamine, induced HCC cell apoptosis using 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, an acridine orange/ethidium bromide stain, and electron microscopy. In addition, we found that prenylamine not only affected several classic apoptosis-related proteins, including Bax, Bcl-2, and cytochrome c, but also increased caspase-3 activity. These candidate small molecular drugs identified by us may provide insights into novel therapeutic approaches for HCC. PMID:27022281

  13. Characterizing the porcine transcriptional regulatory response to infection by Salmonella: identifying putative new NFkB direct targets through comparative bioinformatics.

    Science.gov (United States)

    We have collected data on host response to infection from RNA prepared from mesenteric lymph node of swine infected with either Salmonella enterica serovar Typhimurium (ST) or S. Choleraesuis (SC) using the porcine Affymetrix GeneChip. We identified 848 (ST) and 1,853 (SC) genes with statistical evi...

  14. Integrative Bioinformatics for Genomics and Proteomics

    OpenAIRE

    Wu, C.H.

    2011-01-01

    Systems integration is becoming the driving force for 21st century biology. Researchers are systematically tackling gene functions and complex regulatory processes by studying organisms at different levels of organization, from genomes and transcriptomes to proteomes and interactomes. To fully realize the value of such high-throughput data requires advanced bioinformatics for integration, mining, comparative analysis, and functional interpretation. We are developing a bioinformatics research ...

  15. No-boundary thinking in bioinformatics research

    OpenAIRE

    Huang, Xiuzhen; Bruce, Barry; Buchan, Alison; Congdon, Clare Bates; Cramer, Carole L.; Jennings, Steven F; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Rick; Moore, Jason H.; Nanduri, Bindu; Peckham, Joan; Perkins, Andy; Polson, Shawn W.

    2013-01-01

    Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT).

  16. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  17. SNPTrack™ : an integrated bioinformatics system for genetic association studies.

    Science.gov (United States)

    Xu, Joshua; Kelly, Reagan; Zhou, Guangxu; Turner, Steven A; Ding, Don; Harris, Stephen C; Hong, Huixiao; Fang, Hong; Tong, Weida

    2012-01-01

    A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm. PMID:23245293

  18. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    Directory of Open Access Journals (Sweden)

    Tilton Susan C

    2012-11-01

    Full Text Available Abstract Background MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 µM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human ortholog. Functional analysis of differentially regulated (p Conclusions BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis generation tool for systems biology. The miRNA workflow in BRM allows for efficient processing of multiple miRNA and mRNA datasets in a single

  19. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    OpenAIRE

    Rigbolt, K. T. G.; Vanselow, J. T.; Blagoev, B.

    2011-01-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinf...

  20. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  1. Bioinformatic analysis and molecular modelling of human ameloblastin suggest a two-domain intrinsically unstructured calcium-binding protein

    Czech Academy of Sciences Publication Activity Database

    Vymětal, Jiří; Slabý, I.; Spahr, A.; Vondrášek, Jiří; Lyngstadaas, S. P.

    2008-01-01

    Vol. 116, No. 2 (2008), pp. 124-134. ISSN 0909-8836 R&D Projects: GA ČR GA203/05/0009; GA ČR GA203/06/1727; GA MŠk LC512 Other grants: EU(XE) QLK3-CT-2001-00090 Institutional research plan: CEZ:AV0Z40550506 Keywords: ameloblastin * bioinformatic modelling * calcium * intrinsically unstructured protein Subject RIV: CF - Physical ; Theoretical Chemistry Impact factor: 1.957, year: 2008

  2. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins

    Directory of Open Access Journals (Sweden)

    Teresa Milano

    2016-01-01

    Full Text Available The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5′-phosphate (PLP)-dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in the wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups.

  3. Short-term arginine deprivation results in large-scale modulation of hepatic gene expression in both normal and tumor cells: microarray bioinformatic analysis

    Directory of Open Access Journals (Sweden)

    Sabo Edmond

    2006-09-01

    Full Text Available Abstract Background We have reported arginine-sensitive regulation of LAT1 amino acid transporter (SLC 7A5) in normal rodent hepatic cells with loss of arginine sensitivity and high level constitutive expression in tumor cells. We hypothesized that liver cell gene expression is highly sensitive to alterations in the amino acid microenvironment and that tumor cells may differ substantially in gene sets sensitive to amino acid availability. To assess the potential number and classes of hepatic genes sensitive to arginine availability at the RNA level and compare these between normal and tumor cells, we used an Affymetrix microarray approach and a paired in vitro model of normal rat hepatic cells and a tumorigenic derivative, with triplicate independent replicates. Cells were exposed to arginine-deficient or control conditions for 18 hours in medium formulated to maintain differentiated function. Results Initial two-way analysis with a p-value of 0.05 identified 1419 genes in normal cells versus 2175 in tumor cells whose expression was altered in arginine-deficient conditions relative to controls, representing 9–14% of the rat genome. More stringent bioinformatic analysis with 9-way comparisons and a minimum of 2-fold variation narrowed this set to 56 arginine-responsive genes in normal liver cells and 162 in tumor cells. Approximately half the arginine-responsive genes in normal cells overlap with those in tumor cells. Of these, the majority were increased in expression and included multiple growth, survival, and stress-related genes. GADD45, TA1/LAT1, and caspases 11 and 12 were among this group. Previously known amino acid regulated genes were among the pool in both cell types. Available cDNA probes allowed independent validation of microarray data for multiple genes. Among genes downregulated under arginine-deficient conditions were multiple genes involved in cholesterol and fatty acid metabolism. Expression of low-density lipoprotein receptor was

  4. Identifying marker typing incompatibilities in linkage analysis.

    OpenAIRE

    Stringham, H M; Boehnke, M.

    1996-01-01

    A common problem encountered in linkage analyses is that execution of the computer program is halted because of genotypes in the data that are inconsistent with Mendelian inheritance. Such inconsistencies may arise because of pedigree errors or errors in typing. In some cases, the source of the inconsistencies is easily identified by examining the pedigree. In others, the error is not obvious, and substantial time and effort are required to identify the responsible genotypes. We have develope...
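
    The kind of inconsistency described above can be illustrated with a minimal sketch. It is not the authors' program (the abstract is truncated before their method is described); it only assumes an autosomal marker, no mutation, and fully typed father-mother-child trios, and the function names are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's two alleles can be formed by taking
    one allele from the father and one from the mother.

    Assumes an autosomal marker and no new mutation (illustrative only).
    """
    child_sorted = sorted(child)
    return any(sorted((a, b)) == child_sorted
               for a, b in product(father, mother))


def find_inconsistent_trios(genotypes, trios):
    """List the trios whose genotypes violate Mendelian inheritance.

    genotypes: dict mapping a person ID to a 2-tuple of alleles.
    trios: iterable of (father_id, mother_id, child_id).
    Trios with any untyped member are skipped rather than flagged.
    """
    flagged = []
    for father_id, mother_id, child_id in trios:
        members = (father_id, mother_id, child_id)
        if any(m not in genotypes for m in members):
            continue
        if not is_mendelian_consistent(genotypes[father_id],
                                       genotypes[mother_id],
                                       genotypes[child_id]):
            flagged.append(members)
    return flagged
```

    A trio check like this only localizes an inconsistency to three individuals; deciding which genotype is actually wrong in a larger pedigree is the harder problem the abstract refers to.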

  5. Human defined antigenic region on the nucleoprotein of Crimean-Congo hemorrhagic fever virus identified using truncated proteins and a bioinformatics approach.

    Science.gov (United States)

    Burt, F J; Samudzi, R R; Randall, C; Pieters, D; Vermeulen, J; Knox, C M

    2013-11-01

    Crimean-Congo hemorrhagic fever virus (CCHFV) is a tick-borne viral zoonosis widely distributed in Africa, Asia and eastern Europe. In this study, amino acid sequence data for the CCHFV nucleoprotein (NP) was used to identify potential linear epitopic regions which were subsequently included in the design of large and small truncated recombinant NP antigens and peptide libraries. Two truncated recombinant CCHFV NP antigens were prepared based on results of prediction studies to include epitopic regions and exclude hydrophobic regions that could influence protein expression and solubility. Serum samples were collected from acute and convalescent patients. An IgG antibody response was detected in 16/16 samples tested using the large recombinant NP-based ELISA and in 2/16 using the small recombinant NP-based ELISA. A total of 60 peptides covering predicted epitopic regions of the NP were synthesized and peptide NRGGDENPRGPVSR at amino acid position 182-195, reacted with 13/16 human serum samples. In summary, functional assays are required to determine the biological activity of predicted epitopes for development of peptide based assays for antibody detection. Bacterially expressed complete NP antigens have previously been shown to be useful tools for antibody detection. Truncation of the antigen to remove the hydrophobic C terminus had no impact on the ability of the antigen to detect IgG antibody in human sera. The results indicate that the region from amino acids 123 to 396 includes a highly antigenic region of the NP with application in development of antibody detection assays. PMID:23933073

  6. Chemistry in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mitchell John

    2005-06-01

    Full Text Available Abstract Chemical information is now seen as critical for most areas of life sciences. But unlike Bioinformatics, where data is openly available and freely re-usable, most chemical information is closed and cannot be re-distributed without permission. This has led to a failure to adopt modern informatics and software techniques and therefore a paucity of chemistry in bioinformatics. New technology, however, offers the hope of making chemical data (compounds and properties) free during the authoring process. We argue that the technology is already available; we require a collective agreement to enhance publication protocols.

  7. Comparative transcriptional pathway bioinformatic analysis of dietary restriction, Sir2, p53 and resveratrol life span extension in Drosophila

    OpenAIRE

    Antosh, Michael; Whitaker, Rachel; Kroll, Adam; Hosier, Suzanne; Chang, Chengyi; Bauer, Johannes; Cooper, Leon; Neretti, Nicola; HELFAND, STEPHEN L.

    2011-01-01

    A multiple comparison approach using whole genome transcriptional arrays was used to identify genes and pathways involved in calorie restriction/dietary restriction (DR) life span extension in Drosophila. A gene-centric analysis comparing the changes in common between DR and two DR-related molecular genetic life span extending manipulations, Sir2 and p53, led to a molecular confirmation of Sir2 and p53's similarity with DR and the identification of a small set of commonly regul...

  8. Identifiable Data Files - Medicare Provider Analysis and ...

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Medicare Provider Analysis and Review (MEDPAR) File contains data from claims for services provided to beneficiaries admitted to Medicare certified inpatient...

  9. Bioinformatics and School Biology

    Science.gov (United States)

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  10. Identifying MMORPG Bots: A Traffic Analysis Approach

    Directory of Open Access Journals (Sweden)

    Wen-Chin Chen

    2008-11-01

    Full Text Available Massively multiplayer online role playing games (MMORPGs) have become extremely popular among network gamers. Despite their success, one of the greatest challenges for MMORPGs is the increasing use of game bots, that is, autoplaying game clients. The use of game bots is considered unsportsmanlike and is therefore forbidden. To keep games in order, game police, played by actual human players, often patrol game zones and question suspicious players. This practice, however, is labor-intensive and ineffective. To address this problem, we analyze the traffic generated by human players versus game bots and propose general solutions to identify game bots. Taking Ragnarok Online as our subject, we study the traffic generated by human players and game bots. We find that their traffic is distinguishable by (1) the regularity in the release time of client commands, (2) the trend and magnitude of traffic burstiness in multiple time scales, and (3) the sensitivity to different network conditions. Based on these findings, we propose four strategies and two ensemble schemes to identify bots. Finally, we discuss the robustness of the proposed methods against countermeasures of bot developers, and consider a number of possible ways to manage the increasingly serious bot problem.
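
    As an illustration of the first criterion (regularity in the release time of client commands), the sketch below scores a client by the coefficient of variation of the gaps between consecutive commands. This is not one of the paper's four strategies, which the abstract does not specify; the function names and the 0.1 threshold are assumptions chosen only for the example.

```python
from statistics import mean, pstdev


def interarrival_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive commands.

    timestamps: command release times in seconds, in ascending order.
    A value near 0 means the commands are released at almost fixed intervals.
    """
    gaps = [later - earlier
            for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    if any(gap < 0 for gap in gaps):
        raise ValueError("timestamps must be in ascending order")
    average_gap = mean(gaps)
    if average_gap == 0:
        raise ValueError("all timestamps are identical")
    return pstdev(gaps) / average_gap


def looks_like_bot(timestamps, cv_threshold=0.1):
    """Flag a client whose command timing is suspiciously regular.

    cv_threshold is a hypothetical cut-off, not a value from the paper.
    """
    return interarrival_cv(timestamps) < cv_threshold
```

    In practice a single statistic like this would be easy for a bot developer to defeat by adding random delays, which is presumably why the abstract combines several traffic features and discusses countermeasures.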

  11. Uses and challenges of bioinformatic tools in mass spectrometric-based proteomic brain perturbation studies.

    Science.gov (United States)

    Guingab-Cagmat, Joy D; Cagmat, Emilio B; Kobeissy, Firas H; Anagli, John

    2014-01-01

    Mass spectrometry (MS) has become the method of choice to study the proteome of brain injury. The high-throughput nature of MS-based proteomic experiments generates massive amounts of mass spectral data, presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. In this review, a brief overview of the currently available bioinformatics strategies applied to neuroproteomic studies is presented. Application of commercially available bioinformatics software to different brain injury studies demonstrates integration of the data mining and analysis applications into neuroproteomic workflows that can identify major protein markers as well as highlight the biological processes and molecular functions involved. PMID:24449691

  12. Combination of meta-analysis and graph clustering to identify prognostic markers of ESCC

    Directory of Open Access Journals (Sweden)

    Hongyun Gao

    2012-01-01

    Full Text Available Esophageal squamous cell carcinoma (ESCC) is one of the most malignant gastrointestinal cancers and occurs at a high frequency in China and other Asian countries. Recently, several molecular markers were identified for predicting ESCC. Notwithstanding, additional prognostic markers, with a clear understanding of their underlying roles, are still required. Through bioinformatics, a graph-clustering method by DPClus was used to detect co-expressed modules. The aim was to identify a set of discriminating genes that could be used for predicting ESCC through graph-clustering and GO-term analysis. The results showed that CXCL12, CYP2C9, TGM3, MAL, S100A9, EMP-1 and SPRR3 were highly associated with ESCC development. In our study, all their predicted roles were in line with previous reports, supporting the assumption that a combination of meta-analysis, graph-clustering and GO-term analysis is effective both for identifying differentially expressed genes and for reflecting on their functions in ESCC.

  13. Genome-Wide Gene Expression Analysis Identifies the Proto-oncogene Tyrosine-Protein Kinase Src as a Crucial Virulence Determinant of Infectious Laryngotracheitis Virus in Chicken Cells

    OpenAIRE

    Li, Hai; Wang, Fengjie; Han, Zongxi; Gao, Qi; Li, Huixin; Shao, Yuhao; Sun, Nana; Liu, Shengwang

    2015-01-01

    ABSTRACT Given the side effects of vaccination against infectious laryngotracheitis (ILT), novel strategies for ILT control and therapy are urgently needed. The modulation of host-virus interactions is a promising strategy to combat the virus; however, the interactions between the host and avian ILT herpesvirus (ILTV) are unclear. Using genome-wide transcriptome studies in combination with a bioinformatic analysis, we identified proto-oncogene tyrosine-protein kinase Src (Src) to be an import...

  14. Bioinformatics big data processing

    OpenAIRE

    Cohen-Boulakia, Sarah; Valduriez, Patrick

    2016-01-01

    The volumes of bioinformatics data available on the Web are constantly increasing. Access and joint exploitation of these highly distributed data (i.e., available in distributed Web data sources) and highly heterogeneous (in text or tabulated files including images, in different formats, described with different levels of detail and different levels of quality ...) is essential for the biological knowledge to progress. The purpose of this short report is to present in a simple way the problems of the joint...

  15. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  16. Bioinformatic analysis of the non-structural protein 1 of type 2 dengue virus

    Institute of Scientific and Technical Information of China (English)

    齐一鸣; 黄俊琪

    2011-01-01

    Objective: To analyze the structural and functional characteristics of the non-structural protein 1 (NS1) of dengue virus type 2 and to predict its predominant antigenic epitopes by bioinformatics analysis, in order to guide experimental research on its biological function and application. Methods: Analysis tools provided by bioinformatics web sites such as NCBI and CBS, together with software packages such as DNAStar and Vector NTI, were used to analyze the physicochemical properties, the structural and functional characteristics, and the likely spatial structure and antigenic epitopes of NS1. Results: The NS1 gene encodes 352 amino acids, including 12 conserved cysteines. The protein has a relatively high lipid content and unstable physicochemical properties. It carries no secretory signal peptide and no transmembrane regions, but it has multiple glycosylation, phosphorylation and amidation sites. The protein forms a single compact globular domain, with the N-terminal and C-terminal fragments exposed on the surface, where linear B cell epitopes are relatively dense. The middle segment is buried inside the molecule, yet it contains some B cell epitope sequences that are highly homologous to platelets, vascular endothelium or fibrinogen, which may play an important role in the pathological process of dengue hemorrhagic fever. Conclusion: NS1 is not only a highly promising diagnostic antigen; the prediction of its antigenic epitopes will also provide a basis for the development of epitope-based peptide vaccines against dengue virus.

  17. Bioinformatics analysis of potential essential genes that response to the high intraocular pressure on astrocyte due to glaucoma

    Institute of Scientific and Technical Information of China (English)

    Yang Yang; Jing-Zhu Duan; Yu Di; Dong-Mei Gui; Dian-Wen Gao

    2015-01-01

    AIM: To study the gene expression response and predict the network in cells due to pressure effects on optic nerve injury in glaucoma. METHODS: We used glaucoma-related microarray data in a public database [Gene Expression Omnibus (GEO)] to explore the potential gene expression changes as well as the corresponding biological process alterations due to increased pressure in astrocytes during glaucoma development. RESULTS: A total of six genes were identified to be related to increasing pressure. Through annotation and network analysis, we found these genes might be involved in cell morphological remodeling, angiogenesis, and mismatch repair. CONCLUSION: Increasing pressure on astrocytes in glaucoma might cause gene expression alterations, which might induce changes in some cellular responses.

  18. Identification of microRNA-mRNA interactions in atrial fibrillation using microarray expression profiles and bioinformatics analysis

    Science.gov (United States)

    WANG, TAO; WANG, BIN

    2016-01-01

    The present study integrated microRNA (miRNA) and mRNA expression data obtained from atrial fibrillation (AF) tissues and healthy tissues, in order to identify miRNAs and target genes that may be important in the development of AF. The GSE28954 miRNA expression profile and GSE2240 mRNA gene expression profile were downloaded from the Gene Expression Omnibus. Differentially expressed miRNAs and genes (DEGs) in AF tissues, compared with control samples, were identified and hierarchically clustered. Subsequently, differentially expressed miRNAs and DEGs were searched for in the miRecords database and TarBase, and were used to construct a regulatory network using Cytoscape. Finally, functional analysis of the miRNA-targeted genes was conducted. After data processing, 71 differentially expressed miRNAs and 390 DEGs were identified between AF and normal tissues. A total of 3,506 miRNA-mRNA pairs were selected, of which 372 were simultaneously predicted by both miRecords and TarBase, and were therefore used to construct the miRNA-mRNA regulatory network. Furthermore, 10 miRNAs and 12 targeted mRNAs were detected, which formed 14 interactive pairs. The miRNA-targeted genes were significantly enriched in 14 Gene Ontology (GO) categories, of which the most significant was gene expression regulation (GO 10468), which was associated with 7 miRNAs and 8 target genes. These results suggest that the screened miRNAs and target genes may be target molecules in AF development, and may be beneficial for the early diagnosis and future treatment of AF. PMID:27082053
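
    The step of keeping only the miRNA-mRNA pairs reported by both databases amounts to a set intersection, which can be sketched as follows. The sketch assumes the pairs have already been exported from each database as (miRNA, gene) tuples; it does not query miRecords or TarBase, the function names are hypothetical, and any identifiers passed in are the caller's own.

```python
def shared_pairs(pairs_db_a, pairs_db_b):
    """Return the (miRNA, gene) pairs present in both collections."""
    return set(pairs_db_a) & set(pairs_db_b)


def build_network(pairs):
    """Group target genes by miRNA as a simple adjacency mapping.

    Returns a dict mapping each miRNA to a sorted list of its target genes,
    a form that can be written out as an edge list for a network viewer.
    """
    network = {}
    for mirna, gene in sorted(pairs):
        network.setdefault(mirna, []).append(gene)
    return network
```

    Requiring agreement between two sources trades coverage for confidence, which is consistent with the drop from 3,506 selected pairs to 372 shared pairs reported in the abstract.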

  19. Identification of Genetic Defects in 33 Probands with Stargardt Disease by WES-Based Bioinformatics Gene Panel Analysis.

    Directory of Open Access Journals (Sweden)

    Wei Xin

    Full Text Available Stargardt disease (STGD) is the most common hereditary macular degeneration in juveniles, with loss of central vision occurring in the first or second decade of life. The aim of this study is to identify the genetic defects in 33 probands with Stargardt disease. Clinical data and genomic DNA were collected from 33 probands from unrelated families with STGD. Variants in coding genes were initially screened by whole exome sequencing. Candidate variants were selected from all known genes associated with hereditary retinal dystrophy and then confirmed by Sanger sequencing. Putative pathogenic variants were further validated in available family members and controls. Potential pathogenic mutations were identified in 19 of the 33 probands (57.6%). These mutations were all present in ABCA4, but not in the other four STGD-associated genes or in genes responsible for other retinal dystrophies. Of the 19 probands, ABCA4 mutations were homozygous in one proband and compound heterozygous in 18 probands, involving 28 variants (13 novel and 15 known). Analysis of normal controls and available family members in 12 of the 19 families further supports the pathogenicity of these variants. The clinical manifestations of all probands met the diagnostic criteria of STGD. This study provides an overview of a genetic basis for STGD in Chinese patients. Mutations in ABCA4 are the most common cause of STGD in this cohort. Genetic defects in approximately 42.4% of STGD patients await identification in future studies.

  20. Bioinformatic Challenges in Clinical Diagnostic Application of Targeted Next Generation Sequencing: Experience from Pheochromocytoma.

    Directory of Open Access Journals (Sweden)

    Joakim Crona

    Full Text Available Recent studies have demonstrated equal quality of targeted next generation sequencing (NGS) compared to Sanger sequencing. Whereas these novel sequencing processes have a validated robust performance, the choice of enrichment method and of the different available bioinformatic software as a reliable analysis tool needs to be further investigated in a diagnostic setting. DNA from 21 patients with genetic variants in SDHB, VHL, EPAS1, RET (n=17) or clinical criteria of NF1 syndrome (n=4) was included. Targeted NGS was performed using Truseq custom amplicon enrichment sequenced on an Illumina MiSEQ instrument. Results were analysed in parallel using three different bioinformatics pipelines: (1) the commercially available MiSEQ Reporter, a fully automated and integrated software; (2) CLC Genomics Workbench, a graphical-interface-based software, also commercially available; and (3) ICP, an in-house scripted custom bioinformatic tool. A tenfold read coverage was achieved in 95-98% of targeted bases. All workflows had alignment of reads to SDHA and NF1 pseudogenes. Compared to Sanger sequencing, variant calling revealed a sensitivity ranging from 83 to 100% and a specificity of 99.9-100%. Only MiSEQ Reporter identified all pathogenic variants in both sequencing runs. We conclude that targeted next generation sequencing has equal quality compared to Sanger sequencing. Enrichment specificity and the bioinformatic performance need to be carefully assessed in a diagnostic setting. As acceptable accuracy was noted for a fully automated bioinformatic workflow, we suggest that processing of NGS data could be performed without expert bioinformatics skills, utilizing already existing commercially available bioinformatics tools.
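
    The sensitivity and specificity figures quoted above follow the standard definitions, which can be sketched as below for a pipeline's variant calls compared against a Sanger reference. How the study defined its true negatives is not stated in the abstract, so treating every assessed position without a reference variant as a negative is an assumption of this sketch, as are the function and parameter names.

```python
def sensitivity_specificity(called, reference, assessed):
    """Compare variant calls with a reference set.

    called: set of positions where the pipeline reports a variant.
    reference: set of positions where the reference method found a variant.
    assessed: set of all positions examined by both methods.
    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    called = set(called) & set(assessed)
    reference = set(reference) & set(assessed)
    true_pos = len(called & reference)
    false_neg = len(reference - called)
    false_pos = len(called - reference)
    true_neg = len(set(assessed) - called - reference)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one reference positive and one negative")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

    Because targeted panels contain far more non-variant positions than variants, specificity computed this way is usually close to 100% even when several false calls occur, so sensitivity tends to be the more informative figure when comparing pipelines.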

  1. Identification of microRNAs in the Toxigenic Dinoflagellate Alexandrium catenella by High-Throughput Illumina Sequencing and Bioinformatic Analysis.

    Directory of Open Access Journals (Sweden)

    Huili Geng

    Full Text Available Micro-ribonucleic acids (miRNAs) are a large group of endogenous, tiny, non-coding RNAs consisting of 19-25 nucleotides that regulate gene expression at either the transcriptional or post-transcriptional level by mediating gene silencing in eukaryotes. They are considered to be important regulators that affect growth, development, and response to various stresses in plants. Alexandrium catenella is an important marine toxic phytoplankton species that can cause harmful algal blooms (HABs). To date, identification and function analysis of miRNAs in A. catenella remain largely unexamined. In this study, high-throughput sequencing was performed on A. catenella to identify and quantitatively profile the repertoire of small RNAs from two different growth phases. A total of 38,092,056 and 32,969,156 raw reads were obtained from the two small RNA libraries, respectively. In total, 88 mature miRNAs belonging to 32 miRNA families were identified. Significant differences were found in the member number, expression level of various families, and expression abundance of each member within a family. A total of 15 potentially novel miRNAs were identified. Comparative profiling showed that 12 known miRNAs exhibited differential expression between the lag phase and the logarithmic phase. Real-time quantitative RT-PCR (qPCR) was performed to confirm the expression of two differentially expressed miRNAs: one up-regulated novel miRNA (aca-miR-3p-456915) and one down-regulated conserved miRNA (tae-miR159a). The expression trend of the qPCR assay was generally consistent with the deep sequencing result. Target predictions of the 12 differentially expressed miRNAs resulted in 1813 target genes. Gene ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG) annotations revealed that some miRNAs were associated with growth and developmental processes of the alga. These results provide insights into the roles that miRNAs play in

  2. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  3. Best practices in bioinformatics training for life scientists.

    Science.gov (United States)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  4. Bioinformatics analysis and prediction for structure and function of nitric oxide synthase and similar proteins from Plasmodium berghei

    Institute of Scientific and Technical Information of China (English)

    Zhigang Fan; Gang Lv; Lingmin Zhang; Xiufeng Gan; Qiang Wu; Saifeng Zhong; Guogang Yan; Guifen Lin

    2011-01-01

    Objective: To search for and analyze nitric oxide synthase (NOS) and similar proteins from Plasmodium berghei (Pb). Methods: The structure and function of nitric oxide synthase and similar proteins from Plasmodium berghei were analyzed and predicted by bioinformatics. Results: PbNOS was not found, but nicotinamide adenine dinucleotide 2’-phosphate reduced tetrasodium (NADPH)-cytochrome P450 reductase (CPR) was identified. PbCPR was located in the nucleus of Plasmodium berghei, with the 134aa-229aa domain localized in the nucleolar organizer. The amino acid sequence of PbCPR had the closest genetic relationship with that of Plasmodium vivax, showing 73% homology. The tertiary structure of PbCPR displayed a forceps shape with wings, but no wings existed in the tertiary structure of the CPR of its host, Mus musculus (Mm). The 137aa-200aa, 201aa-218aa, 220aa-230aa, 232aa-248, 269aa-323aa, 478aa-501aa and 592aa-606aa domains of PbCPR showed no homology with those of MmCPR, and all of these domains were exposed on the surface of the protein. Conclusions: NOS could not be found in Plasmodium berghei or other Plasmodium species. PbCPR may be a possible site of antimalarial drug resistance and a target of antimalarial drugs and vaccines. It may also be involved in one of the mechanisms of immune evasion. The findings of this study on Plasmodium berghei may be more applicable to Plasmodium vivax. The 137aa-200aa, 201aa-218aa, 220aa-230aa, 232aa-248, 269aa-323aa, 478aa-501aa and 592aa-606aa domains of PbCPR are more ideal targets for antimalarial drugs and vaccines.

  5. Genome-wide bioinformatics analysis of steroid metabolism-associated genes in Nocardioides simplex VKM Ac-2033D.

    Science.gov (United States)

    Shtratnikova, Victoria Y; Schelkunov, Mikhail I; Fokina, Victoria V; Pekov, Yury A; Ivashina, Tanya; Donova, Marina V

    2016-08-01

    Actinobacteria comprise diverse groups of bacteria capable of full degradation or modification of different steroid compounds. Steroid catabolism has been characterized best for the representatives of suborder Corynebacterineae, such as Mycobacteria, Rhodococcus and Gordonia, with a high content of mycolic acids in the cell envelope, while it is poorly understood for other steroid-transforming actinobacteria, such as representatives of the Nocardioides genus belonging to suborder Propionibacterineae. Nocardioides simplex VKM Ac-2033D is an important biotechnological strain which is known for its ability to introduce a ∆(1)-double bond in various 1(2)-saturated 3-ketosteroids, and to perform conversion of 3β-hydroxy-5-ene steroids to 3-oxo-4-ene steroids, hydrolysis of acetylated steroids, and reduction of carbonyl groups at C-17 and C-20 of androstanes and pregnanes, respectively. The strain is also capable of utilizing cholesterol and phytosterol as carbon and energy sources. In this study, a comprehensive bioinformatics genome-wide screening was carried out to predict genes related to steroid metabolism in this organism, their clustering and possible regulation. The predicted operon structure and the number of candidate gene copies (paralogs) have been estimated. Binding sites of steroid catabolism regulators KstR and KstR2 specified for N. simplex VKM Ac-2033D have been calculated de novo. Most of the candidate genes grouped within three main clusters, one of the predicted clusters having no analogs in other actinobacteria studied so far. The results offer a base for further functional studies, expand the understanding of steroid catabolism by actinobacteria, and will contribute to the modification of metabolic pathways in order to generate effective biocatalysts capable of producing valuable bioactive steroids. PMID:26832142

  6. An Integrated Bioinformatics Analysis Reveals Divergent Evolutionary Pattern of Oil Biosynthesis in High- and Low-Oil Plants

    Science.gov (United States)

    Zhang, Li; Wang, Shi-Bo; Li, Qi-Gang; Song, Jian; Hao, Yu-Qi; Zhou, Ling; Zheng, Huan-Quan; Dunwell, Jim M.; Zhang, Yuan-Ming

    2016-01-01

    Seed oils provide a renewable source of food, biofuel and industrial raw materials that is important for humans. Although many genes and pathways for acyl-lipid metabolism have been identified, little is known about whether there is a specific mechanism for high-oil content in high-oil plants. Based on the distinct differences in seed oil content between four high-oil dicots (20~50%) and three low-oil grasses (<3%), comparative genome, transcriptome and differential expression analyses were used to investigate this mechanism. Among 4,051 dicot-specific soybean genes identified from 252,443 genes in the seven species, 54 genes were shown to directly participate in acyl-lipid metabolism, and 93 genes were found to be associated with acyl-lipid metabolism. Among the 93 dicot-specific genes, 42 and 27 genes, including CBM20-like SBDs and GPT2, participate in carbohydrate degradation and transport, respectively. Forty genes that are highly up-regulated during the period of rapid seed oil accumulation are mainly involved in initial fatty acid synthesis, triacylglyceride assembly and oil-body formation, for example, ACCase, PP, DGAT1, PDAT1, OLEs and STEROs, which were also found to be differentially expressed between high- and low-oil soybean accessions. Phylogenetic analysis revealed distinct differences in oleosin patterns of gene duplication and loss between high-oil dicots and low-oil grasses. In addition, seed-specific GmGRF5, ABI5 and GmTZF4 were predicted to be candidate regulators in seed oil accumulation. This study facilitates future research on lipid biosynthesis and potential genetic improvement of seed oil content. PMID:27159078

  7. Bioinformatics and functional analysis of an Entamoeba histolytica mannosyltransferase necessary for parasite complement resistance and hepatical infection.

    Directory of Open Access Journals (Sweden)

    Christian Weber

    Full Text Available The glycosylphosphatidylinositol (GPI) moiety is one of the ways by which many cell surface proteins, such as Gal/GalNAc lectin and proteophosphoglycans (PPGs), attach to the surface of Entamoeba histolytica, the agent of human amoebiasis. It is believed that these GPI-anchored molecules are involved in parasite adhesion to cells, mucus and the extracellular matrix. We identified an E. histolytica homolog of PIG-M, which is a mannosyltransferase required for synthesis of GPI. The sequence and structural analysis led to the conclusion that EhPIG-M1 is composed of one signal peptide and 11 transmembrane domains with two large intraluminal loops, one of which contains the DXD motif, involved in the enzymatic catalysis and conserved in most glycosyltransferases. Expressing a fragment of the EhPIG-M1 encoding gene in antisense orientation generated parasite lines diminished in EhPIG-M1 levels; these lines displayed reduced GPI production, were highly sensitive to complement and were dramatically inhibited for amoebic abscess formation. The data suggest a role for GPI surface anchored molecules in the survival of E. histolytica during pathogenesis.

  8. Integrative bioinformatics analysis of genomic and proteomic approaches to understand the transcriptional regulatory program in coronary artery disease pathways.

    Directory of Open Access Journals (Sweden)

    Rajani Kanth Vangala

    Full Text Available Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. A few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and the modulation of the pathways has not yet been addressed in its entirety. Our present work focuses on understanding the regulatory mechanisms at the transcriptional level by identifying the core and specific transcription factors that regulate the pathways associated with coronary artery disease. Using the principles of systems biology, we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease, assayed 24 biomarkers along with gene expression studies, and built network modules which are highly regulated by 5 core regulators: PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise biomarkers from different pathways, showing that the core regulatory transcription factors may work together in the differential regulation of several pathways, potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies for developing multimarker module-based risk predictions.

  9. Prediction of antigenic sites on ALS1 and HWP1 protein sequences in vaginal isolated C. albicans of using bioinformatics analysis

    Directory of Open Access Journals (Sweden)

    Mona Pakdel

    2015-04-01

    Full Text Available Background and Aim: The ability to predict antigenic sites on proteins is of major importance for medication. The aim of this study was to predict the antigenic sites on Agglutinin-Like Sequence (ALS1) and Hyphal Wall Protein (HWP1) sequences in Candida albicans isolated from vaginal infections using the Physico-Chemical Profiles server. Materials and Methods: Seven isolates were obtained from women with vaginal infection; they were collected from various medical centers of Tehran in 2011 and 2012. First, DNA was extracted by the phenol-chloroform method. Multiplex PCR was performed using specific primers. For the bioinformatic studies, the genes were sequenced and then translated. Antigenic sites of the protein sequences were identified by the Physico-Chemical Profiles program. Results: The results showed the presence of the two genes als1 and hwp1 in the isolates. Two antigenic sites in ALS1 and one in HWP1 with the highest antigenicity were identified. Conclusions: According to previous studies, serine and threonine phosphorylation is an important mechanism in the pathogenesis of ALS1 and HWP1 proteins. The results of this study showed that serine and threonine are the most frequent amino acids in the antigenic sites with high antigenicity.

  10. Bioinformatics analysis of differently expressed microRNAs in anxiety disorder

    Institute of Scientific and Technical Information of China (English)

    范惠民; 牛威; 何明骏; 孔令明; 仲爱芳; 张巧丽; 闫妍; 张理义

    2015-01-01

    Objective To identify differentially expressed microRNAs (miRNAs) in peripheral blood mononuclear cells (PBMCs) of anxiety patients and to predict their target genes and functions by bioinformatics analysis. Methods The miRNA expression profiles were determined using an Affymetrix array. To validate the results, real-time quantitative polymerase chain reaction (qRT-PCR) analysis was employed in a larger cohort. The targets of the differentially expressed miRNAs were predicted by TargetScan, miRDB, and DIANA-microT-CDS, and the results were analyzed by gene ontology (GO) and KEGG pathway analysis using FunNet. Results MicroRNA microarray analysis identified 7 miRNAs with significant changes in expression in PBMCs of anxiety patients. qRT-PCR analysis confirmed that the expression levels of 5 miRNAs (hsa-miR-4484, hsa-miR-4505, hsa-miR-4674, hsa-miR-501-3p and hsa-miR-663) were up-regulated. Intersecting the genes predicted by TargetScan, miRDB, and DIANA-microT-CDS yielded 195 targets. GO analysis showed that the biological processes regulated by the predicted target genes included diverse terms. Some terms, e.g., nervous system development, nerve growth factor receptor signaling pathway, neuron migration, dendrite development, regulation of neuron projection development, midbrain development, regulation of excitatory postsynaptic membrane potential, gliogenesis and dendrite morphogenesis, have a direct relationship with the central nervous system and brain functions. Pathway analysis showed a significant enrichment in several pathways related to neuronal brain functions, such as glutamatergic synapse, axon guidance, calcium signaling pathway, MAPK signaling pathway, GnRH signaling pathway, Wnt signaling pathway, gap junction, long-term potentiation and VEGF signaling pathway. Among the five microRNAs, hsa-miR-4484, hsa-miR-4505, hsa-miR-4674 and hsa-miR-501-3p may have more important regulatory functions. Conclusion Five miRNAs (hsa-miR-4484, hsa
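
    The target-prediction step in this record keeps only genes reported by all of the prediction tools. A minimal sketch of that kind of intersection is shown below; the gene names are hypothetical placeholders, and the function name intersect_targets is an assumption made for illustration, not part of any of the tools named in the abstract.

```python
def intersect_targets(predictions):
    """Return the set of genes predicted by every tool.

    predictions maps a tool name to an iterable of predicted target genes.
    """
    gene_sets = [set(genes) for genes in predictions.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical toy predictions (placeholder gene names, not data from the study).
predictions = {
    "TargetScan": ["GENE_A", "GENE_B", "GENE_C"],
    "miRDB": ["GENE_B", "GENE_C", "GENE_D"],
    "DIANA-microT-CDS": ["GENE_C", "GENE_B", "GENE_E"],
}

consensus = sorted(intersect_targets(predictions))
print(consensus)
```

    Requiring agreement across tools trades sensitivity for specificity: a gene missed by any single predictor is dropped from the consensus list.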

  11. Bioinformatic Analysis of α-Amylase Genes in Volvariella volvacea

    Institute of Scientific and Technical Information of China (English)

    杜慕云; 杨仁德; 李剑; 谢宝贵

    2014-01-01

    Five genes (GME2151, GME6695, GME9075, GME10698 and GME10705) were identified as encoding α-amylases in Volvariella volvacea, with molecular weights ranging from 38.8 kD to 64.4 kD. Bioinformatic methods based on genome and transcriptome sequences were used to analyze the intron:exon distribution patterns of the genes and the physicochemical properties of the encoded α-amylases. Signal peptides, sub-cellular localization patterns and functional sites of the α-amylases were predicted, and a phylogenetic tree was constructed based on α-amylases from different fungi. Serine phosphorylation sites were the primary sites of amylase protein phosphorylation. The amylases contained signal peptides, transmembrane helices, conserved amino acid residues and similar three-dimensional structures, and were located both intra- and extracellularly. Analysis of the phylogenetic tree revealed that the α-amylases were of two types: GME9075 and GME10698 belonged to α-amylase type I, and GME2151, GME6695 and GME10705 to α-amylase type II. This is consistent with the classification of amylases from other basidiomycetes. Our data provide useful information relating to matrix degradation by the mycelium of V. volvacea and other macro-basidiomycetes.

  12. Literature Mining and Bioinformatic Analysis of Dysregulated Genes in Keloid

    Institute of Scientific and Technical Information of China (English)

    边曦; 黄琛; 李博仑; 秦泽莲

    2012-01-01

    Objective To explore the pathogenesis of keloid by comparing gene expression in keloid and normal skin tissues, in order to seek new therapeutic approaches for keloid. Methods The differentially expressed genes between keloid and normal skin were obtained by mining PubMed. The dysregulated genes in keloid were analyzed by bioinformatics methods, including protein-protein interaction networks, biological pathways, gene ontology and functional annotation clustering analysis. Results Eight differential gene expression datasets and 922 articles were obtained. A total of 94 dysregulated genes in keloid were identified (71 up-regulated genes and 23 down-regulated genes). Eighty-six genes were found to encode proteins that form an interaction network, with TGFB1, FN1, COL1A1, MMP9, VEGFA, TP53, IL6 and MMP2 as the central nodes of this network. The dysregulated genes in keloid were involved in a variety of biological pathways, including signal transduction and tumor formation. Furthermore, the dysregulated genes in keloid played important roles in the biological processes of apoptosis and cell motility. Additionally, some of the dysregulated genes participated in the expression of cellular components such as cell membrane structure, extracellular matrix and collagen components. Conclusions Key genes including TGFB1, FN1, COL1A1, MMP9, VEGFA, TP53, IL6 and MMP2, along with TGF-β signal transduction, cell proliferation and apoptosis, and tumor formation, may play important roles in the development of keloid.

  13. Protein functional links in Trypanosoma brucei, identified by gene fusion analysis

    Directory of Open Access Journals (Sweden)

    Trimpalis Philip

    2011-07-01

    Full Text Available Abstract Background Domain or gene fusion analysis is a bioinformatics method for detecting gene fusions in one organism by comparing its genome to that of other organisms. The occurrence of gene fusions suggests that the two original genes that participated in the fusion are functionally linked, i.e. their gene products interact either as part of a multi-subunit protein complex, or in a metabolic pathway. Gene fusion analysis has been used to identify protein functional links in prokaryotes as well as in eukaryotic model organisms, such as yeast and Drosophila. Results In this study we have extended this approach to include a number of recently sequenced protists, four of which are pathogenic, to identify fusion linked proteins in Trypanosoma brucei, the causative agent of African sleeping sickness. We have also examined the evolution of the gene fusion events identified, to determine whether they can be attributed to fusion or fission, by looking at the conservation of the fused genes and of the individual component genes across the major eukaryotic and prokaryotic lineages. We find relatively limited occurrence of gene fusions/fissions within the protist lineages examined. Our results point to two trypanosome-specific gene fissions, which have recently been experimentally confirmed, one fusion involving proteins involved in the same metabolic pathway, as well as two novel putative functional links between fusion-linked protein pairs. Conclusions This is the first study of protein functional links in T. brucei identified by gene fusion analysis. We have used strict thresholds and only discuss results which are highly likely to be genuine and which either have already been or can be experimentally verified. We discuss the possible impact of the identification of these novel putative protein-protein interactions, to the development of new trypanosome therapeutic drugs.

  14. Bioinformatics in microbial biotechnology – a mini review

    Directory of Open Access Journals (Sweden)

    Bansal Arvind K

    2005-06-01

    Full Text Available Abstract The revolutionary growth in computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes, including a cleaner draft of the human genome, have been sequenced, raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy-to-administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics – sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics – identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wet-lab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene

  15. Functional and bioinformatics analysis of two Campylobacter jejuni homologs of the thiol-disulfide oxidoreductase, DsbA.

    Directory of Open Access Journals (Sweden)

    Anna D Grabowska

    Full Text Available BACKGROUND: Bacterial Dsb enzymes are involved in the oxidative folding of many proteins, through the formation of disulfide bonds between their cysteine residues. The Dsb protein network has been well characterized in cells of the model microorganism Escherichia coli. To gain insight into the functioning of the Dsb system in epsilon-Proteobacteria, where it plays an important role in the colonization process, we studied two homologs of the main Escherichia coli Dsb oxidase (EcDsbA that are present in the cells of the enteric pathogen Campylobacter jejuni, the most frequently reported bacterial cause of human enteritis in the world. METHODS AND RESULTS: Phylogenetic analysis suggests the horizontal transfer of the epsilon-Proteobacterial DsbAs from a common ancestor to gamma-Proteobacteria, which then gave rise to the DsbL lineage. Phenotype and enzymatic assays suggest that the two C. jejuni DsbAs play different roles in bacterial cells and have divergent substrate spectra. CjDsbA1 is essential for the motility and autoagglutination phenotypes, while CjDsbA2 has no impact on those processes. CjDsbA1 plays a critical role in the oxidative folding that ensures the activity of alkaline phosphatase CjPhoX, whereas CjDsbA2 is crucial for the activity of arylsulfotransferase CjAstA, encoded within the dsbA2-dsbB-astA operon. CONCLUSIONS: Our results show that CjDsbA1 is the primary thiol-oxidoreductase affecting life processes associated with bacterial spread and host colonization, as well as ensuring the oxidative folding of particular protein substrates. In contrast, CjDsbA2 activity does not affect the same processes and so far its oxidative folding activity has been demonstrated for one substrate, arylsulfotransferase CjAstA. The results suggest the cooperation between CjDsbA2 and CjDsbB. In the case of the CjDsbA1, this cooperation is not exclusive and there is probably another protein to be identified in C. jejuni cells that acts to re

  16. GALT Protein Database, a Bioinformatics Resource for the Management and Analysis of Structural Features of a Galactosemia-related Protein and Its Mutants

    Institute of Scientific and Technical Information of China (English)

    Antonio d'Acierno; Angelo Facchiano; Anna Marabotti

    2009-01-01

    We describe the GALT-Prot database and its related web-based application, which have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT), involved in the genetic disease named galactosemia type Ⅰ. Besides a list of missense mutations at the gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure, and each mutant was analyzed with several bioinformatics programs in order to investigate the effect of the mutations. The web-based interface allows querying of the database, and several links are also provided in order to guarantee a high integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.

  17. Systematic enrichment analysis of gene expression profiling studies identifies consensus pathways implicated in colorectal cancer development

    Directory of Open Access Journals (Sweden)

    Jesús Lascorz

    2011-01-01

    Full Text Available Background: A large number of gene expression profiling (GEP) studies on colorectal carcinogenesis have been performed but no reliable gene signature has been identified so far due to the lack of reproducibility in the reported genes. There is growing evidence that functionally related genes, rather than individual genes, contribute to the etiology of complex traits. We used, as a novel approach, pathway enrichment tools to define functionally related genes that are consistently up- or down-regulated in colorectal carcinogenesis. Materials and Methods: We started the analysis with 242 unique annotated genes that had been reported by any of three recent meta-analyses covering GEP studies on genes differentially expressed in carcinoma vs normal mucosa. Most of these genes (218, 91.9%) had been reported in at least three GEP studies. These 242 genes were submitted to bioinformatic analysis using a total of nine tools to detect enrichment of Gene Ontology (GO) categories or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. As a final consistency criterion the pathway categories had to be enriched by several tools to be taken into consideration. Results: Our pathway-based enrichment analysis identified the categories of ribosomal protein constituents, extracellular matrix receptor interaction, carbonic anhydrase isozymes, and a general category related to inflammation and cellular response as significantly and consistently overrepresented entities. Conclusions: We triaged the genes covered by the published GEP literature on colorectal carcinogenesis and subjected them to multiple enrichment tools in order to identify the consistently enriched gene categories. These turned out to have known functional relationships to cancer development and thus deserve further investigation.
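
    Over-representation of a GO category or KEGG pathway in a gene list is commonly scored with a one-sided hypergeometric test. The sketch below illustrates that general technique only; the nine tools used in the study may apply different statistics and corrections. The gene-list size of 242 comes from the abstract, while the background size, category size and overlap are hypothetical numbers chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, category_size, sample_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population:    number of genes in the background
    category_size: background genes annotated to the category
    sample_size:   number of genes in the submitted list
    overlap:       listed genes that fall in the category
    """
    max_k = min(category_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(category_size, k) * comb(population - category_size, sample_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# 242 genes in the list (from the abstract); the other values are hypothetical.
p_value = hypergeom_enrichment_p(
    population=20000, category_size=100, sample_size=242, overlap=10
)
print(f"p = {p_value:.3g}")
```

    When many categories are tested at once, the resulting p-values would normally be corrected for multiple testing before any category is called enriched.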

  18. Alteration of microRNA expression in cerebrospinal fluid of unconscious patients after traumatic brain injury and a bioinformatic analysis of related single nucleotide polymorphisms

    Institute of Scientific and Technical Information of China (English)

    Wen-Dong You; Qi-Lin Tang; Lei Wang; Jin Lei; Jun-Feng Feng; Qing Mao; Guo-Yi Gao

    2016-01-01

    Purpose: It is becoming increasingly clear that genetic factors play a role in traumatic brain injury (TBI), whether in modifying clinical outcome after TBI or determining susceptibility to it. MicroRNAs are small RNA molecules involved in various pathophysiological processes by repressing target genes at the posttranscriptional level, and TBI alters microRNA expression levels in the hippocampus and cortex. This study was designed to detect differentially expressed microRNAs in the cerebrospinal fluid (CSF) of TBI patients remaining unconscious two weeks after initial injury and to explore related single nucleotide polymorphisms (SNPs). Methods: We used a microarray platform to detect differential microRNA expression levels in CSF samples from patients with post-traumatic coma compared with samples from controls. A bioinformatic scan was performed covering microRNA gene promoter regions to identify potential functional SNPs. Results: In total, 26 coma patients and 21 controls were included in this study, with similar distributions of age and gender between the two groups. Microarray analysis showed that fourteen microRNAs were differentially expressed, ten at higher and four at lower expression levels in the CSF of traumatic coma patients compared with controls (p < 0.05). One SNP (rs11851174 allele: C/T) was identified in the motif area of the microRNA hsa-miR-431-3P gene promoter region. Conclusion: The altered microRNA expression levels in CSF after brain injury, together with the SNP identified within the microRNA gene promoter area, provide a new perspective on the mechanism of impaired consciousness after TBI. Further studies are needed to explore the association between the specific microRNAs and their related SNPs with post-traumatic unconsciousness.

  19. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  20. Microfluidic single-cell transcriptional analysis rationally identifies novel surface marker profiles to enhance cell-based therapies.

    Science.gov (United States)

    Rennert, Robert C; Januszyk, Michael; Sorkin, Michael; Rodrigues, Melanie; Maan, Zeshaan N; Duscher, Dominik; Whittam, Alexander J; Kosaraju, Revanth; Chung, Michael T; Paik, Kevin; Li, Alexander Y; Findlay, Michael; Glotzbach, Jason P; Butte, Atul J; Gurtner, Geoffrey C

    2016-01-01

    Current progenitor cell therapies have only modest efficacy, which has limited their clinical adoption. This may be the result of a cellular heterogeneity that decreases the number of functional progenitors delivered to diseased tissue, and prevents correction of underlying pathologic cell population disruptions. Here, we develop a high-resolution method of identifying phenotypically distinct progenitor cell subpopulations via single-cell transcriptional analysis and advanced bioinformatics. When combined with high-throughput cell surface marker screening, this approach facilitates the rational selection of surface markers for prospective isolation of cell subpopulations with desired transcriptional profiles. We establish the usefulness of this platform in costly and highly morbid diabetic wounds by identifying a subpopulation of progenitor cells that is dysfunctional in the diabetic state, and normalizes diabetic wound healing rates following allogeneic application. We believe this work presents a logical framework for the development of targeted cell therapies that can be customized to any clinical application. PMID:27324848

  1. Engineering BioInformatics

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    With the completion of human genome sequencing, a new era of bioinformatics starts. On one hand, due to the advance of high-throughput DNA microarray technologies, functional genomics data such as gene expression information have increased exponentially and will continue to do so for the foreseeable future. Conventional means of storing, analysing and comparing related data are already overburdened. Moreover, the rich information in genes, their functions and their associated wide biological implications requires new technologies for analysing data that employ sophisticated statistical and machine learning algorithms, powerful computers and intensive interaction among different data sources such as sequence data, gene expression data, proteomics data and metabolic pathway information to discover complex genomic structures and functional patterns with other biological processes to gain a comprehensive understanding of cell physiology.

  2. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  3. Next Generation Sequencing of Elite Berry Germplasm and Data Analysis Using a Bioinformatics Pipeline for Virus Detection and Discovery

    Science.gov (United States)

    Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus and Vaccinium) are known hosts for more than 70 viruses and new ones are identified continually. In modern berry cultivars, viruses tend to be asymptomatic in single infections and symptoms only develop after plants accumulate m...

  4. Next-Generation Sequencing of Elite Berry Germplasm and Data Analysis Using a Bioinformatics Pipeline for Virus Detection and Discovery

    Science.gov (United States)

    Berry crops (members of the genera Fragaria, Ribes, Rubus, Sambucus and Vaccinium) are known hosts for more than 70 viruses and new ones are identified frequently. In modern berry cultivars, viruses tend to be asymptomatic in single infections and symptoms only develop after plants accumulate multip...

  5. MISIS-2: A bioinformatics tool for in-depth analysis of small RNAs and representation of consensus master genome in viral quasispecies.

    Science.gov (United States)

    Seguin, Jonathan; Otten, Patricia; Baerlocher, Loïc; Farinelli, Laurent; Pooggin, Mikhail M

    2016-07-01

    In most eukaryotes, small RNA (sRNA) molecules such as miRNAs, siRNAs and piRNAs regulate gene expression and repress transposons and viruses. AGO/PIWI family proteins sort functional sRNAs based on size, 5'-nucleotide and other sequence features. In plants and some animals, viral sRNAs are extremely diverse and cover the entire viral genome sequences, which allows for de novo reconstruction of a complete viral genome by deep sequencing and bioinformatics analysis of viral sRNAs. Previously, we developed a tool, MISIS, to view and analyze sRNA maps of viruses and cellular genome regions which spawn multiple sRNAs. Here we describe a new release of MISIS, MISIS-2, which makes it possible to determine and visualize a consensus sequence and to count sRNAs of any chosen sizes and 5'-terminal nucleotide identities. Furthermore, we demonstrate the utility of MISIS-2 for identification of single nucleotide polymorphisms (SNPs) at each position of a reference sequence and reconstruction of a consensus master genome in evolving viral quasispecies. MISIS-2 is a Java standalone program. It is freely available along with the source code at the website http://www.fasteris.com/apps. PMID:26994965
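
    The two operations named in this abstract, counting sRNAs by size and 5'-terminal nucleotide and deriving a consensus sequence from reads mapped to a reference, can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions (ungapped, forward-strand alignments given as 0-based start positions), not the MISIS-2 implementation, which is a Java program; the function names and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def count_by_size_and_5p(reads):
    """Count reads by (length, 5'-terminal nucleotide)."""
    return Counter((len(read), read[0]) for read in reads if read)


def consensus_sequence(reference, alignments):
    """Majority-vote consensus over ungapped forward-strand alignments.

    alignments is a list of (start, read) pairs, where start is the 0-based
    reference position of the first base of the read. Positions with no
    coverage keep the reference base.
    """
    columns = defaultdict(Counter)
    for start, read in alignments:
        for offset, base in enumerate(read):
            position = start + offset
            if 0 <= position < len(reference):
                columns[position][base] += 1
    consensus = []
    for position, ref_base in enumerate(reference):
        counts = columns.get(position)
        consensus.append(counts.most_common(1)[0][0] if counts else ref_base)
    return "".join(consensus)


# Hypothetical toy data: three reads support an A at position 3 (reference T).
reference = "ACGTACGTAC"
alignments = [(0, "ACGA"), (1, "CGAA"), (2, "GAAC"), (6, "GTAC")]

size_counts = count_by_size_and_5p(read for _, read in alignments)
consensus = consensus_sequence(reference, alignments)
snps = [
    (i, ref, alt)
    for i, (ref, alt) in enumerate(zip(reference, consensus))
    if ref != alt
]
print(size_counts)
print(consensus)
print(snps)
```

    Positions where the consensus differs from the reference are candidate SNPs; a real analysis would also apply coverage and frequency thresholds before reporting them.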

  6. Expression Data Analysis to Identify Biomarkers Associated with Asthma in Children

    OpenAIRE

    Wen Xu

    2014-01-01

    Asthma is characterized by recurrent episodes of wheezing, shortness of breath, chest tightness, and coughing. It is usually caused by a combination of complex and incompletely understood environmental and genetic interactions. We obtained gene expression data with high-throughput screening and identified biomarkers of children's asthma using bioinformatics tools. Next, we explained the pathogenesis of children's asthma from the perspective of gene regulatory networks: DAVID was applied to pe...

  7. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    Directory of Open Access Journals (Sweden)

    Kumar Ajay

    2012-01-01

    Full Text Available Background: The role of bioinformatics in this modern age of technological advancement cannot be over-emphasized. Aim: This study reviews the principles, techniques, and applications of forensic bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: Sequence annotation and whole genome sequencing became possible through the adoption of software-based tools that handle the segregation of bulk genomic data. DNA profiling produces profiles, encrypted sets of numbers that reflect a person's DNA makeup and can also be used as the person's identifier. Automated analysis systems coupled with the latest computer software make the results easy to comprehend. A major application of forensic bioinformatics in forensic science is the quick, bulk and precise review of DNA evidence, with the intent of finding and drawing attention to recurring problems so that testing continues to become better and more reliable. At present, genetic counsellors also use information derived from genomic data to create pedigrees in cases of genetic disorders. Conclusion: Given the usefulness of forensic bioinformatics, a far greater commitment to openness and transparency and a greater availability of documents for public scrutiny are recommended.

  8. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.;

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

  9. An Integrated Bioinformatics Analysis Reveals Divergent Evolutionary Pattern of Oil Biosynthesis in High- and Low-Oil Plants

    OpenAIRE

    Zhang, Li; Wang, Shi-Bo; Li, Qi-Gang; Song, Jian; Hao, Yu-Qi; Zhou, Ling; Zheng, Huan-Quan; Jim M Dunwell; Zhang, Yuan-Ming

    2016-01-01

    Seed oils provide a renewable source of food, biofuel and industrial raw materials that is important for humans. Although many genes and pathways for acyl-lipid metabolism have been identified, little is known about whether there is a specific mechanism for high-oil content in high-oil plants. Based on the distinct differences in seed oil content between four high-oil dicots (20~50%) and three low-oil grasses (

  10. Identification of Genetic Defects in 33 Probands with Stargardt Disease by WES-Based Bioinformatics Gene Panel Analysis

    OpenAIRE

    Xin, Wei; Xiao, Xueshan; Li, Shiqiang; Jia, Xiaoyun; Guo, Xiangming; Zhang, Qingjiong

    2015-01-01

    Stargardt disease (STGD) is the most common hereditary macular degeneration in juveniles, with loss of central vision occurring in the first or second decade of life. The aim of this study is to identify the genetic defects in 33 probands with Stargardt disease. Clinical data and genomic DNA were collected from 33 probands from unrelated families with STGD. Variants in coding genes were initially screened by whole exome sequencing. Candidate variants were selected from all known genes associa...

  11. Adapting bioinformatics curricula for big data

    OpenAIRE

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S; Jason H Moore

    2015-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these...

  12. Bioinformatics analyses for signal transduction networks

    Institute of Scientific and Technical Information of China (English)

    LIU Wei; LI Dong; ZHU YunPing; HE FuChu

    2008-01-01

    Research in signaling networks contributes to a deeper understanding of the living activities of organisms. With the development of experimental methods in the signal transduction field, more and more mechanisms of signaling pathways have been discovered. This paper introduces popular bioinformatics analysis methods for signaling networks, such as the common mechanisms of signaling pathways and database resources on the Internet, summarizes the methods of analyzing the structural properties of networks, including structural motif finding and automated pathway generation, and discusses the modeling and simulation of signaling networks in detail, as well as the research situation and tendency in this area. The investigation of signal transduction is now developing from small-scale experiments to large-scale network analysis, and dynamic simulation of networks is closer to the real system. With the investigation going deeper than ever, the bioinformatics analysis of signal transduction has immense room for development and application.

  13. Agile parallel bioinformatics workflow management using Pwrake

    Directory of Open Access Journals (Sweden)

    Tanaka Masahiro

    2011-09-01

    Full Text Available Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be too heavyweight for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows
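
The two-phase idea described in this abstract (a workflow definition kept separate from adjustable parameters, with independent tasks run in parallel) can be sketched as follows. This is a minimal illustration in Python, not Pwrake or Rake code: the task names, the `PARAMS` dictionary and the `run_task` stand-in are all hypothetical and do not come from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

# Parameter adjustment phase: values that change between iterations.
# (Hypothetical names, chosen only for illustration.)
PARAMS = {"min_quality": 20, "workers": 4}

# Workflow definition phase: each task lists the tasks it depends on.
WORKFLOW = {
    "align": [],
    "call_snps": ["align"],
    "call_indels": ["align"],
    "merge": ["call_snps", "call_indels"],
}


def run_task(name, params):
    # Stand-in for launching a real command-line tool.
    return f"{name}(min_quality={params['min_quality']})"


def run_workflow(workflow, params):
    """Run tasks in dependency order; tasks that are ready together run in parallel."""
    done = {}
    pending = dict(workflow)
    batches = []
    with ThreadPoolExecutor(max_workers=params["workers"]) as pool:
        while pending:
            ready = sorted(
                name for name, deps in pending.items()
                if all(dep in done for dep in deps)
            )
            if not ready:
                raise ValueError("cyclic or unsatisfiable dependencies")
            results = list(pool.map(lambda name: run_task(name, params), ready))
            for name, result in zip(ready, results):
                done[name] = result
                del pending[name]
            batches.append(ready)
    return batches, done


if __name__ == "__main__":
    batches, done = run_workflow(WORKFLOW, PARAMS)
    print(batches)
    print(done["merge"])
```

Changing a value in `PARAMS` leaves `WORKFLOW` untouched, which is the separation the abstract credits with faster iteration.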

  14. Expression Data Analysis to Identify Biomarkers Associated with Asthma in Children

    Directory of Open Access Journals (Sweden)

    Wen Xu

    2014-01-01

    Full Text Available Asthma is characterized by recurrent episodes of wheezing, shortness of breath, chest tightness, and coughing. It is usually caused by a combination of complex and incompletely understood environmental and genetic interactions. We obtained gene expression data with high-throughput screening and identified biomarkers of children's asthma using bioinformatics tools. Next, we explained the pathogenesis of children's asthma from the perspective of gene regulatory networks: DAVID was applied to perform Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis for the top 3000 pairs of relationships in the differentially regulatory network. Finally, we found that HAND1, PTK1, NFKB1, ZIC3, STAT6, E2F1, PELP1, USF2, and CBFB may play important roles in children's asthma initiation. On account of regulatory impact factor (RIF) score, HAND1, PTK7, and ZIC3 were the potential asthma-related factors. Our study provided some foundations of a strategy for biomarker discovery despite a poor understanding of the mechanisms underlying children's asthma.

  15. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    Science.gov (United States)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics-based algorithm for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance, and determination, of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely
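
To make the saturation point concrete, the sketch below scores a candidate site with an additive energy matrix and converts the energy to an occupancy probability. The abstract does not give the formula, so the two-state form p = 1 / (1 + exp((E - mu) / kT)) used here is an assumption (a standard saturating occupancy function), and the toy `ENERGY` matrix is invented for illustration only.

```python
import math

# Hypothetical per-position energy contributions (in units of kT) for a
# 3-bp site; the consensus "ACG" has energy 0 by construction.
ENERGY = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 1.5, "C": 0.0, "G": 1.5, "T": 1.5},
    {"A": 1.0, "C": 1.0, "G": 0.0, "T": 1.0},
]


def binding_energy(site, energy_matrix):
    """Sum independent per-position contributions (additive assumption)."""
    return sum(energy_matrix[i][base] for i, base in enumerate(site))


def binding_probability(energy, mu, kT=1.0):
    """Saturating occupancy: tends to 1 for E << mu and to 0 for E >> mu."""
    return 1.0 / (1.0 + math.exp((energy - mu) / kT))


if __name__ == "__main__":
    mu = 1.0  # assumed chemical potential, acting as the binding threshold
    for site in ("ACG", "ACT", "TAT"):
        e = binding_energy(site, ENERGY)
        print(site, e, round(binding_probability(e, mu), 3))
```

In this form the chemical potential `mu` plays the role of the binding threshold mentioned in the abstract: sites with energy below it are bound with probability above one half, and the probability never exceeds 1.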

  16. A bioinformatics insight to rhizobial globins: gene identification and mapping, polypeptide sequence and phenetic analysis, and protein modeling. [v1; ref status: indexed, http://f1000r.es/5ai]

    Directory of Open Access Journals (Sweden)

    Reinier Gesto-Borroto

    2015-05-01

    Full Text Available Globins (Glbs) are proteins widely distributed in organisms. Three evolutionary families have been identified in Glbs: the M, S and T Glb families. The M Glbs include flavohemoglobins (fHbs) and single-domain Glbs (SDgbs); the S Glbs include globin-coupled sensors (GCSs), protoglobins and sensor single domain globins, and the T Glbs include truncated Glbs (tHbs). Structurally, the M and S Glbs exhibit 3/3-folding whereas the T Glbs exhibit 2/2-folding. Glbs are widespread in bacteria, including several rhizobial genomes. However, only few rhizobial Glbs have been characterized. Hence, we characterized Glbs from 62 rhizobial genomes using bioinformatics methods such as data mining in databases, sequence alignment, phenogram construction and protein modeling. Also, we analyzed soluble extracts from Bradyrhizobium japonicum USDA38 and USDA58 by (reduced + carbon monoxide (CO)) minus reduced differential spectroscopy. Database searching showed that only fhb, sdgb, gcs and thb genes exist in the rhizobia analyzed in this work. Promoter analysis revealed that apparently several rhizobial glb genes are not regulated by a -10 promoter but might be regulated by -35 and Fnr (fumarate-nitrate reduction regulator)-like promoters. Mapping analysis revealed that rhizobial fhbs and thbs are flanked by a variety of genes whereas several rhizobial sdgbs and gcss are flanked by genes coding for proteins involved in the metabolism of nitrates and nitrites and chemotaxis, respectively. Phenetic analysis showed that rhizobial Glbs segregate into the M, S and T Glb families, while structural analysis showed that predicted rhizobial SDgbs and fHbs and GCSs globin domain and tHbs fold into the 3/3- and 2/2-folding, respectively. Spectra from B. japonicum USDA38 and USDA58 soluble extracts exhibited peaks and troughs characteristic of bacterial and vertebrate Glbs thus indicating that putative Glbs are synthesized in B. japonicum USDA38 and USDA58.

  17. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.;

    2010-01-01

    services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...... Trainer Networking Session held under the auspices of the EU-funded SLING Integrating Activity, which took place in November 2009....

  18. Bioinformatics and Molecular Analysis of the Evolutionary Relationship between Bovine Rhinitis A Viruses and Foot-And-Mouth Disease Virus

    OpenAIRE

    Rai, Devendra K.; Paul Lawrence; Steve J. Pauszek; Piccone, Maria E.; Knowles, Nick J.; Elizabeth Rieder

    2016-01-01

    Bovine rhinitis viruses (BRVs) cause mild respiratory disease of cattle. In this study, a near full-length genome sequence of a virus named RS3X (formerly classified as bovine rhinovirus type 1), isolated from infected cattle from the UK in the 1960s, was obtained and analyzed. Compared to other closely related Aphthoviruses, major differences were detected in the leader protease (Lpro), P1, 2B, and 3A proteins. Phylogenetic analysis revealed that RS3X was a member of the species bovine rhini...

  19. [An overview of feature selection algorithm in bioinformatics].

    Science.gov (United States)

    Li, Xin; Ma, Li; Wang, Jinjia; Zhao, Chun

    2011-04-01

    Feature selection (FS) techniques have become an important tool in the field of bioinformatics. Their core idea is to select a low-dimensional set of significant features hidden in a high-dimensional data space, and thus to analyse the underlying structure of the data. Bioinformatics data typically combine high dimensionality with small sample sizes, so research on FS algorithms in bioinformatics has great prospects. In this article, we make the interested reader aware of the possibilities of feature selection, provide basic properties of feature selection techniques, and discuss their uses in sequence analysis, microarray analysis, mass spectra analysis, etc. Finally, the current problems and the prospects of feature selection algorithms in bioinformatics applications are also discussed. PMID:21604512
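
As an illustration of the high-dimension, small-sample setting this abstract describes, the sketch below implements one simple filter-style selector: each feature is scored by a signal-to-noise ratio between two classes and the top k are kept. The scoring rule and the toy data are assumptions chosen for brevity; the review itself covers many other techniques.

```python
from statistics import mean, stdev


def score_feature(values, labels):
    """Signal-to-noise score: |mean difference| / (sum of class std devs)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    # Small constant avoids division by zero for constant features.
    return abs(mean(a) - mean(b)) / (stdev(a) + stdev(b) + 1e-12)


def select_top_k(samples, labels, k):
    """Return indices of the k highest-scoring feature columns."""
    n_features = len(samples[0])
    scores = [
        score_feature([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: 6 samples x 3 features; only feature 0 separates the classes.
    X = [
        [5.0, 1.0, 0.2],
        [5.2, 0.9, 0.8],
        [4.9, 1.1, 0.5],
        [1.0, 1.0, 0.4],
        [1.2, 0.8, 0.7],
        [0.9, 1.2, 0.3],
    ]
    y = [0, 0, 0, 1, 1, 1]
    print(select_top_k(X, y, k=1))
```

A filter like this scores features independently of any classifier, which keeps it cheap when there are many thousands of features but only a handful of samples.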

  20. Global computing for bioinformatics.

    Science.gov (United States)

    Loewe, Laurence

    2002-12-01

    Global computing, the collaboration of idle PCs via the Internet in a SETI@home style, emerges as a new way of massive parallel multiprocessing with potentially enormous CPU power. Its relations to the broader, fast-moving field of Grid computing are discussed without attempting a review of the latter. This review (i) includes a short table of milestones in global computing history, (ii) lists opportunities global computing offers for bioinformatics, (iii) describes the structure of problems well suited for such an approach, (iv) analyses the anatomy of successful projects and (v) points to existing software frameworks. Finally, an evaluation of the various costs shows that global computing indeed has merit, if the problem to be solved is already coded appropriately and a suitable global computing framework can be found. Then, either significant amounts of computing power can be recruited from the general public, or--if employed in an enterprise-wide Intranet for security reasons--idle desktop PCs can substitute for an expensive dedicated cluster. PMID:12511066

  1. Survey of MapReduce frame operation in bioinformatics.

    Science.gov (United States)

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756
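
The MapReduce pattern mentioned here can be illustrated with k-mer counting over sequencing reads. The sketch below runs in a single Python process and only mimics the map, shuffle and reduce stages; it does not use Hadoop, and the function names are illustrative rather than part of any framework's API.

```python
from collections import defaultdict


def map_read(read, k):
    """Map stage: emit a (k-mer, 1) pair for every window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle stage: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce stage: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    pairs = [pair for read in reads for pair in map_read(read, k)]
    return dict(reduce_counts(key, vals) for key, vals in shuffle(pairs).items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], k=3))
```

Because each read is mapped independently and each k-mer is reduced independently, both stages can be spread across cluster nodes, which is what makes the pattern attractive for high-throughput sequencing data.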

  2. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  3. Screening feature genes of astrocytoma using a combined method of microarray gene expression profiling and bioinformatics analysis.

    Science.gov (United States)

    Cai, Yong; Zhong, Xingming; Wang, Yiqi; Yang, Jianguo

    2015-01-01

    The aim of our study was to find feature genes associated with astrocytoma, and the correlative gene functions, that can distinguish cancer tissue from adjacent non-tumor astrocyte tissue. Gene expression profile GSE15824, which included 8 astrocytoma tissues and 3 adjacent non-tumor astrocyte samples, was downloaded from the Gene Expression Omnibus database. The raw data were first transformed into probe-level data, and the differentially expressed genes (DEGs) between tissues of patients with astrocytoma and normal specimens were identified using the T-test in the samr package of R. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was applied to analyze the gene ontology (GO) enrichment of gene functions and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Finally, the corresponding protein-protein interaction (PPI) networks of the DEGs were constructed using Cytoscape, based on data collected from the STRING online datasets. A total of 3072 genes, including 1799 up-regulated genes and 1273 down-regulated genes, were filtered as DEGs. The DEGs included AQP4, PMP2, SRARCL1 and SLC1A2CAMs, among others, and AQP4 was most significantly related to cell osmotic pressure. Three feature genes in KEGG pathways were highly enriched in the cancer specimens, while two genes were enriched in the normal tissues. The discovery of feature genes significantly related to the regulation of cell osmotic pressure has potential clinical use for the diagnosis of astrocytoma in the future. In addition, it is of great significance for studying the mechanism, distinguishing normal and cancer tissues, and exploring new treatments for astrocytoma. However, further experiments are needed to confirm our results. PMID:26770395
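
The differential-expression step can be sketched with a per-gene two-sample statistic. The study used the samr package in R; the Python sketch below is not that method. It substitutes a plain Welch t statistic, and the cutoff, gene labels and expression values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with unequal variances."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se if se > 0 else 0.0


def classify_genes(expression, tumor_idx, normal_idx, cutoff=2.0):
    """Label each gene 'up', 'down' or 'unchanged' in tumor vs normal."""
    calls = {}
    for gene, values in expression.items():
        t = welch_t([values[i] for i in tumor_idx],
                    [values[i] for i in normal_idx])
        if t >= cutoff:
            calls[gene] = "up"
        elif t <= -cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


if __name__ == "__main__":
    # Hypothetical log-expression values: 4 tumor samples, then 3 normal.
    expression = {
        "geneA": [9.1, 8.8, 9.4, 9.0, 5.1, 5.3, 4.9],
        "geneB": [3.0, 3.2, 2.9, 3.1, 7.0, 7.4, 6.8],
        "geneC": [6.0, 6.4, 5.8, 6.1, 6.2, 5.9, 6.1],
    }
    print(classify_genes(expression, tumor_idx=[0, 1, 2, 3], normal_idx=[4, 5, 6]))
```

With only a few normal samples, as in the 8-versus-3 design described above, variance estimates are unstable, which is one reason moderated or permutation-based statistics are often preferred over a raw t statistic.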

  4. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  5. Bioinformatics and the discovery of gene function

    OpenAIRE

    Casari, G; Daruvar, Dea; Sander, C.; Schneider, Reinhard

    1996-01-01

    Scientific history was made in completing the yeast genome sequence, yet its 13 Mb are a mere starting point. Two challenges loom large: to decipher the function of all genes and to describe the workings of the eukaryotic cell in full molecular detail. A combination of experimental and theoretical approaches will be brought to bear on these challenges. What will be next in yeast genome analysis from the point of view of bioinformatics?

  6. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    Science.gov (United States)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  7. Bioinformatic pipelines in Python with Leaf

    OpenAIRE

    Napolitano, Francesco; Mariani-Costantini, Renato; Tagliaferri, Roberto

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum...

  8. Expressional and Bioinformatic Analysis of Bovine Filia/Ecat1/Khdc3l Gene: A Comparison with Ovine Species.

    Science.gov (United States)

    Zahmatkesh, Azadeh; Ansari Mahyari, Saeid; Daliri Joupari, Morteza; Rahmani, Hamidreza; Shirazi, Abolfazl; Amiri Roudbar, Mahmood; Ansari Majd, Saeid

    2016-07-01

    Maternal effect genes have highly impressive effects on pre-implantation development. Filia/Ecat1/Khdc3l is a maternal effect gene found in mouse oocytes and embryos, loss of which causes a 50% decrease in fertility. In the present study, we investigated Filia mRNA expression in bovine oviduct, 30- to 40-day fetus, liver, heart, lung, and oocytes (as a positive control), by RT-PCR and detected it only in oocytes. A 443 bp fragment was amplified only in oocytes and was sequenced as a part of bovine predicted Filia mRNA. We analyzed bovine and ovine Filia N-terminal peptide sequence in PHYRE2, and a KH domain was predicted. Protein alignment using ClustalW indicated a highly identical N-terminal extension between the 2 species. Immunohistochemical analysis using anti-bovine Filia antibody showed the expression of Filia protein in the zone surrounding the nuclear membrane, and in the subcortex of ovine oocytes of primary and antral follicles. However, in the bovine, Filia has been found throughout the oocyte cytoplasm of antral follicles, and here it is further confirmed in the primary follicles. Our data suggest a difference in Filia expression pattern between cow and sheep, although the sequence is highly conserved. PMID:27070240

  9. Bioinformatics and Molecular Analysis of the Evolutionary Relationship between Bovine Rhinitis A Viruses and Foot-And-Mouth Disease Virus

    Science.gov (United States)

    Rai, Devendra K.; Lawrence, Paul; Pauszek, Steve J.; Piccone, Maria E.; Knowles, Nick J.; Rieder, Elizabeth

    2015-01-01

    Bovine rhinitis viruses (BRVs) cause mild respiratory disease of cattle. In this study, a near full-length genome sequence of a virus named RS3X (formerly classified as bovine rhinovirus type 1), isolated from infected cattle from the UK in the 1960s, was obtained and analyzed. Compared to other closely related Aphthoviruses, major differences were detected in the leader protease (Lpro), P1, 2B, and 3A proteins. Phylogenetic analysis revealed that RS3X was a member of the species bovine rhinitis A virus (BRAV). Using different codon-based and branch-site selection models for Aphthoviruses, including BRAV RS3X and foot-and-mouth disease virus, we observed no clear evidence for genomic regions undergoing positive selection. However, within each of the BRV species, multiple sites under positive selection were detected. The results also suggest that the probability (determined by Recombination Detection Program) for recombination events between BRVs and other Aphthoviruses, including foot-and-mouth disease virus was not significant. In contrast, within BRVs, the probability of recombination increases. The data reported here provide genetic information to assist in the identification of diagnostic signatures and research tools for BRAV. PMID:27081310

  10. Bioinformatics Approach in Plant Genomic Research.

    Science.gov (United States)

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have led to dramatic changes in plant biology research. Plant biologists now have easy access to enormous amounts of genomic data for studying high-density genetic variation in plants at the molecular level. Therefore, a full understanding and skilled use of bioinformatics tools to manage and analyze these data are essential in current plant genome research. Many plant genome databases have been established and have continued to expand recently. Meanwhile, bioinformatics-based analytical methods are also well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant upgrading of computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  11. Application Of Data Mining In Bioinformatics

    OpenAIRE

    KHALID RAZA

    2012-01-01

    This article highlights some of the basic concepts of bioinformatics and data mining. The major research areas of bioinformatics are highlighted. The application of data mining in the domain of bioinformatics is explained. It also highlights some of the current challenges and opportunities of data mining in bioinformatics.

  12. Novel bioinformatic developments for exome sequencing.

    Science.gov (United States)

    Lelieveld, Stefan H; Veltman, Joris A; Gilissen, Christian

    2016-06-01

    With the widespread adoption of next generation sequencing technologies by the genetics community and the rapid decrease in costs per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers standard solutions for the analysis of exome sequencing data, many challenges still remain; especially the increasing scale at which exome data are now being generated has given rise to novel challenges in how to efficiently store, analyze and interpret exome data of this magnitude. In this review we discuss some of the recent developments in bioinformatics for exome sequencing and the directions that this is taking us to. With these developments, exome sequencing is paving the way for the next big challenge, the application of whole genome sequencing. PMID:27075447

  13. Bioinformatic Analysis of the Nitrate Reductase Gene in Antarctic Ice Algae Chlamydomonas sp. ICE-L

    Institute of Scientific and Technical Information of China (English)

    林敏卓; 刘晨临; 黄晓航; 杨平平

    2012-01-01

    Nitrate reductase (NR) plays an important role in abiotic stress adaptation in plants by regulating nitrogen metabolism. A nitrate reductase (NR) gene of the Antarctic ice alga Chlamydomonas sp. ICE-L was identified from a cDNA library and sequenced. The encoded protein sequence of the NR gene was investigated by bioinformatic analysis. Through sequence alignment, the active sites of the ICE-L NR protein sequence that may be related to stress acclimation were identified. In addition, the tertiary structure of the ICE-L NR protein was predicted. The full-length Chlamydomonas ICE-L NR gene contained an open reading frame of 2,589 bp encoding a nitrate reductase of 863 amino acids. Phylogenetic analysis showed that the gene was homologous to known green algae NRs, with identity of 63%, 61%, 60% and 54% to Volvox carteri, Chlamydomonas reinhardtii, Dunaliella tertiolecta and Chlorella vulgaris, respectively. Functional prediction analysis revealed that the NR sequence has 3 different functional domains, similar to those of higher plants. This bioinformatic analysis of the ICE-L NR gene will help us further understand and extend research on the acclimation mechanism of the Antarctic ice alga Chlamydomonas in the extreme environment from the perspective of the NR gene.

  14. BIOELECTRICAL IMPEDANCE VECTOR ANALYSIS IDENTIFIES SARCOPENIA IN NURSING HOME RESIDENTS

    Science.gov (United States)

    Loss of muscle mass and water shifts between body compartments are contributing factors to frailty in the elderly. The body composition changes are especially pronounced in institutionalized elderly. We investigated the ability of single-frequency bioelectrical impedance analysis (BIA) to identify b...

  15. SNPTrack™: an integrated bioinformatics system for genetic association studies

    Directory of Open Access Journals (Sweden)

    Xu Joshua

    2012-07-01

    Full Text Available Abstract A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm.

  16. Bioinformatic analysis of the nucleolus

    DEFF Research Database (Denmark)

    Leung, Anthony K L; Andersen, Jens S; Mann, Matthias;

    2003-01-01

    The nucleolus is a plurifunctional, nuclear organelle, which is responsible for ribosome biogenesis and many other functions in eukaryotes, including RNA processing, viral replication and tumour suppression. Our knowledge of the human nucleolar proteome has been expanded dramatically by the two r...

  17. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude;

    2012-01-01

    to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes...... and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  18. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    Science.gov (United States)

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes, including behavior-based profiles, will contribute to the transition from disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors is needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talk among pathways in various brain regions involved in disorders such as Alzheimer's disease. PMID:22933157

  19. Identifying clinical course patterns in SMS data using cluster analysis

    DEFF Research Database (Denmark)

    Kent, Peter; Kongsted, Alice

    2012-01-01

    whole group, by including all SMS time points in their original form. It was a 'proof of concept' study to explore the potential, clinical relevance, strengths and weaknesses of such an approach. METHODS: This was a secondary analysis of longitudinal SMS data collected in two randomised controlled trials...... subgroups in the outcomes of research studies. Two previous studies have investigated detailed clinical course patterns in SMS data obtained from people seeking care for low back pain. One used a visual analysis approach and the other performed a cluster analysis of SMS data that had first been transformed...... conducted simultaneously from a single clinical population (n = 322). Fortnightly SMS data collected over a year on 'days of problematic low back pain' and on 'days of sick leave' were analysed using Two-Step (probabilistic) Cluster Analysis. RESULTS: Clinical course patterns were identified that were...

  20. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    OpenAIRE

    Obom, Kristina. M.; Cummings, Patrick J.

    2009-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no signifi...

  1. Identifying influential factors of business process performance using dependency analysis

    Science.gov (United States)

    Wetzstein, Branimir; Leitner, Philipp; Rosenberg, Florian; Dustdar, Schahram; Leymann, Frank

    2011-02-01

    We present a comprehensive framework for identifying influential factors of business process performance. In particular, our approach combines monitoring of process events and Quality of Service (QoS) measurements with dependency analysis to effectively identify influential factors. The framework uses data mining techniques to construct tree structures to represent dependencies of a key performance indicator (KPI) on process and QoS metrics. These dependency trees allow business analysts to determine how process KPIs depend on lower-level process metrics and QoS characteristics of the IT infrastructure. The structure of the dependencies enables a drill-down analysis of single factors of influence to gain a deeper knowledge why certain KPI targets are not met.

  2. Using factor analysis to identify neuromuscular synergies during treadmill walking

    Science.gov (United States)

    Merkle, L. A.; Layne, C. S.; Bloomberg, J. J.; Zhang, J. J.

    1998-01-01

    Neuroscientists are often interested in grouping variables to facilitate understanding of a particular phenomenon. Factor analysis is a powerful statistical technique that groups variables into conceptually meaningful clusters, but remains underutilized by neuroscience researchers presumably due to its complicated concepts and procedures. This paper illustrates an application of factor analysis to identify coordinated patterns of whole-body muscle activation during treadmill walking. Ten male subjects walked on a treadmill (6.4 km/h) for 20 s during which surface electromyographic (EMG) activity was obtained from the left side sternocleidomastoid, neck extensors, erector spinae, and right side biceps femoris, rectus femoris, tibialis anterior, and medial gastrocnemius. Factor analysis revealed 65% of the variance of seven muscles sampled aligned with two orthogonal factors, labeled 'transition control' and 'loading'. These two factors describe coordinated patterns of muscular activity across body segments that would not be evident by evaluating individual muscle patterns. The results show that factor analysis can be effectively used to explore relationships among muscle patterns across all body segments to increase understanding of the complex coordination necessary for smooth and efficient locomotion. We encourage neuroscientists to consider using factor analysis to identify coordinated patterns of neuromuscular activation that would be obscured using more traditional EMG analyses.

  3. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    Full Text Available BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. METHODS: Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. RESULTS: The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). CONCLUSION: The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
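
    The Kaplan-Meier method named in the abstract above can be illustrated with a minimal sketch. The function name `kaplan_meier` and the input layout are assumptions made for this example and are not taken from the study; no study data are reproduced. The survival estimate is multiplied by (1 - deaths / number at risk) at each distinct event time, and censored records only reduce the number at risk.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each person.
    observed:  1 if the event (death) was observed at that time, 0 if censored.
    Hypothetical helper written for illustration only.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t (events and censored).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

    Calling `kaplan_meier` once per phenotypic class, with that class's follow-up times, would give one curve per class; comparing such curves statistically (for example with a log-rank test) is a separate step not shown here.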

  4. Parameter Trajectory Analysis to Identify Treatment Effects of Pharmacological Interventions

    OpenAIRE

    Tiemann, Christian A.; Vanlier, Joep; Oosterveer, Maaike H.; Albert K Groen; Hilbers, Peter A. J.; Natal A W van Riel

    2013-01-01

    The field of medical systems biology aims to advance understanding of molecular mechanisms that drive disease progression and to translate this knowledge into therapies to effectively treat diseases. A challenging task is the investigation of long-term effects of a (pharmacological) treatment, to establish its applicability and to identify potential side effects. We present a new modeling approach, called Analysis of Dynamic Adaptations in Parameter Trajectories (ADAPT), to analyze the long-t...

  5. Three Systems of Insular Functional Connectivity Identified with Cluster Analysis

    OpenAIRE

    Deen, Ben; Pitskel, Naomi B.; Kevin A. Pelphrey

    2010-01-01

    Despite much research on the function of the insular cortex, few studies have investigated functional subdivisions of the insula in humans. The present study used resting-state functional connectivity magnetic resonance imaging (MRI) to parcellate the human insular lobe based on clustering of functional connectivity patterns. Connectivity maps were computed for each voxel in the insula based on resting-state functional MRI (fMRI) data and segregated using cluster analysis. We identified 3 ins...

  6. Bioinformatics Analysis of SAUR Gene Family in Brassica rapa

    Institute of Scientific and Technical Information of China (English)

    赵敬会; 王瑞雪; 李荣冲; 梁晶龙; 张涛

    2012-01-01

    This study aimed to lay a foundation for future functional studies of the SAUR gene family in Brassica rapa. Bioinformatics methods were used to analyze the conserved motifs, amino acid isoelectric points, molecular evolution, expression patterns and other basic properties of the early auxin-responsive SAUR (small auxin-up RNA) gene family in Brassica rapa. Of the 143 SAUR proteins, 111 were alkaline, 31 were acidic and one was neutral. The phylogenetic tree showed that the gene family has split into two major subfamilies, one of which is still differentiating. EST expression evidence was found for 61 SAUR genes, which were expressed at a wide variety of sites. The analysis indicated that gene duplication is a major characteristic of the SAUR gene family: the 143 genes contain 51 homologous pairs.

  7. Generations of interdisciplinarity in bioinformatics

    Science.gov (United States)

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  8. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, using a new sigmoid-logarithmic transfer function and based on the error backpropagation (EBP) algorithm, is invented. Our results show that the trained new ANN can recognize low-fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  9. Identifying Organizational Inefficiencies with Pictorial Process Analysis (PPA)

    Directory of Open Access Journals (Sweden)

    David John Patrishkoff

    2013-11-01

    Full Text Available Pictorial Process Analysis (PPA) was created by the author in 2004. PPA is a unique methodology which offers ten layers of additional analysis when compared to standard process mapping techniques. The goal of PPA is to identify and eliminate waste, inefficiencies and risk in manufacturing or transactional business processes at five levels in an organization. The highest level being assessed is the process management, followed by the process work environment, detailed work habits, process performance metrics and general attitudes towards the process. This detailed process assessment and analysis is carried out during process improvement brainstorming efforts and Kaizen events. PPA creates a detailed visual efficiency rating for each step of the process under review. A selection of 54 pictorial Inefficiency Icons (cards) is available for use to highlight major inefficiencies and risks that are present in the business process under review. These inefficiency icons were identified during the author's independent research on the topic of why things go wrong in business. This paper will highlight how PPA was developed and show the steps required to conduct Pictorial Process Analysis on a sample manufacturing process. The author has successfully used PPA to dramatically improve business processes in over 55 different industries since 2004.

  10. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial, as incorrectly computed results may lead to wrong biological conclusions or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT), to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conforms to a set of domain-specific properties, called metamorphic relations (MRs), which greatly increases the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work
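
    The idea of a metamorphic relation described above can be sketched on a toy program. Everything in this example is hypothetical: `map_reads` is a deliberately simple exact-match read mapper written for illustration, not SeqMap or GNLab, and the two relations are plausible examples rather than the ones used in the paper. Each check compares a pair of outputs instead of testing a single output against a known correct answer.

```python
import random


def map_reads(reads, reference):
    """Toy exact-match mapper: {read: [start positions]} for reads found in reference."""
    hits = {}
    for read in reads:
        positions = [i for i in range(len(reference) - len(read) + 1)
                     if reference.startswith(read, i)]
        if positions:
            hits[read] = positions
    return hits


def mr_permutation(mapper, reads, reference, seed=0):
    """MR1: shuffling the order of the input reads must not change the result."""
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    return mapper(reads, reference) == mapper(shuffled, reference)


def mr_prefix_extension(mapper, reads, reference, prefix="TTTT"):
    """MR2: after prepending bases to the reference, every original hit
    must reappear shifted by len(prefix). New hits are allowed."""
    before = mapper(reads, reference)
    after = mapper(reads, prefix + reference)
    shift = len(prefix)
    return all(pos + shift in after.get(read, [])
               for read, positions in before.items()
               for pos in positions)
```

    Because both relations take the mapper as an argument, the same checks could be pointed at a wrapper around a real tool; a relation that returns False signals a fault without anyone having to know the correct mapping in advance.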

  11. Rice Transcriptome Analysis to Identify Possible Herbicide Quinclorac Detoxification Genes

    Directory of Open Access Journals (Sweden)

    Wenying Xu

    2015-09-01

    Full Text Available Quinclorac is a highly selective auxin-type herbicide that is widely used for the effective control of barnyard grass in paddy rice fields, improving the world’s rice yield. A herbicidal mode of action has been proposed for quinclorac, and hormone interactions affect quinclorac signaling. Because of its widespread use, quinclorac may be transported outside rice fields with the drainage waters, leading to soil and water pollution and environmental health problems. In this study, we used the 57K Affymetrix rice whole-genome array to identify quinclorac signaling response genes in order to study the molecular mechanisms of action and detoxification of quinclorac in rice plants. Overall, 637 probe sets were identified with differential expression levels under either 6 or 24 h of quinclorac treatment. Auxin-related genes such as GH3 and OsIAAs responded to quinclorac treatment. Gene Ontology analysis showed that detoxification-related gene families were significantly enriched, including cytochrome P450, GST, UGT, and ABC and drug transporter genes. Moreover, real-time RT-PCR analysis showed that top candidate P450 families such as CYP81, CYP709C and CYP72A genes were universally induced by different herbicides. Some Arabidopsis genes of the same P450 families were up-regulated under quinclorac treatment. We conducted a rice whole-genome GeneChip analysis and the first global identification of quinclorac response genes. This work may provide potential markers for detoxification of quinclorac and biomonitors of environmental chemical pollution.

  12. Towards a Methodology for Identifying Program Constraints During Requirements Analysis

    Science.gov (United States)

    Romo, Lilly; Gates, Ann Q.; Della-Piana, Connie Kubo

    1997-01-01

    Requirements analysis is the activity that involves determining the needs of the customer, identifying the services that the software system should provide and understanding the constraints on the solution. The result of this activity is a natural language document, typically referred to as the requirements definition document. Some of the problems that exist in defining requirements in large-scale software projects include synthesizing knowledge from various domain experts and communicating this information across multiple levels of personnel. One approach that addresses part of this problem is called context monitoring and involves identifying the properties of and relationships between objects that the system will manipulate. This paper examines several software development methodologies, discusses the support that each provides for eliciting such information from experts and specifying the information, and suggests refinements to these methodologies.

  13. Parameter trajectory analysis to identify treatment effects of pharmacological interventions.

    Directory of Open Access Journals (Sweden)

    Christian A Tiemann

    Full Text Available The field of medical systems biology aims to advance understanding of molecular mechanisms that drive disease progression and to translate this knowledge into therapies to effectively treat diseases. A challenging task is the investigation of long-term effects of a (pharmacological) treatment, to establish its applicability and to identify potential side effects. We present a new modeling approach, called Analysis of Dynamic Adaptations in Parameter Trajectories (ADAPT), to analyze the long-term effects of a pharmacological intervention. A concept of time-dependent evolution of model parameters is introduced to study the dynamics of molecular adaptations. The progression of these adaptations is predicted by identifying necessary dynamic changes in the model parameters to describe the transition between experimental data obtained during different stages of the treatment. The trajectories provide insight into the affected underlying biological systems and identify the molecular events that should be studied in more detail to unravel the mechanistic basis of treatment outcome. Modulating effects caused by interactions with the proteome and transcriptome levels, which are often less well understood, can be captured by the time-dependent descriptions of the parameters. ADAPT was employed to identify metabolic adaptations induced upon pharmacological activation of the liver X receptor (LXR), a potential drug target to treat or prevent atherosclerosis. The trajectories were investigated to study the cascade of adaptations. This provided a counter-intuitive insight concerning the function of scavenger receptor class B1 (SR-B1), a receptor that facilitates the hepatic uptake of cholesterol. Although activation of LXR promotes cholesterol efflux and excretion, our computational analysis showed that the hepatic capacity to clear cholesterol was reduced upon prolonged treatment. This prediction was confirmed experimentally by immunoblotting measurements of SR-B1

  14. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    Directory of Open Access Journals (Sweden)

    Kristina M. Obom

    2009-12-01

    Full Text Available The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation.

  15. BioRuby: Bioinformatics software for the Ruby programming language

    NARCIS (Netherlands)

    Goto, N.; Prins, J.C.P.; Nakao, M.; Bonnal, R.; Aerts, J.; Katayama, A.

    2010-01-01

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it suppor

  16. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  17. Lidar point density analysis: implications for identifying water bodies

    Science.gov (United States)

    Worstell, Bruce B.; Poppenga, Sandra; Evans, Gayla A.; Prince, Sandra

    2014-01-01

    Most airborne topographic light detection and ranging (lidar) systems operate within the near-infrared spectrum. Laser pulses from these systems frequently are absorbed by water and therefore do not generate reflected returns on water bodies in the resulting void regions within the lidar point cloud. Thus, an analysis of lidar voids has implications for identifying water bodies. Data analysis techniques to detect reduced lidar return densities were evaluated for test sites in Blackhawk County, Iowa, and Beltrami County, Minnesota, to delineate contiguous areas that have few or no lidar returns. Results from this study indicated a 5-meter radius moving window with fewer than 23 returns (28 percent of the moving window) was sufficient for delineating void regions. Techniques to provide elevation values for void regions to flatten water features and to force channel flow in the downstream direction also are presented.
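
    The moving-window rule reported above (a 5-meter radius window holding fewer than 23 returns marks a void) can be sketched as follows. This is a minimal brute-force illustration under stated assumptions, not the authors' workflow: the function names, the regular grid of window centers and the planar (x, y) coordinates in meters are all hypothetical, and a real point cloud would need a spatial index rather than a scan over every point.

```python
def count_returns(points, cx, cy, radius):
    """Count lidar returns within `radius` of the window center (cx, cy)."""
    r2 = radius * radius
    return sum(1 for x, y in points if (x - cx) ** 2 + (y - cy) ** 2 <= r2)


def void_cells(points, x_range, y_range, step=1.0, radius=5.0, min_returns=23):
    """Return window centers whose circular window holds fewer than
    `min_returns` returns. Defaults follow the thresholds quoted in the abstract."""
    x0, x1 = x_range
    y0, y1 = y_range
    nx = int((x1 - x0) / step) + 1
    ny = int((y1 - y0) / step) + 1
    voids = []
    for i in range(nx):
        for j in range(ny):
            # Index-based centers avoid accumulating floating-point error.
            cx, cy = x0 + i * step, y0 + j * step
            if count_returns(points, cx, cy, radius) < min_returns:
                voids.append((cx, cy))
    return voids
```

    Grouping adjacent flagged centers into contiguous regions, which the abstract describes as delineating void areas, would be a further step on top of this sketch.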

  18. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    Science.gov (United States)

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771
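
    The notion of an overrepresented motif in the promoters of coregulated genes can be illustrated with a small k-mer comparison. This sketch is not the software package the chapter refers to; the function names, the pseudocount and the ratio threshold are assumptions chosen for the example, and a real analysis would use a proper statistical test and longer promoter sequences.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count in how many sequences each k-mer occurs at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def enriched_kmers(foreground, background, k=6, min_ratio=2.0, pseudo=1.0):
    """Return (k-mer, ratio) pairs whose presence fraction in the foreground
    promoters is at least `min_ratio` times that in the background set.
    `pseudo` is a pseudocount that avoids division by zero."""
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    results = []
    for kmer, n in fg.items():
        fg_frac = (n + pseudo) / (len(foreground) + pseudo)
        bg_frac = (bg.get(kmer, 0) + pseudo) / (len(background) + pseudo)
        ratio = fg_frac / bg_frac
        if ratio >= min_ratio:
            results.append((kmer, ratio))
    # Highest ratio first; ties broken alphabetically for stable output.
    return sorted(results, key=lambda kr: (-kr[1], kr[0]))
```

    Here `foreground` would hold the promoter sequences of the coregulated genes and `background` a comparison set of promoters; both are plain uppercase DNA strings in this sketch.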

  19. Preliminary Study of Bioinformatics Patents and Their Classifications Registered in the KIPRIS Database

    OpenAIRE

    Park, Hyun-Seok

    2012-01-01

    Whereas a vast amount of new information on bioinformatics is made available to the public through patents, only a small set of patents are cited in academic papers. A detailed analysis of registered bioinformatics patents, using the existing patent search system, can provide valuable information links between science and technology. However, it is extremely difficult to select keywords to capture bioinformatics patents, reflecting the convergence of several underlying technologies. No single...

  20. Longitudinal Metagenomic Analysis of Hospital Air Identifies Clinically Relevant Microbes

    Science.gov (United States)

    King, Paula; Pham, Long K.; Waltz, Shannon; Sphar, Dan; Yamamoto, Robert T.; Conrad, Douglas; Taplitz, Randy; Torriani, Francesca

    2016-01-01

    We describe the sampling of sixty-three uncultured hospital air samples collected over a six-month period and analysis using shotgun metagenomic sequencing. Our primary goals were to determine the longitudinal metagenomic variability of this environment, identify and characterize genomes of potential pathogens and determine whether they are atypical to the hospital airborne metagenome. Air samples were collected from eight locations which included patient wards, the main lobby and outside. The resulting DNA libraries produced 972 million sequences representing 51 gigabases. Hierarchical clustering of samples by the most abundant 50 microbial orders generated three major nodes which primarily clustered by type of location. Because the indoor locations were longitudinally consistent, episodic relative increases in microbial genomic signatures related to the opportunistic pathogens Aspergillus, Penicillium and Stenotrophomonas were identified as outliers at specific locations. Further analysis of microbial reads specific for Stenotrophomonas maltophilia indicated homology to a sequenced multi-drug resistant clinical strain and we observed broad sequence coverage of resistance genes. We demonstrate that a shotgun metagenomic sequencing approach can be used to characterize the resistance determinants of pathogen genomes that are uncharacteristic for an otherwise consistent hospital air microbial metagenomic profile. PMID:27482891

  1. Use of discriminant analysis to identify propensity for purchasing properties

    Directory of Open Access Journals (Sweden)

    Ricardo Floriani

    2015-03-01

    Full Text Available Properties usually represent a milestone for people and families because of their high value compared with family income. The objective of this study is to propose a discrimination model, built by discriminant analysis, of people whose characteristics (according to independent variables) classify them as potential buyers of properties, and to identify the intended use of such property: housing, leisure activities (such as a cottage or beach house) and/or investment. Thus, the following research question is proposed: What are the characteristics that best describe the profile of people who intend to acquire properties? The study is justified by its economic relevance to the real estate industry, as well as to players in the real estate market, who may develop products based on the profile of potential customers. As a statistical technique, discriminant analysis was applied to data gathered by a questionnaire sent via e-mail. Three hundred and thirty-four responses were gathered. Based on this study, it was observed that it is possible to identify the intention to acquire properties, as well as the purpose of the acquisition, for housing or investment.

  2. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    Full Text Available INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  3. Application of bioinformatics in tropical medicine

    Institute of Scientific and Technical Information of China (English)

    Wiwanitkit V

    2008-01-01

    Bioinformatics is the use of information technology to help solve biological problems by designing novel and incisive algorithms and methods of analysis. Bioinformatics has become a vital discipline in the post-genomic era. In this review article, the application of bioinformatics in tropical medicine is presented and discussed.

  4. Bioinformatics decoding the genome

    CERN Document Server

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA From Genome to Sequence : Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  5. Review of bioinformatics data analysis in alternative splicing

    Institute of Scientific and Technical Information of China (English)

    章天骄

    2012-01-01

    Alternative pre-mRNA splicing is an important gene regulation mechanism for expanding proteomic diversity in higher eukaryotes. The misregulation of alternative splicing underlies many human diseases. With the development of high-throughput technology, bioinformatics has become the main method for studying alternative splicing. This article summarizes the bioinformatics methods used in alternative splicing research, and analyzes and predicts the direction in which the field is developing.

  6. Analysis of an Image Secret Sharing Scheme to Identify Cheaters

    Directory of Open Access Journals (Sweden)

    Jung-San Lee

    2010-09-01

    Full Text Available Secret image sharing mechanisms have been widely applied in the military, e-commerce, and communications fields. Zhao et al. recently introduced the concept of cheater detection into image sharing schemes. This functionality enables the image owner and authorized members to identify a cheater during reconstruction of the secret image. Here, we provide an analysis of Zhao et al.'s method: an authorized participant is able to restore the secret image by him/herself, which contradicts the requirement of secret image sharing schemes. Although the authorized participant relies on an exhaustive search to do so, simulation results show that it can be done within a reasonable time period.

  7. Bioinformatics analysis of the BRX gene family in grape

    Institute of Scientific and Technical Information of China (English)

    李文芳; 陈佰鸿; 毛娟; 马宗桓; 杨世茂

    2015-01-01

    The BRX gene family is a class of transcription factors present only in plants; in Arabidopsis it plays an important role in regulating cell proliferation and root elongation. Using bioinformatics approaches, the BRX gene family in the grape genome was cloned in silico, and its genome localization, protein structure, physical and chemical characteristics, secondary structure, and subcellular localization were predicted and analyzed. The evolutionary relationships with BRX gene families from other plants were also predicted. Genome mapping showed that the 6 BRX genes in the grape genome are located on 3 chromosomes: VvBRX1 and VvBRX2 on chromosome 2, VvBRX3 and VvBRX4 on chromosome 9, and VvBRX5 and VvBRX6 on chromosome 11. The encoded proteins contain 360-560 amino acids. The relative molecular weight (61,884.4) and pI value (9.38) of VvBRX5 were the highest, while the relative molecular weight (40,239.1) and pI value (6.23) of VvBRX1 were the lowest. The amino acid sequences differed somewhat between members, but all were hydrophobic proteins. The 6 BRX amino acid sequences consist mainly of alpha helix and random coil and have neither transmembrane domains nor signal peptides. Gene structure analysis showed that all 6 BRX genes contain exons and introns. Subcellular localization analysis showed that all six VvBRX genes are located in the nucleus. Phylogenetic analysis showed that VvBRX1 and VvBRX2 were most closely related to Populus euphratica, with 96% similarity. VvBRX3 and VvBRX4 clustered with Ricinus communis, Jatropha curcas, Citrus sinensis, Theobroma cacao and Glycine max, indicating a close evolutionary relationship. VvBRX5 was clearly separated from the other VvBRX genes. VvBRX6 was most closely related to Nelumbo nucifera. These results lay a foundation for the cloning and functional analysis of the grape BRX gene family.

  9. Social network analysis in identifying influential webloggers: A preliminary study

    Science.gov (United States)

    Hasmuni, Noraini; Sulaiman, Nor Intan Saniah; Zaibidi, Nerda Zura

    2014-12-01

    In recent years, second-generation internet-based services such as weblogs have become an effective communication tool for publishing information on the Web. Weblogs have unique characteristics that deserve users' attention. Some webloggers see weblogs as an appropriate medium for starting and expanding a business. These webloggers, also known as direct profit-oriented webloggers (DPOWs), communicate and share knowledge with each other through social interaction. However, survivability is the main issue among DPOWs. Frequent communication with influential webloggers is one way to survive as a DPOW. This paper aims to understand the network structure and identify influential webloggers within the network. A proper understanding of the network structure can help us learn how information is exchanged among members and enhance survivability among DPOWs. Thirty DPOWs were involved in this study. Degree centrality and betweenness centrality measures from Social Network Analysis (SNA) were used to examine the strength of relations and identify influential webloggers within the network. Webloggers with the highest values of these measures are considered the most influential in the network.
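
The two measures named in this abstract can be illustrated with a minimal sketch. It assumes an undirected, unweighted network stored as an adjacency dictionary; the five-node `network` and its node labels are hypothetical and are not data from the study. Betweenness is computed with Brandes' algorithm, which is a standard choice and not necessarily the procedure the authors used.

```python
from collections import deque

def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def betweenness_centrality(graph):
    """Unnormalized betweenness (Brandes' algorithm, unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []                          # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}      # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}       # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                        # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                        # accumulate dependencies back towards s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical toy network: A is a hub, D bridges to E.
network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}
deg = degree_centrality(network)
btw = betweenness_centrality(network)
most_influential = max(network, key=lambda v: (btw[v], deg[v]))
```

In this toy network node A has the highest value on both measures, which is the criterion the abstract uses to call a weblogger influential.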

  10. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Tammi Martti; Ranganathan Shoba; Gribskov Michael; Tan Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  11. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    Science.gov (United States)

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all completely sequenced bacterial genomes, followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functionally related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to possess a specific HSD activity, we could annotate the corresponding HSD protein. The dominating phyla that were identified to express HSDs were Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryarchaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbon-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  12. Application of Bioinformatics and Systems Biology in Medicinal Plant Studies

    Institute of Scientific and Technical Information of China (English)

    DENG You-ping; AI Jun-mei; XIAO Pei-gen

    2010-01-01

    One important purpose of investigating medicinal plants is to understand the genes and enzymes that govern the metabolic processes producing bioactive compounds. Genome-wide high-throughput technologies such as genomics, transcriptomics, proteomics and metabolomics can help reach that goal. Such technologies produce vast amounts of data, which need bioinformatics and systems biology to be processed, managed, distributed and understood. By dealing with these "omics" data, bioinformatics and systems biology can also help improve the quality of traditional medicinal materials, develop new approaches for the classification and authentication of medicinal plants, identify new active compounds, and cultivate medicinal plant species that tolerate harsh environmental conditions. In this review, the application of bioinformatics and systems biology in medicinal plants is briefly introduced.

  13. CROSSWORK for Glycans: Glycan Identification Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

    We have developed "GLYCANthrope" - CROSSWORKS for glycans: a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  14. The Bioinformatic Analysis of the blcap Gene

    Institute of Scientific and Technical Information of China (English)

    刘娟; 熊金虎; 伍欣星

    2004-01-01

    BLCAP is a potential suppressor gene for cervical carcinoma, found by analysing cervical carcinoma specimens with an oncogene and anti-oncogene cDNA microarray. Based on bioinformatic analyses, we try to predict the function of the blcap gene. The results show that several genes closely resemble blcap. The sequence similarity between blcap and Homo sapiens mRNA (DKFZp564M053) or BC10 is 99% and 87%, respectively. The protein encoded by BLCAP is composed of Leu (19.5%), Pro (9.19%), Ser (8.04%), Cys (8.04%) and other amino acids. The secondary structure of the N-terminal of the BLCAP-encoded protein is an alpha helix; the C-terminal is beta sheet and the middle is coil. The terminal regions are more hydrophobic than the middle region. Between 45-55aa there is a transmembrane region. Therefore, we predict that BLCAP is a type I transmembrane protein. By analyzing the signal peptide and processing of the blcap gene product with the program SignalP (V1.1), we found a cleavage site at 59-66aa. Using the program Netpho, we predicted that there might be three phosphorylation sites at 68aa, 73aa and 78aa. At 78-81aa, we found a typical [ST]-x(2)-[DE] structure, the phosphorylation site of tyrosine protein kinase, which might be related to its function. Bioinformatic studies of blcap provide the foundation for laboratory research on the function of BLCAP.
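
The [ST]-x(2)-[DE] pattern mentioned in this abstract translates directly into a regular expression. The sketch below is only an illustration of such a motif scan: the helper name `find_motif_sites` and the 12-residue `toy_sequence` are hypothetical, and the sequence is not the BLCAP protein.

```python
import re

# [ST]-x(2)-[DE]: Ser or Thr, any two residues, then Asp or Glu.
# The lookahead makes overlapping hits visible as separate matches.
MOTIF = re.compile(r"(?=([ST]..[DE]))")

def find_motif_sites(protein):
    """Return (1-based start position, matched 4-mer) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein)]

toy_sequence = "MKSAAEGTLLDP"  # hypothetical sequence, for illustration only
sites = find_motif_sites(toy_sequence)
```

For the toy sequence this reports hits starting at residues 3 (SAAE) and 8 (TLLD). A real analysis would run the same scan over the full protein sequence retrieved from a sequence database.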

  15. Bio-informatics analysis of a gene co-expression module in adipose tissue containing the diet-responsive gene Nnat

    Directory of Open Access Journals (Sweden)

    Withers Dominic J

    2010-12-01

    Full Text Available Abstract Background Obesity causes insulin resistance in target tissues - skeletal muscle, adipose tissue, liver and the brain. Insulin resistance predisposes to type-2 diabetes (T2D) and cardiovascular disease (CVD). Adipose tissue inflammation is an essential characteristic of obesity and insulin resistance. Neuronatin (Nnat) expression has been found to be altered in a number of conditions related to inflammatory or metabolic disturbance, but its physiological roles and regulatory mechanisms in adipose tissue, brain, pancreatic islets and other tissues are not understood. Results We identified transcription factor binding sites (TFBS) conserved in the Nnat promoter, and transcription factors (TF) abundantly expressed in adipose tissue. These include transcription factors concerned with the control of: adipogenesis (Pparγ, Klf15, Irf1, Creb1, Egr2, Gata3); lipogenesis (Mlxipl, Srebp1c); inflammation (Jun, Stat3); insulin signalling and diabetes susceptibility (Foxo1, Tcf7l2). We also identified NeuroD1, the only documented TF that controls Nnat expression. We identified KEGG pathways significantly associated with Nnat expression, including positive correlations with inflammation and negative correlations with metabolic pathways (most prominently oxidative phosphorylation, glycolysis and gluconeogenesis, pyruvate metabolism) and protein turnover. 27 genes whose expression co-varies with Nnat were identified, including Gstt1 and Sod3, concerned with oxidative stress; Sncg and Cxcl9, concerned with inflammation; and Ebf1, Lgals12 and Fzd4, involved in adipogenesis; conserved transcription factor binding sites were identified on their promoters. Functional networks relating to each of these genes were identified. Conclusions Our analysis shows that Nnat is an acute diet-responsive gene in white adipose tissue and hypothalamus; it may play an important role in metabolism, adipogenesis, and resolution of oxidative stress and inflammation in response to dietary

  16. Bioinformatics and Microarray Analysis of miRNAs in Aged Female Mice Model Implied New Molecular Mechanisms for Impaired Fracture Healing

    Science.gov (United States)

    He, Bing; Zhang, Zong-Kang; Liu, Jin; He, Yi-Xin; Tang, Tao; Li, Jie; Guo, Bao-Sheng; Lu, Ai-Ping; Zhang, Bao-Ting; Zhang, Ge

    2016-01-01

    Impaired fracture healing in aged females is still a challenge in clinics. MicroRNAs (miRNAs) play important roles in fracture healing. This study aims to identify the miRNAs that potentially contribute to the impaired fracture healing in aged females. Transverse femoral shaft fractures were created in adult and aged female mice. At post-fracture 0-, 2- and 4-week, the fracture sites were scanned by micro computed tomography to confirm that the fracture healing was impaired in aged female mice and the fracture calluses were collected for miRNA microarray analysis. A total of 53 significantly differentially expressed miRNAs and 5438 miRNA-target gene interactions involved in bone fracture healing were identified. A novel scoring system was designed to analyze the miRNA contribution to impaired fracture healing (RCIFH). Using this method, 11 novel miRNAs were identified to impair fracture healing at 2- or 4-week post-fracture. Thereafter, function analysis of target genes was performed for miRNAs with high RCIFH values. The results showed that high RCIFH miRNAs in aged female mice might impair fracture healing not only by down-regulating angiogenesis-, chondrogenesis-, and osteogenesis-related pathways, but also by up-regulating osteoclastogenesis-related pathway, which implied the essential roles of these high RCIFH miRNAs in impaired fracture healing in aged females, and might promote the discovery of novel therapeutic strategies. PMID:27527150

  17. Performance Analysis: Work Control Events Identified January - August 2010

    Energy Technology Data Exchange (ETDEWEB)

    De Grange, C E; Freeman, J W; Kerr, C E; Holman, G; Marsh, K; Beach, R

    2011-01-14

    This performance analysis evaluated 24 events that occurred at LLNL from January through August 2010. The analysis identified areas of potential work control process and/or implementation weaknesses and several common underlying causes. Human performance improvement and safety culture factors were part of the causal analysis of each event and were analyzed. The collective significance of all events in 2010, as measured by the occurrence reporting significance category and by the proportion of events that have been reported to the DOE ORPS under the "management concerns" reporting criteria, does not appear to have increased in 2010. The frequency of reporting in each of the significance categories has not changed in 2010 compared to the previous four years. There is no change indicating a trend in the significance category and there has been no increase in the proportion of occurrences reported in the higher significance category. Also, the frequency of events, 42 events reported through August 2010, is not greater than in previous years and is below the average of 63 occurrences per year at LLNL since 2006. Over the previous four years, an average of 43% of LLNL's reported occurrences have been reported as either "management concerns" or "near misses." In 2010, 29% of the occurrences have been reported as "management concerns" or "near misses." This rate indicates that LLNL is now reporting fewer "management concern" and "near miss" occurrences compared to the previous four years. From 2008 to the present, LLNL senior management has undertaken a series of initiatives to strengthen the work planning and control system with the primary objective to improve worker safety. In 2008, the LLNL Deputy Director established the Work Control Integrated Project Team to develop the core requirements and graded

  18. Directional reflectance analysis for identifying counterfeit drugs: Preliminary study.

    Science.gov (United States)

    Wilczyński, Sławomir; Koprowski, Robert; Błońska-Fajfrowska, Barbara

    2016-05-30

    The WHO estimates that up to 10% of drugs on the market may be counterfeit. To keep drug counterfeiting from intensifying, methods for distinguishing genuine medicines from fake ones need to be developed. The aim of this study was to develop a simple, reproducible and inexpensive method for distinguishing between original and counterfeit medicines based on the measurement of directional reflectance. The directional reflectance of 6 original Viagra® tablets (Pfizer) and 24 counterfeit tablets (4 different batches) imitating Viagra® was examined in six spectral bands: from 0.9 to 1.1 μm, from 1.9 to 2.6 μm, from 3.0 to 4.0 μm, from 3.0 to 5.0 μm, from 4.0 to 5.0 μm, from 8.0 to 12.0 μm, and for two angles of incidence, 20° and 60°. A directional hemispherical reflectometer was used to measure directional reflectance. Statistically significant differences between the directional reflectance of the original Viagra® and the counterfeit tablets were registered. Any difference in the value of directional reflectance for any spectral band or angle of incidence identifies the drug as a fake. The proposed method of directional reflectance analysis makes it possible to differentiate between real Viagra® and fake tablets. Directional reflectance analysis is a fast (measurement time under 5 s), cheap and reproducible method which does not require expensive equipment or specialized laboratory staff. It also seems to be an effective method; however, its effectiveness will be assessed after the research is extended. PMID:26977587

  19. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    Science.gov (United States)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regard to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five case studies of undergraduate bioinformatics degrees, educational implications of bioinformatics as a technoscience, and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  20. A Sensitivity Analysis Approach to Identify Key Environmental Performance Factors

    Directory of Open Access Journals (Sweden)

    Xi Yu

    2014-01-01

    Full Text Available Life cycle assessment (LCA) has been widely used in the design phase over the last two decades to reduce a product's environmental impacts through the whole product life cycle (PLC). Traditional LCA is restricted to assessing the environmental impacts of a product, and its results cannot reflect the effects of changes within the life cycle. To improve the quality of ecodesign, there is a growing need for an approach that can reflect the relationship between changes in the design parameters and the product's environmental impacts. A sensitivity analysis approach based on LCA and ecodesign is proposed in this paper. The key environmental performance factors which have significant influence on the product's environmental impacts can be identified by analyzing the relationship between environmental impacts and the design parameters. Users without much environmental knowledge can use this approach to determine which design parameter should be considered first when (re)designing a product. A printed circuit board (PCB) case study is conducted; eight design parameters are chosen to be analyzed by our approach. The result shows that the carbon dioxide emission during PCB manufacture is highly sensitive to the area of the PCB panel.
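
One common way to rank design parameters by influence is a one-at-a-time relative sensitivity coefficient, (ΔE/E)/(Δp/p). The sketch below illustrates that general idea only; the abstract does not say which sensitivity measure the authors used. The function `toy_emission_model`, its coefficients and the parameter names are all hypothetical and are not taken from the PCB case study.

```python
def relative_sensitivity(model, params, name, step=0.01):
    """Relative change in model output per relative change in one parameter."""
    base = model(params)
    bumped = dict(params)
    bumped[name] = params[name] * (1 + step)   # perturb a single parameter
    return ((model(bumped) - base) / base) / step

def toy_emission_model(p):
    # Hypothetical linear stand-in for an LCA impact calculation.
    return 3.0 * p["panel_area"] + 0.2 * p["drill_holes"]

design = {"panel_area": 10.0, "drill_holes": 5.0}  # hypothetical values
sensitivities = {
    name: relative_sensitivity(toy_emission_model, design, name)
    for name in design
}
key_factor = max(sensitivities, key=sensitivities.get)
```

The parameter with the largest coefficient is the one a designer would look at first, which mirrors the ranking use described in the abstract.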

  1. Identifying redundancy and exposing provenance in crowdsourced data analysis.

    Science.gov (United States)

    Willett, Wesley; Ginosar, Shiry; Steinitz, Avital; Hartmann, Björn; Agrawala, Maneesh

    2013-12-01

    We present a system that lets analysts use paid crowd workers to explore data sets and helps analysts interactively examine and build upon workers' insights. We take advantage of the fact that, for many types of data, independent crowd workers can readily perform basic analysis tasks like examining views and generating explanations for trends and patterns. However, workers operating in parallel can often generate redundant explanations. Moreover, because workers have different competencies and domain knowledge, some responses are likely to be more plausible than others. To efficiently utilize the crowd's work, analysts must be able to quickly identify and consolidate redundant responses and determine which explanations are the most plausible. In this paper, we demonstrate several crowd-assisted techniques to help analysts make better use of crowdsourced explanations: (1) We explore crowd-assisted strategies that utilize multiple workers to detect redundant explanations. We introduce color clustering with representative selection--a strategy in which multiple workers cluster explanations and we automatically select the most-representative result--and show that it generates clusterings that are as good as those produced by experts. (2) We capture explanation provenance by introducing highlighting tasks and capturing workers' browsing behavior via an embedded web browser, and refine that provenance information via source-review tasks. We expose this information in an explanation-management interface that allows analysts to interactively filter and sort responses, select the most plausible explanations, and decide which to explore further. PMID:24051786

  2. Probabilistic models and machine learning in structural bioinformatics

    DEFF Research Database (Denmark)

    Hamelryck, Thomas

    2009-01-01

    Structural bioinformatics is concerned with the molecular structure of biomacromolecules on a genomic scale, using computational methods. Classic problems in structural bioinformatics include the prediction of protein and RNA structure from sequence, the design of artificial proteins or enzymes...... and experimental determination of macromolecular structure that are based on such methods. These developments include generative models of protein structure, the estimation of the parameters of energy functions that are used in structure prediction, the superposition of macromolecules and structure...... bioinformatics. Recently, probabilistic models and machine learning methods based on Bayesian principles are providing efficient and rigorous solutions to challenging problems that were long regarded as intractable. In this review, I will highlight some important recent developments in the prediction, analysis...

  3. Approaches in integrative bioinformatics towards the virtual cell

    CERN Document Server

    Chen, Ming

    2014-01-01

    Approaches in Integrative Bioinformatics provides a basic introduction to biological information systems, as well as guidance for the computational analysis of systems biology. This book also covers a range of issues and methods that reveal the multitude of omics data integration types and the relevance that integrative bioinformatics has today. Topics include biological data integration and manipulation, modeling and simulation of metabolic networks, transcriptomics and phenomics, and virtual cell approaches, as well as a number of applications of network biology. It helps to illustrat

  4. Bioinformatics-driven identification and examination of candidate genes for non-alcoholic fatty liver disease.

    Directory of Open Access Journals (Sweden)

    Karina Banasik

    Full Text Available OBJECTIVE: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. RESEARCH DESIGN AND METHODS: By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression, a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs) which captured all common variation in these genes were genotyped in 10,196 Danes, and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D), central obesity, and WHO-defined metabolic syndrome (MetS). RESULTS: 273 genes were included in the protein-protein interaction analysis and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominally statistically significant associations (P<0.05) to quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations. CONCLUSIONS: Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.
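
The effect of a Bonferroni adjustment on nominally significant results can be shown with a minimal sketch. The p-values below are hypothetical, and the number of tests in the study is not given in the abstract, so the example only illustrates the mechanism: each p-value is multiplied by the number of tests and capped at 1.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and a significance flag for each."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant

raw_p = [0.02, 0.03, 0.049, 0.2]          # hypothetical p-values
nominal_hits = sum(p < 0.05 for p in raw_p)
adjusted_p, still_significant = bonferroni(raw_p)
```

With these toy values three tests are nominally significant before adjustment and none remain significant afterwards, the same pattern the abstract reports for its associations.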

  5. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  6. Isolation, characterization, and bioinformatic analysis of calmodulin-binding protein cmbB reveals a novel tandem IP22 repeat common to many Dictyostelium and Mimivirus proteins.

    Science.gov (United States)

    O'Day, Danton H; Suhre, Karsten; Myre, Michael A; Chatterjee-Chakraborty, Munmun; Chavez, Sara E

    2006-08-01

    A novel calmodulin-binding protein cmbB from Dictyostelium discoideum is encoded in a single gene. Northern analysis reveals two cmbB transcripts first detectable at 4 h during multicellular development. Western blotting detects an approximately 46.6 kDa protein. Sequence analysis and calmodulin-agarose binding studies identified a "classic" calcium-dependent calmodulin-binding domain (179IPKSLRSLFLGKGYNQPLEF198) but structural analyses suggest binding may not involve classic alpha-helical calmodulin-binding. The cmbB protein is comprised of tandem repeats of a newly identified IP22 motif ([I,L]Pxxhxxhxhxxxhxxxhxxxx; where h = any hydrophobic amino acid) that is highly conserved and a more precise representation of the FNIP repeat. At least eight Acanthamoeba polyphaga Mimivirus proteins and over 100 Dictyostelium proteins contain tandem arrays of the IP22 motif and its variants. cmbB also shares structural homology to YopM, from the plague bacterium Yersinia pestis. PMID:16777069
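
    The IP22 motif quoted above ([I,L]Pxxhxxhxhxxxhxxxhxxxx) can be written as a regular expression. The sketch below is only an illustration under stated assumptions: the abstract defines h as "any hydrophobic amino acid" without listing residues, so the set `AVILMFWY` is an assumption, and the test sequence is synthetic rather than a real cmbB sequence. The helper names `ip22_regex` and `find_ip22` are hypothetical.

```python
import re

# Assumption: residues treated as hydrophobic ("h" in the motif).
HYDROPHOBIC = "AVILMFWY"


def ip22_regex(hydrophobic=HYDROPHOBIC):
    """Compile [I,L]Pxxhxxhxhxxxhxxxhxxxx (22 residues) as a regex."""
    h = f"[{hydrophobic}]"
    x = "[A-Z]"  # any residue, one-letter code
    return re.compile(
        f"[IL]P{x}{{2}}{h}{x}{{2}}{h}{x}{h}{x}{{3}}{h}{x}{{3}}{h}{x}{{4}}"
    )


def find_ip22(sequence):
    """Return (start, matched_text) for non-overlapping IP22-like matches."""
    return [(m.start(), m.group()) for m in ip22_regex().finditer(sequence.upper())]


# Synthetic 22-residue unit repeated in tandem (not a real protein).
unit = "IPKSLRSLFLGKGVNQPLEFKT"
for start, text in find_ip22("MK" + unit * 2 + "GG"):
    print(start, text)
```

    Because `finditer` returns non-overlapping matches, tandem copies of the motif are reported one after another; a different hydrophobic set can be passed to `ip22_regex` to test motif variants.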

  7. Bioinformatics Analysis of Plakoglobin Gene and Protein

    Institute of Scientific and Technical Information of China (English)

    任晨霞; 曹文君

    2016-01-01

    Objective: To analyze the Jup gene and its protein with bioinformatics, providing a theoretical basis for studying Jup gene function and its role in the formation and development of cardiomyopathy. Methods: Bioinformatics databases and software were applied to analyze the gene structure and single nucleotide polymorphisms of Jup, and the physicochemical properties, secondary structure, sequence conservation, and protein interaction network of the JUP protein. Results: Eleven SNPs were found in the coding region of human Jup, including five missense mutations. Jup encodes a polypeptide of 745 amino acid residues that is hydrophilic and of low stability. Its main secondary structure element is the alpha helix; it is highly conserved in evolution and belongs to the ARM superfamily. The genes and proteins interacting with JUP are mainly desmosome components and components of the classical cadherin signaling pathway. Conclusion: Mutations of the Jup gene and changes in JUP protein expression can cause related cardiomyopathies. This paper presents a systematic bioinformatics analysis of the Jup gene and its protein, laying a foundation for further experimental study of its regulatory mechanisms in the formation and development of cardiomyopathy.

  8. Identifying Phytoplankton Classes In California Reservoirs Using HPLC Pigment Analysis

    Science.gov (United States)

    Siddiqui, S.; Peacock, M. B.; Kudela, R. M.; Negrey, K.

    2014-12-01

    Few bodies of water are routinely monitored for phytoplankton composition due to monetary and time constraints, especially the less accessible bodies of water in central and southern California. These lakes and estuaries are important for economic reasons such as tourism and fishing. This project investigated the composition of phytoplankton present using pigment analysis to identify dominant phytoplankton groups. A total of 28 different sites with a wide range of salinity (0 - 60) in central and southern California were examined. These included 13 different bodies of water in central California: 6 in the Sierras, 7 in the San Francisco Bay Estuary, and 15 from southern California. The samples were analyzed using high-performance liquid-chromatography (HPLC) to quantify the pigments present (using retention time and the spectral thumbprint). Diagnostic pigments were used to indicate the phytoplankton class composition, focusing on diatoms, dinoflagellates, cryptophytes, and cyanobacteria - all key phytoplankton groups indicative of the health of the sampled reservoir. Our results indicated that cyanobacteria dominated four of the seven bodies of central California water (Mono Lake, Bridgeport Reservoir, Steamboat Slough, and Pinto Lake); cryptophytes and nannoflagellates dominated two of the central California bodies of water (Mare Island Strait and Topaz Lake); and diatoms and dinoflagellates dominated one central California body of water, Oakland Inner Harbor, comprising more than 70% of the phytoplankton present. We expect the bodies of water from Southern California to be as disparate. Though this data is only a snapshot, it has significant implications in comparing different ecosystems across California, and it has the potential to provide valuable insight into the composition of phytoplankton communities.

  9. Bioinformatics in Africa: The Rise of Ghana?

    Directory of Open Access Journals (Sweden)

    Thomas K Karikari

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  10. Technical phosphoproteomic and bioinformatic tools useful in cancer research

    Directory of Open Access Journals (Sweden)

    López Elena

    2011-10-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools.

  11. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    Science.gov (United States)

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  12. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  13. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    Science.gov (United States)

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or less monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. PMID:27318307

  14. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    Science.gov (United States)

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
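
    The first step the abstract describes, finding genes that harbour a submitted cis-sequence in a defined promoter region, amounts to a motif scan on both DNA strands. The sketch below illustrates that idea only; it is not the PathoPlant implementation, the promoter string is made up, and the function names are hypothetical. Only the WT-box sequence CGACTTTT comes from the abstract.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(promoter, motif="CGACTTTT"):
    """Return sorted (position, strand) hits; positions are 0-based on the given strand."""
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step one base forward so overlapping occurrences are also found.
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


# Made-up promoter fragment containing the WT-box on each strand.
promoter = "ATCGACTTTTGGAAAAGTCGTT"
print(find_motif(promoter))
```

    A palindromic motif would be reported once per strand at the same position, so a real pipeline would likely deduplicate such hits before correlating genes with expression data.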

  15. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode polyketides important in pathogenicity. PMID:27388157

  16. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that

  17. Integrative Functional Genomics Analysis of Sustained Polyploidy Phenotypes in Breast Cancer Cells Identifies an Oncogenic Profile for GINS2

    Directory of Open Access Journals (Sweden)

    Juha K. Rantala

    2010-11-01

    Aneuploidy is among the most obvious differences between normal and cancer cells. However, mechanisms contributing to development and maintenance of aneuploid cell growth are diverse and incompletely understood. Functional genomics analyses have shown that aneuploidy in cancer cells is correlated with diffuse gene expression signatures and aneuploidy can arise by a variety of mechanisms, including cytokinesis failures, DNA endoreplication, and possibly through polyploid intermediate states. To identify molecular processes contributing to development of aneuploidy, we used a cell spot microarray technique to identify genes inducing polyploidy and/or allowing maintenance of polyploid cell growth in breast cancer cells. Of 5760 human genes screened, 177 were found to induce severe DNA content alterations on prolonged transient silencing. Response to DNA damage stimulus and DNA repair were found to be the most enriched cellular processes among the candidate genes. Functional validation analysis of these genes highlighted GINS2 as the highest ranking candidate inducing polyploidy, accumulation of endogenous DNA damage, and impairing cell proliferation on inhibition. The cell growth inhibition and induction of polyploidy by suppression of GINS2 was verified in a panel of breast cancer cell lines. Bioinformatic analysis of published gene expression and DNA copy number studies of clinical breast tumors suggested GINS2 to be associated with the aggressive characteristics of a subgroup of breast cancers in vivo. In addition, nuclear GINS2 protein levels distinguished actively proliferating cancer cells suggesting potential use of GINS2 staining as a biomarker of cell proliferation as well as a potential therapeutic target.

  18. Market Analysis Identifies Community and School Education Goals.

    Science.gov (United States)

    Lindle, Jane C.

    1989-01-01

    Principals must realize the positive effects that marketing can have on improving schools and building support for them. Market analysis forces clarification of the competing needs and interests present in the community. The four marketing phases are needs assessment, analysis, goal setting, and public relations and advertising. (MLH)

  19. Structural parameter identifiability analysis for dynamic reaction networks

    DEFF Research Database (Denmark)

    Davidescu, Florin Paul; Jørgensen, Sten Bay

    2008-01-01

    A fundamental problem in model identification is to investigate whether unknown parameters in a given model structure potentially can be uniquely recovered from experimental data. This issue of global or structural identifiability is essential during nonlinear first principles model development...... where for a given set of measured variables it is desirable to investigate which parameters may be estimated prior to spending computational effort on the actual estimation. This contribution addresses the structural parameter identifiability problem for the typical case of reaction network models. The...

  20. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This volume presents recent practical applications of Computational Biology and Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics, held at the University of Salamanca, Spain, on June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have posed an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  1. Use of Photogrammetry and Biomechanical Gait analysis to Identify Individuals

    DEFF Research Database (Denmark)

    Larsen, Peter Kastmand; Simonsen, Erik Bruun; Lynnerup, Niels

    Photogrammetry and recognition of gait patterns are valuable tools to help identify perpetrators based on surveillance recordings. We have found that stature but only few other measures have a satisfying reproducibility for use in forensics. Several gait variables with high recognition rates were...

  2. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  3. Fuzzy Logic in Medicine and Bioinformatics

    OpenAIRE

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes).

  4. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  5. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
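
    The abstract above frames sequence alignment as an optimization problem solved by dynamic programming. A minimal sketch of that idea is the global alignment score computed with the standard Needleman-Wunsch recurrence. The scoring scheme (match +1, mismatch -1, gap -1) and the function name are assumptions made for illustration and are not taken from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b (Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap    # gap in b
            left = score[i][j - 1] + gap  # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

    Each cell depends only on its three neighbours, so the table is filled in time proportional to the product of the two sequence lengths; recovering the alignment itself would additionally require a traceback through the table.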

  6. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  7. Rapid Development of Bioinformatics Education in China

    Science.gov (United States)

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  8. Expression profile analysis of long noncoding RNA in HER-2-enriched subtype breast cancer by next-generation sequencing and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yang F

    2016-02-01

    Fan Yang, Shixu Lyu, Siyang Dong, Yehuan Liu, Xiaohua Zhang, Ouchen Wang. Department of Surgical Oncology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, People’s Republic of China. Background: Human epidermal growth factor receptor 2 (HER-2)-enriched subtype breast cancer is associated with a more aggressive phenotype and shorter survival time. Long noncoding RNAs (LncRNAs) have essential roles in tumorigenesis and occupy a central place in cancer progression. Notably, few studies have focused on the dysregulation of LncRNAs in the HER-2-enriched subtype breast cancer. In this study, we analyzed the expression profile of LncRNAs and mRNAs in this particular subtype of breast cancer. Methods: Seven pairs of HER-2-enriched subtype breast cancer and normal tissue were sequenced. We screened out differentially expressed genes and measured the correlation of the expression levels of dysregulated LncRNAs and HER-2 by Pearson’s correlation coefficient analysis. Gene ontology analysis and pathway analysis were used to understand the biological roles of these differentially expressed genes. Pathway act network and coexpression network were constructed. Results: More than 1,300 LncRNAs and 2,800 mRNAs, which were significantly differentially expressed, were identified. Among these LncRNAs, AFAP1-AS1 was the most dysregulated LncRNA, while ORM2 was the most dysregulated mRNA. LOC100288637 had the highest positive correlation coefficient of 0.93 with HER-2, while RPL13P5 had the highest negative correlation coefficient of -0.87. The pathway act network showed that MAPK signaling pathway, PI3K-Akt signaling pathway, metabolic pathways, cell cycle, and regulation of actin cytoskeleton were highly related with HER-2-enriched subtype breast cancer. Coexpression network recognized LINC00636, LINC01405, ADARB2-AS1, ST8SIA6-AS1, LINC00511, and DPP10-AS1 as core genes. Conclusion: These results analyze the functions of LncRNAs and provide

  9. Combinational risk factors of metabolic syndrome identified by fuzzy neural network analysis of health-check data

    Directory of Open Access Journals (Sweden)

    Ushida Yasunori

    2012-08-01

    Background: Lifestyle-related diseases represented by metabolic syndrome develop as a result of complex interactions. By using health check-up data from two large studies collected during a long-term follow-up, we searched for risk factors associated with the development of metabolic syndrome. Methods: In our original study, we selected 77 case subjects who developed metabolic syndrome during the follow-up and 152 healthy control subjects who were free of lifestyle-related risk components from among 1803 Japanese male employees. In a replication study, we selected 2196 case subjects and 2196 healthy control subjects from among 31343 other Japanese male employees. By means of a bioinformatics approach using a fuzzy neural network (FNN), we searched for any significant combinations that are associated with MetS. To ensure that the risk combination selected by FNN analysis was statistically reliable, we performed logistic regression analysis including adjustment. Results: We selected a combination of an elevated level of γ-glutamyltranspeptidase (γ-GTP) and an elevated white blood cell (WBC) count as the most significant combination of risk factors for the development of metabolic syndrome. The FNN also identified the same tendency in a replication study. The clinical characteristics of γ-GTP level and WBC count were statistically significant even after adjustment, confirming that the results obtained from the fuzzy neural network are reasonable. Correlation ratio showed that an elevated level of γ-GTP is associated with habitual drinking of alcohol and a high WBC count is associated with habitual smoking. Conclusions: This result obtained by fuzzy neural network analysis of health check-up data from large long-term studies can be useful in providing a personalized novel diagnostic and therapeutic method involving the γ-GTP level and the WBC count.

  10. Rice transcriptome analysis to identify possible herbicide quinclorac detoxification genes

    OpenAIRE

    Xu, Wenying; Di, Chao; Zhou, Shaoxia; Liu, Jia; LI Li; Liu, Fengxia; Yang, Xinling; Ling, Yun; Su, Zhen

    2015-01-01

    Quinclorac is a highly selective auxin-type herbicide and is widely used in the effective control of barnyard grass in paddy rice fields, improving the world's rice yield. The herbicide mode of action of quinclorac has been proposed, and hormone interactions affecting quinclorac signaling have been identified. Because of widespread use, quinclorac may be transported outside rice fields with the drainage waters, leading to soil and water pollution and other environmental health problems. In thi...

  11. Association analysis identifies ZNF750 regulatory variants in psoriasis

    Directory of Open Access Journals (Sweden)

    Birnbaum Ramon Y

    2011-12-01

    Background: Mutations in the ZNF750 promoter and coding regions have been previously associated with Mendelian forms of psoriasis and psoriasiform dermatitis. ZNF750 encodes a putative zinc finger transcription factor that is highly expressed in keratinocytes and represents a candidate psoriasis gene. Methods: We examined whether ZNF750 variants were associated with psoriasis in a large case-control population. We sequenced the promoter and exon regions of ZNF750 in 716 Caucasian psoriasis cases and 397 Caucasian controls. Results: We identified a total of 47 variants, including 38 rare variants of which 35 were novel. Association testing identified two ZNF750 haplotypes associated with psoriasis (p ZNF750 promoter and 5' UTR variants displayed a 35-55% reduction of ZNF750 promoter activity, consistent with the promoter activity reduction seen in a Mendelian psoriasis family with a ZNF750 promoter variant. However, the rare promoter and 5' UTR variants identified in this study did not strictly segregate with the psoriasis phenotype within families. Conclusions: Two haplotypes of ZNF750 and rare 5' regulatory variants of ZNF750 were found to be associated with psoriasis. These rare 5' regulatory variants, though not causal, might serve as a genetic modifier of psoriasis.

  12. The Screening of Genes Sensitive to Long-Term, Low-Level Microwave Exposure and Bioinformatic Analysis of Potential Correlations to Learning and Memory

    Institute of Scientific and Technical Information of China (English)

    ZHAO Ya Li; LI Ying Xian; MA Hong Bo; LI Dong; LI Hai Liang; JIANG Rui; KAN Guang Han; YANG Zhen Zhong; HUANG Zeng Xin

    2015-01-01

    Objective To gain a better understanding of gene expression changes in the brain following microwave exposure in mice and to reveal mechanisms contributing to microwave-induced learning and memory dysfunction. Methods Mice were exposed to whole body 2100 MHz microwaves with specific absorption rates (SARs) of 0.45 W/kg, 1.8 W/kg, and 3.6 W/kg for 1 hour daily for 8 weeks. Differentially expressed genes in the brains were screened using high-density oligonucleotide arrays, with genes showing more significant differences further confirmed by RT-PCR. Results The gene chip results demonstrated that 41 genes (0.45 W/kg group), 29 genes (1.8 W/kg group), and 219 genes (3.6 W/kg group) were differentially expressed. GO analysis revealed that these differentially expressed genes were primarily involved in metabolic processes, cellular metabolic processes, regulation of biological processes, macromolecular metabolic processes, biosynthetic processes, cellular protein metabolic processes, transport, developmental processes, cellular component organization, etc. KEGG pathway analysis showed that these genes are mainly involved in pathways related to ribosome, Alzheimer's disease, Parkinson's disease, long-term potentiation, Huntington's disease, and neurotrophin signaling. Construction of a protein interaction network identified several important regulatory genes including synbindin (sbdn), Crystallin (CryaB), PPP1CA, Ywhaq, Psap, Psmb1, Pcbp2, etc., which play important roles in the processes of learning and memory. Conclusion Long-term, low-level microwave exposure may inhibit learning and memory by affecting protein and energy metabolic processes and signaling pathways relating to neurological functions or diseases.

  13. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    into courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT). It was developed by instructors from both research-intensive and predominately undergraduate institutions in collaboration with the Department of Energy-Joint Genome Institute (DOE-JGI) as a means to innovate and update undergraduate education and faculty development. The IMG-ACT program provides a cadre of tools, including access to a clearinghouse of genome sequences, bioinformatics databases, data storage, instructor course management, and student notebooks for organizing the results of their bioinformatic investigations. In the process, IMG-ACT makes it feasible to provide undergraduate research opportunities to a greater number and diversity of students, in contrast to the traditional mentor-to-student apprenticeship model for undergraduate research, which can be too expensive and time-consuming to provide for every undergraduate. The IMG-ACT serves as the hub for the network of faculty and students that use the system for microbial genome analysis. Open access of the IMG-ACT infrastructure to participating schools ensures that all types of higher education institutions can utilize it. With the infrastructure in place, faculty can focus their efforts on the pedagogy of bioinformatics, involvement of students in research, and use of this tool for their own research agenda. What the original faculty members of the IMG-ACT development team present here is an overview of how the IMG-ACT program has affected our development in terms of teaching and research with the hopes that it will inspire more faculty to get involved.

  14. Evolutionary and bioinformatic analysis of the spike glycoprotein gene of H120 vaccine strain protectotype of infectious bronchitis virus from India.

    Science.gov (United States)

    Kamble, Nitin Machindra; Pillai, Aravind S; Gaikwad, Satish S; Shukla, Sanjeev Kumar; Khulape, Sagar Aashok; Dey, Sohini; Mohan, C Madhan

    2016-01-01

    The infectious bronchitis virus (IBV) is the causative agent of avian infectious bronchitis (AIB), an important disease that causes severe economic losses to the poultry industry worldwide. Recent AIB outbreaks in India have been associated with poor growth in broilers, a drop in egg production, and thin egg shells in layers. The complete spike gene of the Indian AIB vaccine strain was amplified and sequenced using conventional reverse transcription polymerase chain reaction and submitted to GenBank (accession no. KF188436). Phylogenetic analysis revealed that the vaccine strain currently used belongs to the H120 genotype, an attenuated strain of the Massachusetts (Mass) serotype. Nucleotide and amino acid sequence comparisons showed that the reported spike genes from Indian isolates have 71.8%-99% and 71.4%-96.9% genetic similarity, respectively, with the sequenced H120 strain. The study identifies, for the first time, the live attenuated IBV vaccine strain that is routinely used for vaccination. Based on nucleotide and amino acid relatedness studies of the vaccine strain with reported IBV sequences from India, it is shown that the current vaccine strain is efficient in controlling IBV infection. Continuous monitoring of IBV outbreaks by sequencing for genotyping and by in vivo cross-protection studies for serotyping is important not only for epidemiological investigation but also for evaluating the efficacy of the current vaccine. PMID:25311758

  15. Integrating in silico and in vitro analysis of peptide binding affinity to HLA-Cw*0102: a bioinformatic approach to the prediction of new epitopes.

    Directory of Open Access Journals (Sweden)

    Valerie A Walshe

    Full Text Available BACKGROUND: Predictive models of peptide-Major Histocompatibility Complex (MHC) binding affinity are important components of modern computational immunovaccinology. Here, we describe the development and deployment of a reliable peptide-binding prediction method for a previously poorly-characterized human MHC class I allele, HLA-Cw*0102. METHODOLOGY/FINDINGS: Using an in-house, flow cytometry-based MHC stabilization assay we generated novel peptide binding data, from which we derived a precise two-dimensional quantitative structure-activity relationship (2D-QSAR) binding model. This allowed us to explore the peptide specificity of the HLA-Cw*0102 molecule in detail. We used this model to design peptides optimized for HLA-Cw*0102-binding. Experimental analysis showed these peptides to have high binding affinities for the HLA-Cw*0102 molecule. As a functional validation of our approach, we also predicted HLA-Cw*0102-binding peptides within the HIV-1 genome, identifying a set of potent binding peptides. The most affine of these binding peptides was subsequently determined to be an epitope recognized in a subset of HLA-Cw*0102-positive individuals chronically infected with HIV-1. CONCLUSIONS/SIGNIFICANCE: A functionally-validated in silico-in vitro approach to the reliable and efficient prediction of peptide binding to a previously uncharacterized human MHC allele HLA-Cw*0102 was developed. This technique is generally applicable to all T cell epitope identification problems in immunology and vaccinology.

  16. Tibetan Sheep LPL Gene Clone and Bioinformatic Analysis

    Institute of Scientific and Technical Information of China (English)

    高思; 徐亚欧; 毛亮; 邵欢欢; 杨虎林; 舒浩国

    2011-01-01

    [Objective] To study in depth the relationship between the genetic regulation of meat performance in Tibetan sheep and nutrient metabolism. [Method] The LPL coding gene of Tibetan sheep was cloned by RT-PCR and T-A cloning and then analyzed with bioinformatics software. [Result] The LPL coding gene of Tibetan sheep is 1437 bp long and encodes 478 amino acids. A multiple sequence alignment of Tibetan sheep, sheep, goat, cattle, yak, pig, dog, cat, baboon, orangutan, human, Norway rat and rattus sequences showed that the sequence identity of the LPL gene was 84.6%-99.6% and that of the LPL amino acid sequence was 88.8%-99.0%. Six nucleotide differences were found between Tibetan sheep and common sheep. One of them was a synonymous substitution that left the encoded amino acid unchanged; the other five each changed the encoded amino acid. [Conclusion] The study can provide a reference for understanding the evolutionary relationships of the LPL gene and its mechanism of action.

  17. Evaluation of energy system analysis techniques for identifying underground facilities

    Energy Technology Data Exchange (ETDEWEB)

    VanKuiken, J.C.; Kavicky, J.A.; Portante, E.C. [and others

    1996-03-01

    This report describes the results of a study to determine the feasibility and potential usefulness of applying energy system analysis techniques to help detect and characterize underground facilities that could be used for clandestine activities. Four off-the-shelf energy system modeling tools were considered: (1) ENPEP (Energy and Power Evaluation Program) - a total energy system supply/demand model, (2) ICARUS (Investigation of Costs and Reliability in Utility Systems) - an electric utility system dispatching (or production cost and reliability) model, (3) SMN (Spot Market Network) - an aggregate electric power transmission network model, and (4) PECO/LF (Philadelphia Electric Company/Load Flow) - a detailed electricity load flow model. For the purposes of most of this work, underground facilities were assumed to consume about 500 kW to 3 MW of electricity. For some of the work, facilities as large as 10-20 MW were considered. The analysis of each model was conducted in three stages: data evaluation, base-case analysis, and comparative case analysis. For ENPEP and ICARUS, open source data from Pakistan were used for the evaluations. For SMN and PECO/LF, the country data were not readily available, so data for the state of Arizona were used to test the general concept.

  18. Identifying Colluvial Slopes by Airborne LiDAR Analysis

    Science.gov (United States)

    Kasai, M.; Marutani, T.; Yoshida, H.

    2015-12-01

    Colluvial slopes are one of major sources of landslides. Identifying the locations of the slopes will help reduce the risk of disasters, by avoiding building infrastructure and properties nearby, or if they are already there, by applying appropriate counter measures before it suddenly moves. In this study, airborne LiDAR data was analyzed to find their geomorphic characteristics to use for extracting their locations. The study site was set in the suburb of Sapporo City, Hokkaido in Japan. The area is underlain by Andesite and Tuff and prone to landslides. Slope angle and surface roughness were calculated from 5 m resolution DEM. These filters were chosen because colluvial materials deposit at around the angle of repose and accumulation of loose materials was considered to form a peculiar surface texture differentiable from other slope types. Field survey conducted together suggested that colluvial slopes could be identified by the filters with a probability of 80 percent. Repeat LiDAR monitoring of the site by an unmanned helicopter indicated that those slopes detected as colluviums appeared to be moving at a slow rate. In comparison with a similar study from the crushed zone in Japan, the range of slope angle indicative of colluviums agreed with the Sapporo site, while the texture was rougher due to larger debris composing the slopes.
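    The two DEM filters named in this record can be sketched roughly as follows. This is a minimal illustration, not the authors' processing chain: the abstract does not say how surface roughness was defined, so the windowed standard deviation of elevation used here is an assumption, and the function names `slope_angle_deg` and `roughness` are hypothetical.

```python
import numpy as np

def slope_angle_deg(dem, cell_size=5.0):
    """Slope angle in degrees per cell, from finite-difference gradients.

    cell_size is the grid spacing in metres (5 m in the record above).
    """
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def roughness(dem, window=3):
    """Assumed roughness proxy: std. dev. of elevation in a window x window patch."""
    pad = window // 2
    padded = np.pad(dem, pad, mode="edge")  # repeat edge cells at the border
    out = np.empty(dem.shape, dtype=float)
    for i in range(dem.shape[0]):
        for j in range(dem.shape[1]):
            out[i, j] = padded[i:i + window, j:j + window].std()
    return out

# A plane rising 5 m per 5 m cell along x has a 45 degree slope everywhere.
plane = np.tile(np.arange(4) * 5.0, (4, 1))
slopes = slope_angle_deg(plane)
```

    Candidate colluvial cells would then be those whose slope falls near the angle of repose and whose roughness lies in a site-specific range; the record gives no thresholds, so none are assumed here.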

  19. Bioinformatic identification of novel putative photoreceptor specific cis-elements

    OpenAIRE

    Knox Barry E; Qin Maochun; McIlvain Vera A; Danko Charles G; Pertsov Arkady M

    2007-01-01

    Abstract Background Cell specific gene expression is largely regulated by different combinations of transcription factors that bind cis-elements in the upstream promoter sequence. However, experimental detection of cis-elements is difficult, expensive, and time-consuming. This provides a motivation for developing bioinformatic methods to identify cis-elements that could prioritize future experimental studies. Here, we use motif discovery algorithms to predict transcription factor binding site...

  20. Temperature-based Instanton Analysis: Identifying Vulnerability in Transmission Networks

    Energy Technology Data Exchange (ETDEWEB)

    Kersulis, Jonas [Univ. of Michigan, Ann Arbor, MI (United States); Hiskens, Ian [Univ. of Michigan, Ann Arbor, MI (United States); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Backhaus, Scott N. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Bienstock, Daniel [Columbia Univ., New York, NY (United States)

    2015-04-08

    A time-coupled instanton method for characterizing transmission network vulnerability to wind generation fluctuation is presented. To extend prior instanton work to multiple-time-step analysis, line constraints are specified in terms of temperature rather than current. An optimization formulation is developed to express the minimum wind forecast deviation such that at least one line is driven to its thermal limit. Results are shown for an IEEE RTS-96 system with several wind-farms.

  1. Efficacy of fractal analysis in identifying glaucomatous damage

    Science.gov (United States)

    Kim, P. Y.; Iftekharuddin, K. M.; Gunvant, P.; Tóth, M.; Holló, G.; Essock, E. A.

    2010-02-01

    In this work, we propose a novel fractal-based technique to analyze a pseudo 2D representation of 1D retinal nerve fiber layer (RNFL) thickness measurement data vector sets for early detection of glaucoma. In our proposed technique, we first convert the 1D RNFL data vector sets into pseudo 2D images and then exploit a 2D fractal analysis (FA) technique to obtain the representative features. These 2D fractal-based features are further processed using principal component analysis (PCA) and the final classification between normal and glaucomatous eyes is obtained using Fisher's linear discriminant analysis (LDA). An independent dataset is used for training and testing the classifier. The technique is used on randomly selected GDx variable corneal compensator (VCC) eye data from 227 study participants (116 patients with glaucoma and 111 patients with healthy eyes). We compute sensitivity, specificity and area under the receiver operating characteristic curve (AUROC) for statistical performance comparison with other known techniques. Our classification performance shows that the fractal-based technique is superior to the standard machine classifier Nerve Fiber Indicator (NFI).
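    The three performance measures named in this record (sensitivity, specificity, AUROC) can be computed as in the sketch below. The labels and scores are made-up illustration values, not data from the study, and the function names are hypothetical; AUROC is computed by the pairwise-ranking definition rather than by any particular library.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = glaucoma."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: three glaucomatous eyes, two healthy eyes.
labels = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.1, 0.4]
sens, spec = sensitivity_specificity(labels, predicted)
area = auroc(labels, scores)
```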

  2. Bioinformatics, Expression and Functional Analysis of microRNAs in Response to Low Temperature in Hemerocallis fulva (L.) L.

    Institute of Scientific and Technical Information of China (English)

    安凤霞; 卢宝伟; 梁鸣; 唐焕伟; 李富恒

    2014-01-01

    MicroRNAs (miRNAs), endogenous small non-coding single-stranded RNAs of 16-29 nt, play a prominent role in growth, development and responses to environmental stresses in plants. The miRNAs responding to low temperature in Hemerocallis fulva roots were identified using deep sequencing in combination with bioinformatics prediction. A total of 14 843 184 and 16 072 575 RNA sequences were obtained under normal (10 °C) and low temperature (-25 °C) conditions, representing 14 064 385 and 15 309 725 types of small RNA (sRNA), respectively. The sRNAs showed a normal distribution. Comparison against GenBank and Rfam showed that rRNA and tRNA account for a large proportion of the non-coding RNA. In total, 799 994 sRNAs of 67 411 types were annotated under low temperature, and 1 055 466 sRNAs of 66 524 types were annotated under normal temperature. miR393, miR397 and miR396 were up-regulated and miR319 was down-regulated at low temperature. This research provides rich data for illuminating the regulatory mechanism of protein synthesis and for screening the key regulatory genes in response to low temperature.

  3. Predicting missing links and identifying spurious links via likelihood analysis.

    Science.gov (United States)

    Pan, Liming; Zhou, Tao; Lü, Linyuan; Hu, Chin-Kun

    2016-01-01

    Real network data are often incomplete and noisy, so link prediction algorithms and spurious link identification algorithms can be applied. Thus far, a general method to transform network organizing mechanisms into link prediction algorithms has been lacking. Here we use an algorithmic framework where a network's probability is calculated according to a predefined structural Hamiltonian that takes into account the network organizing principles, and a non-observed link is scored by the conditional probability of adding the link to the observed network. Extensive numerical simulations show that the proposed algorithm has remarkably higher accuracy than the state-of-the-art methods in uncovering missing links and identifying spurious links in many complex biological and social networks. The method also finds applications in exploring the underlying network evolutionary mechanisms. PMID:26961965
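    The scoring idea in this record can be illustrated with a toy model. The sketch below is not the authors' Hamiltonian: it assumes a deliberately simple one, H(G) = -beta x (number of triangles), takes P(G) proportional to exp(-H(G)), and scores a non-observed link by comparing the network with and without it. All function names and the parameter `beta` are hypothetical.

```python
import itertools
import math

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def hamiltonian(adj, beta=1.0):
    """Toy structural Hamiltonian (assumption): lower energy for more triangles."""
    triangles = sum(
        1
        for a, b, c in itertools.combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    return -beta * triangles

def link_score(adj, i, j, beta=1.0):
    """P(G + ij) / (P(G + ij) + P(G)) with P(G) proportional to exp(-H(G))."""
    with_link = {n: set(nb) for n, nb in adj.items()}
    with_link[i].add(j)
    with_link[j].add(i)
    delta = hamiltonian(with_link, beta) - hamiltonian(adj, beta)
    return 1.0 / (1.0 + math.exp(delta))

# Small example network: link (1, 2) would close two triangles, (3, 4) none.
network = build_adjacency([(0, 1), (0, 2), (1, 3), (2, 3), (0, 4)])
score_12 = link_score(network, 1, 2)
score_34 = link_score(network, 3, 4)
```

    Under this toy Hamiltonian the score reduces to a logistic function of the number of common neighbours, so the candidate link that closes more triangles ranks higher. Spurious links could be ranked the same way by scoring the removal of each observed link.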

  4. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  5. Protein expression and bioinformatics analysis of stk40 gene related to embryo development

    Institute of Scientific and Technical Information of China (English)

    崔骥; 张军强; 陈洁; 朱丹丹; 郭锡熔; 童国庆

    2011-01-01

    Objective To detect the protein expression of stk40, a gene related to embryo development, and to analyze it with bioinformatics methods. Methods Western blot was performed to detect the protein expression of stk40 in early developmental embryos of the mouse. An initial bioinformatics analysis was performed on its gene structure, genome localization, the physical and chemical characteristics of its encoded protein, secondary structure, hydrophobicity/hydrophilicity, structural domains and so on. Results The protein expression of stk40 was lower in development-arrested 8-cell embryos than in normal ones. Bioinformatics analysis showed that the stk40 mRNA is 3877 bp long and contains an open reading frame of 1350 nucleotides that encodes 449 amino acids with a predicted molecular mass of 50563.9. NCBI Map Viewer analysis revealed that the stk40 gene is located on chromosome 4D2.2 and is composed of 13 exons and 12 introns. The stk40 protein has a STYKc domain, which may be related to the development of cellular organisms. Conclusion The detection and analysis of the stk40 gene may provide a foundation and novel information for further study.

  6. Multi-Institutional FASTQ File Exchange as a Means of Proficiency Testing for Next-Generation Sequencing Bioinformatics and Variant Interpretation.

    Science.gov (United States)

    Davies, Kurtis D; Farooqi, Midhat S; Gruidl, Mike; Hill, Charles E; Woolworth-Hirschhorn, Julie; Jones, Heather; Jones, Kenneth L; Magliocco, Anthony; Mitui, Midori; O'Neill, Philip H; O'Rourke, Rebecca; Patel, Nirali M; Qin, Dahui; Ramos, Erica; Rossi, Michael R; Schneider, Thomas M; Smith, Geoffrey H; Zhang, Linsheng; Park, Jason Y; Aisner, Dara L

    2016-07-01

    Next-generation sequencing is becoming increasingly common in clinical laboratories worldwide and is revolutionizing clinical molecular testing. However, the large amounts of raw data produced by next-generation sequencing assays and the need for complex bioinformatics analyses present unique challenges. Proficiency testing in clinical laboratories has traditionally been designed to evaluate assays in their entirety; however, it can be alternatively applied to separate assay components. We developed and implemented a multi-institutional proficiency testing approach to directly assess custom bioinformatics and variant interpretation processes. Six clinical laboratories, all of which use the same commercial library preparation kit for next-generation sequencing analysis of tumor specimens, each submitted raw data (FASTQ files) from four samples. These 24 file sets were then deidentified and redistributed to five of the institutions for analysis and interpretation according to their clinically validated approach. Among the laboratories, there was a high rate of concordance in the calling of single-nucleotide variants, in particular those we considered clinically significant (100% concordance). However, there was significant discordance in the calling of clinically significant insertions/deletions, with only two of seven being called by all participating laboratories. Missed calls were addressed by each laboratory to improve their bioinformatics processes. Thus, through our alternative proficiency testing approach, we identified the bioinformatic detection of insertions/deletions as an area of particular concern for clinical laboratories performing next-generation sequencing testing. PMID:27155050
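    The concordance measure described in this record (how many expected variants were called by every participating laboratory) can be sketched as below. The laboratory names and variant labels are invented for illustration and are not data from the study; `called_by_all` is a hypothetical helper.

```python
def called_by_all(calls_by_lab, expected_variants):
    """Return the expected variants that appear in every laboratory's call set."""
    return {
        variant
        for variant in expected_variants
        if all(variant in calls for calls in calls_by_lab.values())
    }

# Hypothetical call sets: each lab reports a set of variant identifiers.
calls_by_lab = {
    "lab_A": {"snv_1", "snv_2", "indel_1", "indel_2"},
    "lab_B": {"snv_1", "snv_2", "indel_1"},
    "lab_C": {"snv_1", "snv_2", "indel_1", "indel_3"},
}
expected = {"snv_1", "snv_2", "indel_1", "indel_2", "indel_3"}

concordant = called_by_all(calls_by_lab, expected)
rate = len(concordant) / len(expected)  # fraction called by all labs
```

    Splitting `expected` by variant type (single-nucleotide variants versus insertions/deletions) before calling the helper gives the per-class concordance the record reports.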

  7. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  8. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  9. A Bioinformatic Analysis on Caffeine Synthase in Plants

    Institute of Scientific and Technical Information of China (English)

    孔祥瑞; 杨军; 王让剑

    2014-01-01

    The amino acid sequences of caffeine synthases from Camellia sinensis, Theobroma cacao, Camellia japonica and other plants registered in GenBank were aligned and analyzed with bioinformatic tools to predict the isoelectric point, subcellular localization, signal peptide, transmembrane helices, conserved functional domains and motifs, and secondary and tertiary structure of the proteins. The results showed that plant caffeine synthases are located mainly in the cytoplasm and nucleus, carry phosphorylation, acylation and glycosylation sites, and can be divided into three types based on gene sequences and conserved domains. Type I and type II are all-α, water-soluble enzyme proteins, whereas the secondary structure of type III is rich in random coil and very likely includes a signal peptide; none of the three types has a transmembrane helix. Tertiary structure prediction indicated that type I and type II proteins are very similar, both composed of α-helices and horizontal β-sheets, whereas in type III the α-helices are located at the lateral ends and connected by vertical β-sheets.

  10. Bioinformatic analysis of regulation of microRNA on target genes in pediatric asthma

    Institute of Scientific and Technical Information of China (English)

    董晓艳; 陆权; 张慧燕; 顾坚磊; 钟南

    2016-01-01

    Objective To understand the underlying mechanism of dust mite-induced pediatric asthma by bioinformatic analysis of a specific microRNA (miRNA) array and target gene screening. Methods This is a case-control study of 62 pairs of children with dust mite-induced asthma and age-matched healthy, non-allergic controls. Twelve pairs were randomly selected for miRNA array analysis. Abnormally expressed miRNAs were compared between the asthma and control children. The results were validated by RT-qPCR and bioinformatic analysis in the remaining pairs. Results Six miRNAs (miRNA-151a-5p, 625-5p, 126-3p, 513a-5p, 27b-3p, 22-3p) were significantly down-regulated by more than two-fold in children with dust mite-induced asthma compared with controls (P<0.05). Bioinformatic enrichment analysis showed that the target genes regulated by these miRNAs, PPARGC1B, CBL, ONECUT2, ESR1, EGFR, SYK, and STAT1, were significantly associated with inflammatory cytokine signaling pathways (P<0.05). Conclusion miRNA-625-5p, 513a-5p, 27b-3p, and 22-3p may jointly regulate related target genes, forming a network pathway that participates in the development of dust mite-induced asthma in children.

  11. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses.

    Directory of Open Access Journals (Sweden)

    Lei Wang

    Full Text Available Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days. Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease.

  12. Network stratification analysis for identifying function-specific network layers.

    Science.gov (United States)

    Zhang, Chuanchao; Wang, Jiguang; Zhang, Chao; Liu, Juan; Xu, Dong; Chen, Luonan

    2016-04-22

    A major challenge of systems biology is to capture the rewiring of biological functions (e.g. signaling pathways) in a molecular network. To address this problem, we proposed a novel computational framework, namely network stratification analysis (NetSA), to stratify the whole biological network into various function-specific network layers corresponding to particular functions (e.g. KEGG pathways), which transform the network analysis from the gene level to the functional level by integrating expression data, the gene/protein network and gene ontology information altogether. The application of NetSA in yeast and its comparison with a traditional network-partition both suggest that NetSA can more effectively reveal functional implications of network rewiring and extract significant phenotype-related biological processes. Furthermore, for time-series or stage-wise data, the function-specific network layer obtained by NetSA is also shown to be able to characterize the disease progression in a dynamic manner. In particular, when applying NetSA to hepatocellular carcinoma and type 1 diabetes, we can derive functional spectra regarding the progression of the disease, and capture active biological functions (i.e. active pathways) in different disease stages. The additional comparison between NetSA and SPIA illustrates again that NetSA could discover more complete biological functions during disease progression. Overall, NetSA provides a general framework to stratify a network into various layers of function-specific sub-networks, which can not only analyze a biological network on the functional level but also investigate gene rewiring patterns in biological processes. PMID:26879865
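    The idea of splitting one network into function-specific layers can be illustrated with a much-simplified sketch. It is not the NetSA algorithm, which also integrates expression data and gene ontology information; it only assumes that a layer is the sub-network induced by a pathway's gene set. The function name, gene names, and pathway labels are hypothetical placeholders.

```python
from collections import defaultdict


def stratify_network(edges, pathways):
    """Split an undirected network into one edge set per pathway.

    An edge joins a layer only when both of its endpoints belong to
    that pathway's gene set (an induced sub-network). This is a toy
    illustration, not the published NetSA method.
    """
    layers = defaultdict(set)
    for u, v in edges:
        for name, genes in pathways.items():
            if u in genes and v in genes:
                # Store endpoints in sorted order so (a, b) == (b, a).
                layers[name].add(tuple(sorted((u, v))))
    return dict(layers)


# Hypothetical toy data: gene names and pathway labels are placeholders.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
pathways = {"pathway_1": {"A", "B", "C"}, "pathway_2": {"C", "D"}}
layers = stratify_network(edges, pathways)
```

    In this toy case the edge A-D falls into no layer, because its endpoints never share a pathway; a real method would also need a rule for such cross-pathway edges.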

  13. Potential of isotope analysis (C, Cl) to identify dechlorination mechanisms

    Science.gov (United States)

    Cretnik, Stefan; Thoreson, Kristen; Bernstein, Anat; Ebert, Karin; Buchner, Daniel; Laskov, Christine; Haderlein, Stefan; Shouakar-Stash, Orfan; Kliegman, Sarah; McNeill, Kristopher; Elsner, Martin

    2013-04-01

    Chloroethenes are commonly used in industrial applications and are detected as carcinogenic contaminants in the environment. Their dehalogenation is of environmental importance in remediation processes. However, a frequently encountered problem is the accumulation of toxic degradation products such as cis-dichloroethylene (cis-DCE) at contaminated sites. Several studies have addressed the reductive dehalogenation reactions using biotic and abiotic model systems, but a crucial question in this context has remained open: Do environmental transformations occur by the same mechanism as in their corresponding in vitro model systems? The presented study shows the potential to close this research gap using the latest developments in compound-specific chlorine isotope analysis, which make it possible to routinely measure chlorine isotope fractionation of chloroethenes in environmental samples and complex reaction mixtures [1, 2]. In particular, such chlorine isotope analysis enables the measurement of isotope fractionation for two elements (i.e., C and Cl) in chloroethenes. When isotope values of both elements are plotted against each other, different slopes reflect different underlying mechanisms and are remarkably insensitive towards masking. Our results suggest that different microbial strains (G. lovleyi strain SZ, D. hafniense Y51) and the isolated cofactor cobalamin employ similar mechanisms of reductive dechlorination of TCE. In contrast, evidence for a different mechanism was obtained with cobaloxime, cautioning against its use as a model for biodegradation. The study shows the potential of the dual isotope approach as a tool to directly compare transformation mechanisms of environmental scenarios, biotic transformations, and their putative chemical lab-scale systems. Furthermore, it serves as an essential reference when using the dual isotope approach to assess the fate of chlorinated compounds in the environment.
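    The dual isotope comparison described above amounts to fitting a line through paired isotope values and comparing slopes between systems. The following is a minimal sketch of that step, assuming an ordinary least-squares fit; which element goes on which axis is an assumption here, and all numbers are invented placeholders rather than data from the study.

```python
def dual_isotope_slope(x_values, y_values):
    """Ordinary least-squares slope of y against x.

    x_values and y_values are paired isotope values for the two
    elements (for example carbon and chlorine); the axis assignment
    is a convention chosen by the analyst, not fixed by this code.
    """
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    var = sum((x - mean_x) ** 2 for x in x_values)
    return cov / var


# Invented placeholder series for two hypothetical reaction systems.
x = [0.0, 1.0, 2.0, 3.0]
system_a = [0.0, 2.0, 4.0, 6.0]
system_b = [0.0, 0.5, 1.0, 1.5]

slope_a = dual_isotope_slope(x, system_a)
slope_b = dual_isotope_slope(x, system_b)
```

    Two systems whose slopes agree within uncertainty would be consistent with a shared mechanism; a real analysis would also report confidence intervals on each slope.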

  14. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. PMID:26851671

  15. Analysis of microarray-identified genes and microRNAs associated with drug resistance in ovarian cancer.

    Science.gov (United States)

    Zou, Jing; Yin, Fuqiang; Wang, Qi; Zhang, Wei; Li, Li

    2015-01-01

    The aim of this study was to identify potential microRNAs and genes associated with drug resistance in ovarian cancer through web-available microarrays. The drug resistance-related microRNA microarray dataset GSE54665 and the mRNA datasets GSE33482, GSE28646, and GSE15372 were downloaded from the Gene Expression Omnibus database. Dysregulated microRNAs/genes were screened with GEO2R and further examined in SKOV3 (SKOV3/DDP) and A2780 (A2780/DDP) cells by real-time quantitative PCR (qRT-PCR), and their associations with drug resistance were then analyzed by comprehensive bioinformatic analyses. Nine microRNAs (microRNA-199a-5p, microRNA-199a-3p, microRNA-199b-3p, microRNA-215, microRNA-335, microRNA-18b, microRNA-363, microRNA-645 and microRNA-141) and 38 genes were identified as differentially expressed in drug-resistant ovarian cancer cells, with seven genes (NHSL1, EPHA3, USP51, ZSCAN4, EPHA7, SNCA and PI15) exhibiting exactly the same expression trends in all three microarrays. Biological process annotation and pathway enrichment analysis of the 9 microRNAs and 38 genes identified several drug resistance-related signaling pathways, and the microRNA-mRNA interaction analysis revealed a targeted regulatory relationship between the 9 microRNAs and most of the 38 genes. qRT-PCR of the 9 microRNAs and the 7 genes in SKOV3/DDP and A2780/DDP cells indicated an expression profile consistent with the microarrays. Among those, the expression of EPHA7 and PI15 was negatively correlated with that of microRNA-141, and they were also identified as potential targets of this microRNA via microRNA-mRNA interaction. We thus concluded that microRNA-141, EPHA7, and PI15 might jointly participate in the regulation of drug resistance in ovarian cancer and serve as potential targets in targeted therapies. PMID:26261572

  16. Forensic Bioinformatics: An innovative technological advancement in the field of Forensic Medicine and Diagnosis

    OpenAIRE

    Kumar Ajay; Singh Neetu; Gaurav S.S

    2012-01-01

    Background: The role of bioinformatics in this modern age of technological advancement cannot be overemphasized. Aim: This study reviews the principles, techniques, and applications of forensic bioinformatics. Methods and Materials: Literature searches were done to identify relevant studies. Results: The concepts of sequence annotation and whole genome sequencing were made possible by the assimilation of software-based tools which are exclusively responsible for the segregation of bulk genomic d...

  17. [Research of Identify Spatial Object Using Spectrum Analysis Technique].

    Science.gov (United States)

    Song, Wei; Feng, Shi-qi; Shi, Jing; Xu, Rong; Wang, Gong-chang; Li, Bin-yu; Liu, Yu; Li, Shuang; Cao, Rui; Cai, Hong-xing; Zhang, Xi-he; Tan, Yong

    2015-06-01

    A high-precision scattering spectrum of space debris, with a minimum brightness of 4.2 and a resolution of 0.5 nm, was observed from the ground using spectrum detection technology. Clear differences between types of objects were obtained by normalization and discrete rate analysis of the spectral data. For rocket debris, the normalized multi-frame scattering spectral line shapes are identical, whereas for lapsed satellites they differ. The discrete rate of the normalized single-frame spectrum ranges from 0.978% to 3.067% for rocket debris, and the difference between the oscillation and the average value is small. For lapsed satellites the discrete rate ranges from 3.1184% to 19.4727%, and the difference between the oscillation and the average value is relatively large. The reason is that the composition of rocket debris is simple, whereas that of lapsed satellites is complex. Therefore, ground-based spectrum detection technology can be used to classify space debris. PMID:26601348

  18. Identifying a preservation zone using multi–criteria decision analysis

    Directory of Open Access Journals (Sweden)

    Farashi, A.

    2016-03-01

    Full Text Available Zoning of a protected area is an approach to partition landscape into various land use units. The management of these landscape units can reduce conflicts caused by human activities. Tandoreh National Park is one of the most biologically diverse protected areas in Iran. Although the area is generally designed to protect biodiversity, there are many conflicts between biodiversity conservation and human activities. For instance, the area is highly controversial and has been considered as an impediment to local economic development, such as tourism, grazing, road construction, and cultivation. In order to reduce human conflicts with biodiversity conservation in Tandoreh National Park, safe zones need to be established and human activities need to be moved out of the zones. In this study we used a systematic methodology to integrate a participatory process with Geographic Information Systems (GIS) using a multi–criteria decision analysis (MCDA) technique to guide a zoning scheme for the Tandoreh National Park, Iran. Our results show that the northern and eastern parts of the Tandoreh National Park that were close to rural areas and farmlands returned less desirability for selection as a preservation area. Rocky Mountains were the most important and most destructed areas and abandoned plains were the least important criteria for preservation in the area. Furthermore, the results reveal that the land properties were considered to be important for protection based on the obtaine

  19. Structural and bioinformatic analysis of the kiwifruit allergen Act d 11, a member of the family of ripening-related proteins.

    Science.gov (United States)

    Chruszcz, Maksymilian; Ciardiello, Maria Antonietta; Osinski, Tomasz; Majorek, Karolina A; Giangrieco, Ivana; Font, Jose; Breiteneder, Heimo; Thalassinos, Konstantinos; Minor, Wladek

    2013-12-01

    The allergen Act d 11, also known as kirola, is a 17 kDa protein expressed in large amounts in ripe green and yellow-fleshed kiwifruit. Ten percent of all kiwifruit-allergic individuals produce IgE specific for the protein. Using X-ray crystallography, we determined the first three-dimensional structures of Act d 11, produced from both recombinant expression in Escherichia coli and from the natural source (kiwifruit). While Act d 11 is immunologically correlated with the birch pollen allergen Bet v 1 and other members of the pathogenesis-related protein family 10 (PR-10), it has low sequence similarity to PR-10 proteins. By sequence Act d 11 appears instead to belong to the major latex/ripening-related (MLP/RRP) family, but analysis of the crystal structures shows that Act d 11 has a fold very similar to that of Bet v 1 and other PR-10 related allergens regardless of the low sequence identity. The structures of both the natural and recombinant protein include an unidentified ligand, which is relatively small (about 250 Da by mass spectrometry experiments) and most likely contains an aromatic ring. The ligand-binding cavity in Act d 11 is also significantly smaller than those in PR-10 proteins. The binding of the ligand, which we were not able to unambiguously identify, results in conformational changes in the protein that may have physiological and immunological implications. Interestingly, the residue corresponding to Glu45 in Bet v 1 (Glu46), which is important for IgE binding to the birch pollen allergen, is conserved in Act d 11, even though it is not in other allergens with significantly higher sequence identity to Bet v 1. We suggest that the so-called Gly-rich loop (or P-loop), which is conserved in all PR-10 allergens, may be responsible for IgE cross-reactivity between Bet v 1 and Act d 11. PMID:23969108

  20. Bioinformatics Analysis of Glutathione S-transferase Gene of Taenia saginata

    Institute of Scientific and Technical Information of China (English)

    王宇; 黄江; 戴佳琳; 廖兴江

    2012-01-01

    Objective: To analyze the gene structure of glutathione S-transferase (GST) of Taenia saginata and to predict the structure and function of its encoded protein. Methods: Bioinformatics analysis tools from bioinformatics websites such as NCBI and ExPASy, combined with other analysis software, were used. Results: The full length of the gene was 908 bp. Its coding region spanned 135-771 bp, encoding 212 amino acids. The encoded protein did not contain any subcellular localization sequence. The identity and similarity of the screened gene with Taenia solium GST were 93% and 96%, respectively. Three major epitopes of GST (33-53 aa, 62-68 aa, and 179-184 aa) were predicted to be located on the surface of the GST spatial structure, far away from each other. Conclusions: The GST gene was screened from a cDNA library of adult Taenia saginata. GST is predicted to be a cytosolic protein with good application prospects for immunodiagnosis.

  1. Meconium microbiome analysis identifies bacteria correlated with premature birth.

    Directory of Open Access Journals (Sweden)

    Alexandria N Ardissone

    Full Text Available Preterm birth is the second leading cause of death in children under the age of five years worldwide, but the etiology of many cases remains enigmatic. The dogma that the fetus resides in a sterile environment is being challenged by recent findings and the question has arisen whether microbes that colonize the fetus may be related to preterm birth. It has been posited that meconium reflects the in-utero microbial environment. In this study, correlations between fetal intestinal bacteria from meconium and gestational age were examined in order to suggest underlying mechanisms that may contribute to preterm birth. Meconium from 52 infants ranging in gestational age from 23 to 41 weeks was collected, the DNA extracted, and 16S rRNA analysis performed. Resulting taxa of microbes were correlated to clinical variables and also compared to previous studies of amniotic fluid and other human microbiome niches. Increased detection of bacterial 16S rRNA in meconium of infants of <33 weeks gestational age was observed. Approximately 61·1% of reads sequenced were classified to genera that have been reported in amniotic fluid. Gestational age had the largest influence on microbial community structure (R = 0·161; p = 0·029), while mode of delivery (C-section versus vaginal delivery) had an effect as well (R = 0·100; p = 0·044). Enterobacter, Enterococcus, Lactobacillus, Photorhabdus, and Tannerella were negatively correlated with gestational age and have been reported to incite inflammatory responses, suggesting a causative role in premature birth. This provides the first evidence to support the hypothesis that the fetal intestinal microbiome derived from swallowed amniotic fluid may be involved in the inflammatory response that leads to premature birth.

  2. Phosphoproteomics and bioinformatics analyses of spinal cord proteins in rats with morphine tolerance.

    Directory of Open Access Journals (Sweden)

    Wen-Jinn Liaw

    Full Text Available INTRODUCTION: Morphine is the most effective pain-relieving drug, but it can cause unwanted side effects. Direct neuraxial administration of morphine to the spinal cord not only can provide effective, reliable pain relief but also can prevent the development of supraspinal side effects. However, repeated neuraxial administration of morphine may still lead to morphine tolerance. METHODS: To better understand the mechanism that causes morphine tolerance, we induced tolerance in rats at the spinal cord level by giving them twice-daily injections of morphine (20 µg/10 µL) for 4 days. We confirmed tolerance by measuring paw withdrawal latencies and the maximal possible analgesic effect of morphine on day 5. We then carried out phosphoproteomic analysis to investigate the global phosphorylation of spinal proteins associated with morphine tolerance. Finally, pull-down assays were used to identify phosphorylated types and sites of 14-3-3 proteins, and bioinformatics was applied to predict biological networks impacted by the morphine-regulated proteins. RESULTS: Our proteomics data showed that repeated morphine treatment altered phosphorylation of 10 proteins in the spinal cord. Pull-down assays identified 2 serine/threonine phosphorylated sites in 14-3-3 proteins. Bioinformatics further revealed that morphine affected cytoskeletal reorganization, neuroplasticity, protein folding and modulation, signal transduction and biomolecular metabolism. CONCLUSIONS: Repeated morphine administration may affect multiple biological networks by altering protein phosphorylation. These data may provide insight into the mechanism that underlies the development of morphine tolerance.

  3. BioJava: an open-source framework for bioinformatics

    OpenAIRE

    Holland, R. C. G.; Down, T. A.; Pocock, M.; Prlić, A.; Huen, D; James, K.; Foisy, S.; Dräger, A.; Yates, A; Heuer, M.; Schreiber, M. J.

    2008-01-01

    Summary: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Availability: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website...

  4. Fighting against uncertainty: An essential issue in bioinformatics

    OpenAIRE

    Hamada, Michiaki

    2013-01-01

    Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are often affected by the "uncertainty" of a solution; that is, the probability of the solution is extremely small. This situation arises for estimation problems on high-dimensional discrete spaces in which the number of possible discrete solutions is immense. In the analysis of biological data or the development of prediction algorithms, this uncerta...

  5. Applications of Structural Bioinformatics for the Structural Genomics Era

    OpenAIRE

    Novotny, Marian

    2007-01-01

    Structural bioinformatics deals with the analysis, classification and prediction of three-dimensional structures of biomacromolecules. It is becoming increasingly important as the number of structures is growing rapidly. This thesis describes three studies concerned with protein-function prediction and two studies about protein structure validation. New protein structures are often compared to known structures to find out if they have a known fold, which may provide hints about their function...

  6. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...

  7. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Full Text Available Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  8. Bioinformatics clouds for big data manipulation

    OpenAIRE

    Dai Lin; Gao Xin; Guo Yan; Xiao Jingfa; Zhang Zhang

    2012-01-01

    Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and a...

  9. Bioinformatics for saffron (Crocus sativus L.) improvement

    OpenAIRE

    Ghulam A. PARRAY; Abdul G. Rather; Parvez Sofi; Shafiq A. Wani; Amjad M. Husaini; Asif B. Shikari; Javid I. Mir

    2009-01-01

    Saffron (Crocus sativus L.) is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots). Its genome is of relatively large size and is poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The ch...

  10. Agile parallel bioinformatics workflow management using Pwrake

    OpenAIRE

    Tanaka Masahiro; Sasaki Kensaku; Mishima Hiroyuki; Tatebe Osamu; Yoshiura Koh-ichiro

    2011-01-01

    Abstract Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environm...

  11. Bioinformatics Analysis on the Structure and Function of Malate Dehydrogenase Gene of Taenia solium

    Institute of Scientific and Technical Information of China (English)

    蓝磊; 廖兴江; 黄江; 戴佳琳

    2012-01-01

    Objective: To analyze and predict the structure and characteristics of Taenia solium malate dehydrogenase (MDH), in order to guide experimental research on its biological function. Methods: Sequence and structure analysis tools for genes and proteins from the US National Center for Biotechnology Information and from the Expert Protein Analysis System of the Swiss Institute of Bioinformatics, combined with the Pcgene and Vector NTI Suite bioinformatics software packages, were used to identify the MDH gene and its coding region from a Taenia solium full-length cDNA plasmid library, and to analyze and predict the physicochemical properties, post-translational modification sites, functional domains, subcellular localization, topological structure, secondary structure, and three-dimensional conformation of the encoded protein. Results: The gene encoded 332 amino acids and was a full-length gene. Among GenBank sequences it showed the highest homology to Echinococcus granulosus MDH, and its theoretical molecular weight was 36459.2 Da. The encoded protein was predicted to have no transmembrane region and no disulfide bonds, and to be fairly stable. It was evolutionarily closest to the MDH of trematodes. Conclusion: Using bioinformatics methods, the full-length Taenia solium ribosomal cDNA sequence was screened from the adult Taenia solium cDNA library, and information on its structure and function was predicted.

  12. Profile of microRNA expression in tumor associated macrophage and bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    雷宇; 刘彦信; 葛晔华; 史娟; 郑德先

    2012-01-01

    Objective: To investigate the microRNA expression profile of tumor-associated macrophages (TAM). Methods: A xenograft mouse model was established with the mouse breast cancer cell line 4T1. TAM were isolated from the tumor tissue. The microRNA expression profile was detected using a microRNA chip assay. The chip results were validated by real-time PCR and analyzed by bioinformatics. Peritoneal macrophages were used as the control. Results: The expression of 59 microRNAs changed significantly in TAM compared with the negative control; 23 were up-regulated and 36 were down-regulated. Real-time PCR verified the expression of miR-146a, miR-222, miR-31 and miR-877, in line with the chip experiment. These microRNAs participate in the regulation of various signaling pathways. Conclusions: The microRNA expression profile and bioinformatics analysis suggest that microRNAs play an important role in the regulation of TAM differentiation.

  13. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has enormous biodiversity, which may contain many biomaterials for pharmaceutical applications. The potential of these resources should be explored to discover new drugs for human welfare. However, screening for bioactive compounds with conventional methods is very expensive and time-consuming. We therefore developed a bioinformatics-based methodology for screening the potential of natural resources. The method rests on the fact that organisms in the same taxon have similar genes, metabolism, and secondary metabolite products. We use bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with a well-known taxon containing the active compound, using published papers or chemical databases. We then analyze the drug-likeness, bioactivity, and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell are examined to determine the mechanism of action of the active compound at the cellular level and to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory, and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anti-cancer agent from a marine invertebrate. The anti-cancer activity was explored based on compounds isolated from the marine invertebrate, as reported in published articles and databases; we then identified the protein target and performed molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate and examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and another

  14. Analysis of Maize Crop Leaf using Multivariate Image Analysis for Identifying Soil Deficiency

    Directory of Open Access Journals (Sweden)

    S. Sridevy

    2014-11-01

    Full Text Available Image processing for identifying soil deficiency is an active area of research. In this study, changes in leaf color are used to identify deficiencies of soil nutrients such as nitrogen (N), phosphorus (P), and potassium (K) by digital color image analysis. The study focuses on analyzing images of maize crop leaves using multivariate image analysis. In the proposed approach, the input RGB image is first converted to HSV, because RGB is ideal for color generation whereas HSV is well suited to color perception. Green pixels are then masked and removed using a specific threshold value and histogram equalization. This masking is done with a customized filter that exclusively filters the green color of the leaf, so that after filtering only the deficient part of the leaf is considered. A histogram is then generated for the deficient part of the leaf. Next, multivariate image analysis using Independent Component Analysis (ICA) is carried out to extract a reference eigenspace from a matrix built by unfolding color data from the deficient part. Test images are also unfolded and projected onto the reference eigenspace, and the result is a score matrix that is used to compute nutrient deficiency based on the T² statistic. In addition, a multi-resolution scheme that scales the images down is used to speed up the process. Finally, based on the training samples, the soil deficiency is identified from the color of the maize crop leaf.
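    The first two steps of this pipeline, converting RGB to HSV and masking green pixels, can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' filter: the hue window and saturation cutoff are hypothetical thresholds, the function name is invented, and the histogram equalization, ICA, and T² steps are omitted.

```python
import colorsys


def non_green_pixels(pixels, hue_range=(0.20, 0.45), min_saturation=0.25):
    """Return the pixels that are NOT classified as healthy green.

    pixels is a list of (r, g, b) tuples with 0-255 channels.
    hue_range and min_saturation are hypothetical thresholds on the
    0-1 HSV scale used by colorsys; the abstract gives no values.
    """
    kept = []
    for r, g, b in pixels:
        # colorsys expects channel values scaled to the 0-1 range.
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        is_green = hue_range[0] <= h <= hue_range[1] and s >= min_saturation
        if not is_green:
            kept.append((r, g, b))
    return kept


# Placeholder pixels: one green, one yellowish, one brownish.
pixels = [(30, 160, 40), (200, 180, 40), (120, 80, 30)]
deficient = non_green_pixels(pixels)
```

    Thresholding on hue rather than on the raw green channel is the usual reason for moving to HSV, since hue is less sensitive to brightness changes across the leaf.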

  15. Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics

    OpenAIRE

    Boulesteix, Anne-Laure; Janitza, Silke; Kruppa, Jochen; König, Inke R.

    2012-01-01

    The Random Forest (RF) algorithm by Leo Breiman has become a standard data analysis tool in bioinformatics. It has shown excellent performance in settings where the number of variables is much larger than the number of observations, can cope with complex interaction structures as well as highly correlated variables and returns measures of variable importance. This paper synthesizes ten years of RF development with emphasis on applications to bioinformatics and computational biology. Specia...

  16. Best practices in bioinformatics training for life scientists

    DEFF Research Database (Denmark)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik;

    2013-01-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes...

  17. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  18. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress (CCBCOL), after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances in the Biological Sciences and their integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data, which need to be organized, analyzed and stored in order to understand phenomena associated with living organisms, such as their evolution and their behavior in different ecosystems, and to develop applications that can be derived from this analysis.

  19. Cloning and Bioinformatics Analysis on CDS of CYGB Gene in Yak

    Institute of Scientific and Technical Information of China (English)

    孙雪婧; 杜晓华; 杨孝朴; 罗玉柱; 刘霞

    2014-01-01

    [Objective] To enrich the basic data on the yak CYGB gene, the CDS region of the yak CYGB gene was cloned and analyzed by bioinformatics methods. [Method] Total RNA of yak hippocampus tissue was extracted and reverse transcribed into cDNA by RT-PCR. Specific primers were designed according to the cDNA sequence of the cattle CYGB gene in GenBank (GenBank accession No.: DV874786.1) using the online software Primer 3.0. The CDS region and part of the 5′UTR and 3′UTR of the yak CYGB gene were cloned from yak hippocampus total RNA by PCR amplification, TA cloning and nucleic acid sequencing. The primary, secondary and tertiary structure, physicochemical properties and homology were analyzed, and a phylogenetic tree of CYGB was constructed, using online software such as ProtParam, PredictProtein and SWISS-MODEL and the Lasergene 7.1 software package. The three-dimensional structure was modified and output with PyMol. The protein subcellular localization was predicted with the online tool PSORT II Prediction, and the protein function was predicted with Protfun. [Result] A 650 bp fragment of the yak CYGB gene was obtained by cloning, including the 573 bp CDS region (GenBank accession No.: KF669898); the base composition was 20.59% A, 16.40% T, 33.33% G and 29.67% C, and the CDS encoded 190 amino acids. Alignment with the CDS and amino acid sequence of the cattle CYGB gene revealed four base mutations, all of them synonymous, with no amino acid change. The formula of the protein encoded by the yak CYGB gene was C964H1513N263O278S7, the molecular weight was 21.5 kD, the theoretical isoelectric point was 6.32, the extinction coefficient was 24075, the instability index was 48.43, the aliphatic index was 83.63, and the grand average of hydropathicity was -0.301. It was an unstable and soluble protein, with an estimated half-life of 30 hours in mammalian reticulocytes. The secondary structure of CYGB was mainly α-helices and random coil; α-helices accounted for 64.21% and random
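
    The two sequence statistics quoted in this record, base composition and the number of encoded amino acids, are simple to compute. The sketch below is illustrative only: the function names are hypothetical, the short test sequence is a toy string rather than the KF669898 sequence, and it assumes that the reported CDS length includes the stop codon.

```python
from collections import Counter

def base_composition(seq):
    """Percentage of A, C, G and T in a nucleotide sequence (other symbols ignored)."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}

def encoded_residues(cds_length_bp):
    """Amino acids encoded by a CDS whose length includes the stop codon (assumption)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return cds_length_bp // 3 - 1

composition = base_composition("ATGGCC")   # toy sequence, not the yak CDS
residues = encoded_residues(573)           # CDS length reported in the abstract
```

    Under that assumption, 573 bp corresponds to 191 codons, one of which is the stop codon, which agrees with the 190 amino acids reported.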

  20. Bioinformatics: Current practice and future challenges for life science education.

    Science.gov (United States)

    Hack, Catherine; Kendall, Gary

    2005-03-01

    It is widely predicted that the application of high-throughput technologies to the quantification and identification of biological molecules will cause a paradigm shift in the life sciences. However, if the biosciences are to evolve from a predominantly descriptive discipline to an information science, practitioners will require enhanced skills in mathematics, computing, and statistical analysis. Universities have responded to the widely perceived skills gap primarily by developing masters programs in bioinformatics, resulting in a rapid expansion in the provision of postgraduate bioinformatics education. There is, however, a clear need to improve the quantitative and analytical skills of life science undergraduates. This article reviews the response of academia in the United Kingdom and proposes the learning outcomes that graduates should achieve to cope with the new biology. While the analysis discussed here uses the development of bioinformatics education in the United Kingdom as an illustrative example, it is hoped that the issues raised will resonate with all those involved in curriculum development in the life sciences. PMID:21638550

  1. Identification and Bioinformatics Analysis of Chalcone Synthase Genes in Tomato

    Institute of Scientific and Technical Information of China (English)

    阮美颖; 杨悦俭; 万红建; 叶青静; 王荣青; 姚祝平; 周国治; 俞锞; 袁伟; 刘云飞

    2013-01-01

    Flavonoids are an important class of secondary metabolites in plants. They occur in fruits, vegetables, beans, tea and many other plants in bound form (flavonoid glycosides) or free form (flavonoid aglycones), and they play an important role in regulating plant growth and development. Chalcone synthase (CHS, EC 2.3.1.74) is the first key enzyme of the plant flavonoid biosynthesis pathway and plays a decisive role in regulating flavonoid biosynthesis and flavonoid composition. Based on the whole-genome sequence of tomato, we used bioinformatics methods to identify the members of the chalcone synthase gene family and analyzed their intron-exon structure, phylogenetic relationships, sequence conservation and chromosomal distribution. Chalcone synthase (SlCHS) is a multigene family with 8 members, whose encoded protein sequences range from 160 (SlCHS05) to 438 (SlCHS08) amino acids. Pairwise similarity ranges from 33.7% (SlCHS02 and SlCHS06) to 92.0% (SlCHS04 and SlCHS07), indicating high genetic diversity among these sequences. Structural analysis showed that all of these genes contain few introns (0 to 2), and sequence alignment showed that they are highly conserved. The genes are unevenly distributed on tomato chromosomes 1, 5, 6, 9 and 12. This study provides a reference for understanding the evolutionary origin of this gene family and lays a foundation for further functional analysis of its members.

  2. RNA-seq analysis to identify novel roles of scleraxis during embryonic mouse heart valve remodeling.

    Directory of Open Access Journals (Sweden)

    Damien N Barnette

    Full Text Available Heart valve disease affects up to 30% of the population and has been shown to have origins during embryonic development. Valvulogenesis begins with formation of endocardial cushions in the atrioventricular canal and outflow tract regions. Subsequently, endocardial cushions remodel, elongate and progressively form mature valve structures composed of a highly organized connective tissue that provides the necessary biomechanical function throughout life. While endocardial cushion formation has been well studied, the processes required for valve remodeling are less well understood. The transcription factor Scleraxis (Scx) is detected in mouse valves from E15.5 during initial stages of remodeling, and expression remains high until birth, when formation of the highly organized mature structure is complete. Heart valves from Scx-/- mice are abnormally thick and develop fibrotic phenotypes similar to human disease by juvenile stages. These phenotypes begin around E15.5 and are associated with defects in connective tissue organization and valve interstitial cell differentiation. In order to understand the etiology of this phenotype, we analyzed the transcriptome of remodeling valves isolated from E15.5 Scx-/- embryos using RNA-seq. From this, we have identified a profile of protein and non-protein mRNAs that are dependent on Scx function, and using bioinformatics we can predict the molecular functions and biological processes affected by these genes. These include processes and functions associated with gene regulation (methyltransferase activity, DNA binding, Notch signaling), vitamin A metabolism (retinoic acid biosynthesis) and cellular development (cell morphology, cell assembly and organization). In addition, several mRNAs are affected by alternative splicing events in the absence of Scx, suggesting additional roles in post-transcriptional modification. In summary, our findings have identified transcriptome profiles from abnormal heart valves isolated

  3. Website for avian flu information and bioinformatics

    Institute of Scientific and Technical Information of China (English)

    LIU Di; LIU Quan-He; WU Lin-Huan; LIU Bin; WU Jun; LAO Yi-Mei; LI Xiao-Jing; GAO George Fu; MA Jun-Cai

    2009-01-01

    Highly pathogenic influenza A virus H5N1 has spread worldwide and raised public concern. This has increased the output of influenza virus sequence data as well as of research publications and other reports. In order to fight H5N1 avian flu in a comprehensive way, we designed and began to set up the Website for Avian Flu Information (http://www.avian-flu.info) in 2004. Beyond making an influenza virus database available, the website aims to integrate diversified information for both researchers and the public. From 2004 to 2009, we collected information on all aspects, i.e. reports of outbreaks, scientific publications and editorials, policies for prevention, medicines and vaccines, and clinical practice and diagnosis. Except for publications, all information is in Chinese. By April 15, 2009, the cumulative number of news entries had exceeded 2000 and the number of research papers was approaching 5000. Using the curated data from the Influenza Virus Resource, we have set up an influenza virus sequence database and a bioinformatics platform that provide basic functions for the sequence analysis of influenza virus. We will focus on the collection of experimental data and results as well as the integration of data from the geographic information system and avian influenza epidemiology.

  5. Determination of the mechanism of action of repetitive halothane exposure on rat brain tissues using a combined method of microarray gene expression profiling and bioinformatics analysis.

    Science.gov (United States)

    Wang, Jiansheng; Yang, Xiaojun; Xiao, Huan; Kong, Jianqiang; Bing, Miao

    2015-12-01

    The present study aimed to investigate the gene expression profiles of rat brain tissues treated with halothane compared with untreated controls, to improve current understanding of the mechanism of action of the inhaled anesthetic. The GSE357 gene expression profile was downloaded from the Gene Expression Omnibus database, and included six gene chips of samples repeatedly exposed to halothane and 12 gene chips of untreated controls. The differentially expressed genes (DEGs) between these two groups were identified using the Limma package in R. Subsequently, the Database for Annotation, Visualization and Integrated Discovery was used to annotate the function of these DEGs. In addition, the most significantly upregulated gene and the most significantly downregulated gene were annotated in the FuncBase database to reveal their functional interactions with other associated genes. A total of 44 DEGs were obtained between the control and halothane exposure samples. Following Gene Ontology functional classification, these DEGs were found to be involved predominantly in the circulatory system, regulation of cell proliferation, and response to endogenous stimulus and corticosteroid stimulus processes. KRT31 and HMGCS2, which were identified as the most significantly downregulated and upregulated DEGs, respectively, were associated with the lipid metabolic process and T cell activation, respectively. These results provide a basis for the development of improved inhalational anesthetics with minimal side effects and are essential for the optimization of inhaled anesthetic techniques for advanced surgical procedures. PMID:26497548
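
    The idea behind the two-group comparison (treated chips against control chips) can be illustrated per gene with a plain Welch t-statistic. Note that this is a deliberately simplified stand-in: the study used the Limma package in R, which fits linear models with moderated statistics, and nothing below reproduces that method. The function name and the toy log2 expression values are assumptions for the example.

```python
import math
from statistics import mean, variance

def differential_expression(treated, control):
    """Return (log2 fold change, Welch t-statistic) for one gene.

    Inputs are log2 expression values per chip. This is a plain Welch test,
    not limma's moderated t-statistic.
    """
    log_fc = mean(treated) - mean(control)
    standard_error = math.sqrt(
        variance(treated) / len(treated) + variance(control) / len(control)
    )
    return log_fc, log_fc / standard_error

# Toy values for a single gene (hypothetical, not from GSE357).
log_fc, t_stat = differential_expression([5.0, 5.2, 4.8], [3.0, 3.1, 2.9, 3.0])
```

    In a real analysis the statistic would be computed for every probe and the resulting p-values corrected for multiple testing before calling DEGs.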

  6. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  7. Bioinformatics resources for cancer research with an emphasis on gene function and structure prediction tools

    Directory of Open Access Journals (Sweden)

    Daisuke Kihara

    2006-01-01

    Full Text Available The immensely popular fields of cancer research and bioinformatics overlap in many different areas, e.g. large data repositories that allow users to analyze data from many experiments (data handling, databases), pattern mining, microarray data analysis, and interpretation of proteomics data. There are many newly available resources in these areas that may be unfamiliar to most cancer researchers wanting to incorporate bioinformatics tools and analyses into their work, and also to bioinformaticians looking for real data to develop and test algorithms. This review reveals the interdependence of cancer research and bioinformatics, and highlights the most appropriate and useful resources available to cancer researchers. These include not only public databases, but also general and specific bioinformatics tools that can be useful to the cancer researcher. The primary foci are function and structure prediction tools of protein genes. The result is a useful reference for cancer researchers and for bioinformaticians studying cancer alike.

  8. Wnt-signalling pathways and microRNAs network in carcinogenesis: experimental and bioinformatics approaches.

    Science.gov (United States)

    Onyido, Emenike K; Sweeney, Eloise; Nateri, Abdolrahman Shams

    2016-01-01

    Over the past few years, microRNAs (miRNAs) have emerged not only as integral regulators of gene expression at the post-transcriptional level but also as responders to signalling molecules that affect cell function(s). miRNAs crosstalk with a variety of the key cellular signalling networks, such as Wnt, transforming growth factor-β and Notch, and control stem cell activity in maintaining tissue homeostasis, while, if dysregulated, they contribute to the initiation and progression of cancer. Herein, we overview the molecular mechanism(s) underlying the crosstalk between Wnt-signalling components (canonical and non-canonical) and miRNAs, as well as changes in the miRNA/Wnt-signalling components observed in the different forms of cancer. Furthermore, the fundamental understanding of miRNA-mediated regulation of the Wnt-signalling pathway and vice versa has been significantly improved by high-throughput genomics and bioinformatics technologies. Whilst these approaches have identified a number of specific miRNA(s) that function as oncogenes or tumour suppressors, additional analyses will be necessary to fully unravel the links among conserved cellular signalling pathways and miRNAs and their potential associated components in cancer, thereby creating therapeutic avenues against tumours. Hence, we also discuss the current challenges associated with the Wnt-signalling/miRNAs complex and its analysis using biomedical experimental and bioinformatics approaches. PMID:27590724

  9. Clone and Bioinformatics Analysis of Chinese-Belgium Rabbit Metallothionein-1 MT1, MT2 and MT3 Genes CDS Region

    Directory of Open Access Journals (Sweden)

    Lai Songjia

    2012-01-01

    Full Text Available Metallothioneins (MTs) play important roles in metal ion metabolism, detoxification of heavy metals and clearance of Reactive Oxygen Species (ROS). In this study, the Coding Sequences (CDS) of the MT1, MT2 and MT3 genes from the Chinese-Belgium rabbit were cloned and sequenced. The results showed that the CDS lengths of MT1, MT2 and MT3 were 186, 186 and 201 bp, encoding 61, 61 and 66 amino acids, respectively. The amino acid sequences of the MTs include 7 typical Cys-Xaa-Cys motifs and 20 cysteines, which form 10 disulfide bonds. Similarity analysis showed that the similarity of MT1 and MT2 reached 87.6%, the similarity of MT2 and MT3 was 78.0%, and the similarity of MT3 and MT1 was 76.3%. Alignment and cluster analysis of MT genes from different species suggest that MTs are highly conserved, but that the different subtypes of MT genes separated in the early stages of species evolution and then evolved individually.
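
    Counting Cys-Xaa-Cys motifs and cysteines in a protein sequence, as reported above, can be sketched with a regular expression. The function names and the toy peptide are hypothetical; the rabbit MT sequences themselves are not given in this record. A lookahead is used so that overlapping motifs (for example `CKCKC`) are each counted.

```python
import re

def count_cxc_motifs(protein):
    """Count Cys-Xaa-Cys motifs, including overlapping ones."""
    # The zero-width lookahead lets a cysteine end one motif and start the next.
    return len(re.findall(r"(?=C[A-Z]C)", protein.upper()))

def count_cysteines(protein):
    """Count cysteine residues (one-letter code C)."""
    return protein.upper().count("C")

toy_peptide = "MDPNCSCAAGCKCKC"   # hypothetical sequence for illustration only
motifs = count_cxc_motifs(toy_peptide)
cysteines = count_cysteines(toy_peptide)
```

    Whether overlapping motifs should be counted separately is a convention; the abstract does not say which one the authors used.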

  10. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  11. Bioinformatic Analysis and Prediction of miRNA-122a Target Genes

    Institute of Scientific and Technical Information of China (English)

    蒋永容; 李增鹏; 王东; 王阁; 陈川; 张志敏; 郑继军; 许文; 罗美林; 戴楠; 李梦侠; 杨宇馨

    2011-01-01

    Objective: To analyze miRNA expression in hepatocellular carcinoma HepG2 cells and normal liver epithelial LO2 cells using gene chip technology, and to predict the target genes of miRNA-122a, which is expressed at a low level in HepG2 cells, and analyze them bioinformatically, in order to provide a theoretical and experimental basis for gene therapy targeting miRNA-122a. Methods: The expression levels of miRNA-122a in HepG2 and LO2 cells were detected using gene chip technology. The target genes of miRNA-122a were predicted bioinformatically and subjected to functional enrichment analysis (GO analysis), signal transduction pathway enrichment analysis (pathway analysis) and protein interaction network analysis. Results: Compared with LO2 cells, miRNA-122a expression was low in HepG2 cells. A total of 1104 target genes were predicted for miRNA-122a. The functions of these target genes were enriched in biological processes including carbohydrate biosynthesis, nucleotide metabolism, cytokine receptor binding and the cell cycle (P<0.001). Signal transduction pathways were significantly enriched in the JAK-STAT, Wnt, MAPK and ErbB signaling pathways and the cell cycle pathway (P<0.001). Conclusion: miRNA-122a is expressed at a low level in HepG2 cells, and its predicted target genes are significantly enriched in signaling pathways associated with tumorigenesis.

  12. Assessing Reliability of Cellulose Hydrolysis Models to Support Biofuel Process Design – Identifiability and Uncertainty Analysis

    DEFF Research Database (Denmark)

    Sin, Gürkan; Meyer, Anne S.; Gernaey, Krist

    2010-01-01

    The reliability of cellulose hydrolysis models is studied using the NREL model. An identifiability analysis revealed that only 6 out of 26 parameters are identifiable from the available data (typical hydrolysis experiments). Attempting to identify a higher number of parameters (as done in the...

  13. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    Directory of Open Access Journals (Sweden)

    Gennady Verkhivker

    2013-11-01

    Full Text Available A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding the structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system lack quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to the transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) the kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications for developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery.

  14. Comparative Bioinformatics and Experimental Analysis of the Intergenic Regulatory Regions of Bacillus cereus hbl and nhe Enterotoxin Operons and the Impact of CodY on Virulence Heterogeneity.

    Science.gov (United States)

    Böhm, Maria-Elisabeth; Krey, Viktoria M; Jeßberger, Nadja; Frenzel, Elrike; Scherer, Siegfried

    2016-01-01

    Bacillus cereus is a food contaminant with greatly varying enteropathogenic potential. Almost all known strains harbor the genes for at least one of the three enterotoxins Nhe, Hbl, and CytK. While some strains show no cytotoxicity, others have caused outbreaks, in rare cases even with lethal outcome. The reason for these differences in cytotoxicity is unknown. To gain insight into the origin of enterotoxin expression heterogeneity in different strains, the architecture and role of 5' intergenic regions (5' IGRs) upstream of the nhe and hbl operons was investigated. In silico comparison of 142 strains of all seven phylogenetic groups of B. cereus sensu lato proved the presence of long 5' IGRs upstream of the nheABC and hblCDAB operons, which harbor recognition sites for several transcriptional regulators, including the virulence regulator PlcR, redox regulators ResD and Fnr, the nutrient-sensitive regulator CodY as well as the master regulator for biofilm formation SinR. By determining transcription start sites, unusually long 5' untranslated regions (5' UTRs) upstream of the nhe and hbl start codons were identified, which are not present upstream of cytK-1 and cytK-2. Promoter fusions lacking various parts of the nhe and hbl 5' UTR in B. cereus INRA C3 showed that the entire 331 bp 5' UTR of nhe is necessary for full promoter activity, while the presence of the complete 606 bp hbl 5' UTR lowers promoter activity. Repression was caused by a 268 bp sequence directly upstream of the hbl transcription start. Luciferase activity of reporter strains containing nhe and hbl 5' IGR lux fusions provided evidence that toxin gene transcription is upregulated by the depletion of free amino acids. Electrophoretic mobility shift assays showed that the branched-chain amino acid sensing regulator CodY binds to both nhe and hbl 5' UTR downstream of the promoter, potentially acting as a nutrient-responsive roadblock repressor of toxin gene transcription. PlcR binding sites are

  16. Bioinformatics analysis of NAC gene family in peach

    Institute of Scientific and Technical Information of China (English)

    张春华; 上官凌飞; 俞明亮; 张彦苹; 马瑞娟

    2012-01-01

    The NAC gene family is one of the largest families of plant-specific transcription factors and has attracted wide attention because of its diverse roles in plant development and in responses to stress. To provide basic information for further identification and functional analysis of NAC family genes in peach, bioinformatics methods were used to predict the number of members of the peach NAC gene family, their distribution on the genome scaffolds, their expression patterns, the structure of the putative proteins and their subfamily classification. The predictions showed that the peach NAC gene family contains 115 putative NAC proteins, which were clustered into 17 subfamilies and show a certain similarity to the NAC family genes of Arabidopsis thaliana. One NAC gene is located on scaffold 11, and the others are located on scaffolds 1 to 8 of the peach (Prunus persica) genome. Analysis of the primary structure showed that the molecular weight of the peach NAC proteins is positively correlated with the number of amino acids, that the great majority of the amino acids are hydrophilic, and that the isoelectric point shows no regular pattern among subfamilies. The secondary structures of all 115 proteins consist mainly of random coil, and most of their tertiary structures are similar. The largest number of NAC family genes was expressed in the pericarp, reaching 75%, whereas few were expressed in flower buds (1%).

  17. Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency

    Directory of Open Access Journals (Sweden)

    Yeh Cheng-Yu

    2009-12-01

    Full Text Available Abstract Background Prostate cancer is a leading cancer worldwide and is characterized by aggressive metastasis. Reflecting its clinical heterogeneity, prostate cancer presents at different stages and grades related to aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes involved in the disease process, the important gene regulatory relationships remain unclear. We present a computational method that automatically infers genetic regulatory networks from microarray data, using transcription factor analysis and conditional independence testing, to explore potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. Results To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to estimate the expression values. We applied web services technology to wrap bioinformatics toolkits and databases so as to automatically extract the promoter regions of DNA sequences and predict the transcription factors that regulate gene expression. We adopted a microarray dataset consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as the target dataset to evaluate our method. The predicted results identified possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and may promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may clinically explain the differentiation associated with high-grade cancer compared with low-grade cancer. Enhancer of Zeste Homolog 2 (EZH2), regulated by RUNX1 and STAT3, is correlated to the pathological stage
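
    The KNN step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a genes x samples matrix with NaN for missing values, a root-mean-square distance over the observed columns, and neighbours drawn only from complete rows; the function name knn_impute and these choices are illustrative assumptions, not the authors' implementation.

```python
import math


def knn_impute(rows, k=3):
    """Return a copy of `rows` (genes x samples) with NaNs filled in.

    Each missing value is replaced by the mean of that column over the
    k complete rows closest to the incomplete row (hypothetical helper).
    """
    complete = [r for r in rows if not any(math.isnan(v) for v in r)]
    filled = []
    for row in rows:
        missing = [c for c, v in enumerate(row) if math.isnan(v)]
        observed = [c for c, v in enumerate(row) if not math.isnan(v)]
        if not missing or not observed or not complete:
            filled.append(list(row))  # nothing to impute, or nothing to impute from
            continue

        def distance(other):
            # Root-mean-square difference over the columns observed in `row`.
            total = sum((row[c] - other[c]) ** 2 for c in observed)
            return math.sqrt(total / len(observed))

        nearest = sorted(complete, key=distance)[:k]
        new_row = list(row)
        for c in missing:
            new_row[c] = sum(n[c] for n in nearest) / len(nearest)
        filled.append(new_row)
    return filled
```

    In this sketch a gene with one missing sample borrows the value from the genes whose remaining samples look most similar, which is the general idea behind KNN imputation of expression data.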

  18. Provenance of e-Science Experiments - experience from Bioinformatics

    OpenAIRE

    Greenwood, M.; Goble, C.A.; Stevens, R. D.; Zhao, J.; Addis, M; Marvin, D; Moreau, L; Oinn, T.

    2003-01-01

    Like experiments performed at a laboratory bench, the data associated with an e-Science experiment are of reduced value if other scientists are not able to identify the origin, or provenance, of those data. Provenance information is essential if experiments are to be validated and verified by others, or even by those who originally performed them. In this article, we give an overview of our initial work on the provenance of bioinformatics e-Science experiments within myGrid. We use two kinds ...

  19. Cloning and bioinformatics analysis of hERGIC3, a novel lung cancer-related gene

    Institute of Scientific and Technical Information of China (English)

    耿娜娜; 吴明松; 郑翔; 刘兴宇; 李学英

    2014-01-01

    Objective To construct a prokaryotic expression vector for the hERGIC3 gene and analyze it by bioinformatics, so as to understand the biological function of the hERGIC3 protein more fully. Methods The open reading frame (ORF) DNA sequence of the human hERGIC3 gene was obtained by RT-PCR and inserted into the pGEM-T Easy vector; several software tools were used for bioinformatics analysis of the hERGIC3 protein. Results Restriction digestion showed that the inserted sequence was the target sequence, so the hERGIC3 gene was cloned successfully. Bioinformatics analysis showed that the hERGIC3 protein consists of 383 amino acid residues, with a relative molecular mass of 43.2 × 10³ and a theoretical isoelectric point of 5.68, and that the protein is fairly stable. hERGIC3 is a transmembrane protein with transmembrane regions at residues 20 to 42 and 345 to 367, extramembrane regions at residues 1 to 19 and 368 to 383, and an intramembrane region at residues 43 to 344. Its secondary structure contains 110 α-helices, 94 extended strands and 184 random coils, and there are 15 potential phosphorylation sites. hERGIC3 may participate in the biosynthesis of amino acids and coenzymes, in lipid and energy metabolism, and in protein translation and transport. hERGIC3 contains two conserved protein domains, ERGIC_N and COPII-coated_ERV, and may interact with proteins such as PPKCSH, ERGIC1, ERGIC2, COPA and PSMD11. Conclusion The hERGIC3 gene was cloned successfully, and the structure and function of hERGIC3 were analyzed in depth, laying a theoretical foundation for further study of the pathophysiological function of the hERGIC3 gene in lung cancer.

  20. A bioinformatics approach to marker development

    NARCIS (Netherlands)

    Tang, J.

    2008-01-01

    The thesis focuses on two bioinformatics research topics: the development of tools for an efficient and reliable identification of single nucleotide polymorphisms (SNPs) and polymorphic simple sequence repeats (SSRs) from expressed sequence tags (ESTs) (Chapter 2, 3 and 4), and the subsequent imple
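
    As a rough illustration of the SSR-detection step named in this abstract, the sketch below finds perfect tandem repeats in a single sequence with a regular expression. The function name find_ssrs and the thresholds (unit length 2 to 6, at least 4 repeats) are assumptions chosen for the example, not parameters taken from the thesis; detecting polymorphism would additionally require comparing repeat counts across ESTs from different genotypes.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so ACACACAC is reported as (AC)4 rather than (ACAC)2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits
```

    For example, find_ssrs("TTGACACACACACGGT") reports one dinucleotide repeat, (AC)5 starting at offset 3.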

  1. Evolution of web services in bioinformatics

    NARCIS (Netherlands)

    Neerincx, P.B.T.; Leunissen, J.A.M.

    2005-01-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformatic

  2. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  3. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  4. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  5. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  6. Evolutionary Computation Applications in Current Bioinformatics

    OpenAIRE

    Wang, Bing; Zhang, Xiang

    2010-01-01

    This chapter provides an overview of some bioinformatics tasks and the relevance of the evolutionary computation methods, especially GAs. There are two advantages of GA-based approaches. One is that GAs are easier to run in parallel than single trajectory search procedures, and therefore allow groups of processors to be utilized for a search. The other is

  7. Biochemical, Transcriptional, and Bioinformatic Analysis of Lipid Droplets from Seeds of Date Palm (Phoenix dactylifera L.) and Their Use as Potent Sequestration Agents against the Toxic Pollutant, 2,3,7,8-Tetrachlorinated Dibenzo-p-Dioxin

    Science.gov (United States)

    Hanano, Abdulsamie; Almousally, Ibrahem; Shaban, Mouhnad; Rahman, Farzana; Blee, Elizabeth; Murphy, Denis J.

    2016-01-01

    Contamination of aquatic environments with dioxins, the most toxic group of persistent organic pollutants (POPs), is a major ecological issue. Dioxins are highly lipophilic and bioaccumulate in fatty tissues of marine organisms used for seafood where they constitute a potential risk for human health. Lipid droplets (LDs) purified from date palm, Phoenix dactylifera, seeds were characterized and their capacity to extract dioxins from aquatic systems was assessed. The bioaffinity of date palm LDs toward 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), the most toxic congener of dioxins was determined. Fractioned LDs were spheroidal with mean diameters of 2.5 µm, enclosing an oil-rich core of 392.5 mg mL-1. Isolated LDs did not aggregate and/or coalesce unless placed in acidic media and were strongly associated with three major groups of polypeptides of relative mass 32–37, 20–24, and 16–18 kDa. These masses correspond to the LD-associated proteins, oleosins, caleosins, and steroleosins, respectively. Efficient partitioning of TCDD into LDs occurred with a coefficient of log KLB/w,TCDD = 7.528 ± 0.024; it was optimal at neutral pH and was dependent on the presence of the oil-rich core, but was independent of the presence of LD-associated proteins. Bioinformatic analysis of the date palm genome revealed nine oleosin-like, five caleosin-like, and five steroleosin-like sequences, with predicted structures having putative lipid-binding domains that match their LD stabilizing roles and use as bio-based encapsulation systems. Transcriptomic analysis of date palm seedlings exposed to TCDD showed strong up-regulation of several caleosin and steroleosin genes, consistent with increased LD formation. The results suggest that the plant LDs could be used in ecological remediation strategies to remove POPs from aquatic environments. Recent reports suggest that several fungal and algal species also use LDs to sequester both external and internally derived hydrophobic toxins

  8. Bioinformatic Analysis and Subcellular Localization of Solanum lycopersicum ARF2

    Institute of Scientific and Technical Information of China (English)

    冯媛媛; 侯佩; 李颖楠; 刘永胜

    2012-01-01

    Auxin response factors (ARFs) are important transcription factors involved in the auxin signal transduction pathway. To provide a basis for studying the function of tomato (Solanum lycopersicum) ARF2, the SlARF2 gene was cloned and its molecular features and subcellular localization were analyzed. The physicochemical properties and molecular features of the encoded protein were predicted by bioinformatic approaches, including physicochemical property, hydrophobicity, domain, phylogenetic tree and subcellular localization analyses. The full-length SlARF2 gene was amplified from tomato fruit cDNA by RT-PCR, and a binary expression vector, pBA-ARF2-YFP, fusing ARF2 to the yellow fluorescent protein (YFP) coding sequence was constructed. The recombinant plasmid was transformed into wild-type tomato by Agrobacterium-mediated transformation; the resulting T1 transgenic seeds were germinated, and root tips were examined by fluorescence microscopy to observe the distribution of the fusion protein in living cells. The bioinformatic analysis indicated that SlARF2 is a soluble protein rich in Ser, Leu, Gly and Pro that carries the typical domains of the ARF family, and that its amino acid sequence shares 70.08%, 66.94% and 60.87% homology with those of grape, cassava and Arabidopsis, respectively. Restriction digestion and sequencing confirmed that the pBA-ARF2-YFP fusion expression vector was constructed successfully, and PCR analysis showed that the fusion protein was expressed in the transgenic plants. Fluorescence microscopy showed that ARF2 is localized in the nucleus. These results indicate that the transcription factor SlARF2 is localized in the nucleus and plays an important role in tomato fruit development and ripening.

  9. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  10. Prediction and Bioinformatics Analysis of Human Gene Expression Profiling Regulated by Amifostine

    Institute of Scientific and Technical Information of China (English)

    杨波; 脱朝伟; 蔡力力; 迟小华; 卢学春; 张峰; 脱帅; 朱宏丽; 刘丽宏; 严江伟

    2011-01-01

    The objective of this study was to perform a bioinformatics analysis of the gene expression profile regulated by amifostine and to predict its novel potential biological functions, in order to provide direction and methods for further study of the pharmacological actions of amifostine. "Amifostine" was used as a keyword to search free internet-based gene expression databases, including GEO, the Affymetrix gene chip database, GenBank, SAGE, GeneCard, InterPro, ProtoNet, UniProt and BLOCKS, and the retrieved amifostine-regulated gene expression profiling data were subjected to validity testing, differential gene expression analysis, functional clustering and gene annotation. The results showed that only one amifostine-regulated gene expression profiling dataset was retrieved, from the GEO database (accession: GSE3212). After validity testing and differential expression analysis, a significant difference (p < 0.01) was found in only 2.14% of the whole genome (460/192000). Gene annotation analysis showed that 139 of the 460 genes were known genes, of which 77 were up-regulated and 62 were down-regulated. Of the 139 genes, 13 were newly expressed following amifostine treatment of K562 cells, whereas the expression of 5 genes was completely inhibited. Functional clustering divided the 139 genes into 11 categories, with biological functions involving hematopoietic and immunologic regulation, apoptosis and the cell cycle. It is concluded that bioinformatics methods can be applied to the analysis of gene expression profiles regulated by amifostine. Amifostine has a regulatory effect on the human gene expression profile, mainly in biological processes including hematopoiesis, immunologic regulation, apoptosis and the cell cycle. The effect of amifostine on human gene expression needs to be further verified experimentally.

  11. Comparative QTL mapping of resistance to sugarcane mosaic virus in maize based on bioinformatics

    Institute of Scientific and Technical Information of China (English)

    Xiangling LÜ; Xinhai LI; Chuanxiao XIE; Zhuanfang HAO; Hailian JI; Liyu SHI; Shihuang ZHANG

    2008-01-01

    The development of genomics and bioinformatics offers new tools for comparative gene mapping. In this paper, an integrated QTL map for sugarcane mosaic virus (SCMV) resistance in maize was constructed by compiling a total of 81 QTL loci available, using the Genetic Map IBM2 2005 Neighbors as reference. These 81 QTL loci were scattered on 7 chromosomes of maize, and most of them were clustered on chromosomes 3 and 6. By using the method of meta-analysis, we identified one "consensus QTL" on chromosome 3 covering a genetic distance of 6.44 cM, and two on chromosome 6 covering genetic distances of 16 cM and 27.48 cM, respectively. Four positional candidate resistance genes were identified within the "consensus QTL" on chromosome 3 via the strategy of comparative genomics. These results suggest that combining meta-analysis within a species with sequence homology comparison in a related model plant is an efficient approach to identify the major QTL and its candidate gene(s) for the target traits. The results of this study provide useful information for identifying and cloning the major gene(s) conferring resistance to SCMV in maize.

  12. A Numerical Procedure for Model Identifiability Analysis Applied to Enzyme Kinetics

    DEFF Research Database (Denmark)

    Daele, Timothy, Van; Van Hoey, Stijn; Gernaey, Krist;

    2015-01-01

    structure evaluation by assessing the local identifiability characteristics of the parameters. Moreover, such a procedure should be generic to make sure it can be applied independently of the structure of the model. We hereby apply a numerical identifiability approach which is based on the work of Walter...... and Pronzato (1997) and which can be easily set up for any type of model. In this paper the proposed approach is applied to the forward reaction rate of the enzyme kinetics proposed by Shin and Kim (1998). Structural identifiability analysis showed that no local structural model problems were occurring....... In contrast, the practical identifiability analysis revealed that high values of the forward rate parameter Vf led to identifiability problems. These problems were even more pronounced at higher substrate concentrations, which illustrates the importance of a proper experimental design to avoid...
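
    A numerical check of practical identifiability can be sketched as follows. This is only an illustration under stated assumptions: it uses the plain Michaelis-Menten rate law v = Vmax*S/(Km + S) as a stand-in (not the Shin and Kim kinetics), finite-difference sensitivities, and the collinearity of the two sensitivity columns as the diagnostic, which is one common choice and not necessarily the procedure of the paper. All function names are hypothetical.

```python
import math


def rate(params, s):
    """Michaelis-Menten rate for params = (vmax, km); a stand-in model."""
    vmax, km = params
    return vmax * s / (km + s)


def sensitivities(params, substrate, rel_step=1e-6):
    """Parameter-scaled sensitivities theta * dv/dtheta by central differences.

    Returns one column (list over substrate levels) per parameter.
    """
    columns = []
    for p in range(len(params)):
        h = params[p] * rel_step
        up, down = list(params), list(params)
        up[p] += h
        down[p] -= h
        columns.append(
            [params[p] * (rate(up, s) - rate(down, s)) / (2 * h) for s in substrate]
        )
    return columns


def collinearity(col_a, col_b):
    """Absolute cosine between two sensitivity columns.

    Values close to 1 mean the two parameters change the output in nearly
    the same way, so the data cannot separate them.
    """
    dot = sum(a * b for a, b in zip(col_a, col_b))
    norm_a = math.sqrt(sum(a * a for a in col_a))
    norm_b = math.sqrt(sum(b * b for b in col_b))
    return abs(dot) / (norm_a * norm_b)
```

    With this stand-in model, sampling only at substrate levels far below Km gives a collinearity close to 1 (only the ratio Vmax/Km is determined), whereas sampling across Km lowers it; this is the kind of experimental-design dependence the abstract points to.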

  13. SNPs Detection and Bioinformatics Analysis on 5' Regulatory Region of the Porcine ATGL Gene

    Institute of Scientific and Technical Information of China (English)

    华绪川; 张立凡; 蒋晓玲; 翟继鹏; 徐宁迎; 张金枝

    2011-01-01

    Adipose triglyceride lipase (ATGL) is the rate-limiting enzyme in the initial step of triglyceride hydrolysis, mainly catalyzing the hydrolysis of triglycerides to diglycerides, and plays a critical role in the lipolytic catabolism of stored fat in adipose tissue. In this study, a 1.2 kb fragment of the 5' regulatory region of the porcine ATGL gene was screened for SNPs and analyzed by bioinformatics in five pig breeds: Jinhua, Chalu black, Duroc, Large Yorkshire and Pietrain. Two completely linked mutations, g-845G→C and g-854T→C, were found in the region. Sequence analysis indicated that the region may contain a promoter and that both mutations could create or destroy potential transcription factor binding sites. Locus g-845G→C was genotyped in the five breeds by PCR-RFLP, and chi-square analysis showed that the distribution of the three genotypes differed highly significantly among breeds (P<0.01), which suggests that differences in fat traits among pig breeds may be related to mutations in the 5' regulatory region of the ATGL gene.

  14. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yongqun He

    2012-02-01

    Full Text Available Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of ten classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using cutting-edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  15. Consolidating metabolite identifiers to enable contextual and multi-platform metabolomics data analysis

    Directory of Open Access Journals (Sweden)

    Saito Kazuki

    2010-04-01

    Full Text Available Abstract Background Analysis of data from high-throughput experiments depends on the availability of well-structured data that describe the assayed biomolecules. Procedures for obtaining and organizing such meta-data on genes, transcripts and proteins have been streamlined in many data analysis packages, but are still lacking for metabolites. Chemical identifiers are notoriously incoherent, encompassing a wide range of different referencing schemes with varying scope and coverage. Online chemical databases use multiple types of identifiers in parallel but lack a common primary key for reliable database consolidation. Connecting identifiers of analytes found in experimental data with the identifiers of their parent metabolites in public databases can therefore be very laborious. Results Here we present a strategy and a software tool for integrating metabolite identifiers from local reference libraries and public databases that do not depend on a single common primary identifier. The program constructs groups of interconnected identifiers of analytes and metabolites to obtain a local metabolite-centric SQLite database. The created database can be used to map in-house identifiers and synonyms to external resources such as the KEGG database. New identifiers can be imported and directly integrated with existing data. Queries can be performed in a flexible way, both from the command line and from the statistical programming environment R, to obtain identifier mappings tailored to a data set. Conclusions Efficient cross-referencing of metabolite identifiers is a key technology for metabolomics data analysis. We provide a practical and flexible solution to this task and an open-source program, the metabolite masking tool (MetMask), available at http://metmask.sourceforge.net, that implements our ideas.
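
    The idea of building groups of interconnected identifiers without a common primary key can be sketched with a union-find structure. This is a generic illustration, not MetMask's actual code or schema: the function name group_identifiers and the prefixed identifier strings used in the usage note are assumptions made up for the example.

```python
from collections import defaultdict


def group_identifiers(links):
    """Group identifiers into connected components.

    `links` is an iterable of (id_a, id_b) pairs, each asserting that two
    identifiers refer to the same metabolite (hypothetical input format).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for identifier in parent:
        groups[find(identifier)].add(identifier)
    return list(groups.values())
```

    Given the illustrative links ("inhouse:A001", "kegg:C00031"), ("kegg:C00031", "cas:50-99-7") and ("inhouse:A002", "kegg:C00041"), the sketch returns two groups, so the in-house name A001 can be mapped to the CAS number even though no single record lists both.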

  16. Proteomic Analysis to Identify Tightly-Bound Cell Wall Protein in Rice Calli

    OpenAIRE

    Cho, Won Kyong; Hyun, Tae Kyung; Kumar, Dhinesh; Rim, Yeonggil; Chen, Xiong Yan; Jo, Yeonhwa; Kim, Suwha; Lee, Keun Woo; Park, Zee-Yong; Lucas, William J.; Kim, Jae-Yean

    2015-01-01

    Rice is a model plant widely used for basic and applied research programs. Plant cell wall proteins play key roles in a broad range of biological processes. However, presently, knowledge on the rice cell wall proteome is rudimentary in nature. In the present study, the tightly-bound cell wall proteome of rice callus cultured cells using sequential extraction protocols was developed using mass spectrometry and bioinformatics methods, leading to the identification of 1568 candidate proteins. Ba...

  17. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
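
    The MapReduce programming style referred to here can be illustrated with a small in-process sketch that counts k-mers in sequencing reads. It does not use Hadoop or any Hadoop API; the three functions only mimic the map, shuffle and reduce phases on a single machine, and all names are illustrative.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map phase: emit a (k-mer, 1) pair for every k-length window of a read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: collect all values emitted under the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: sum the values of each key independently."""
    return {key: sum(values) for key, values in grouped.items()}


def count_kmers(reads, k=3):
    """Run the three phases in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))
```

    Because each read is mapped independently and each key is reduced independently, both phases can be spread over many nodes, which is what a framework such as Hadoop automates together with fault tolerance.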

  18. Bioinformatics for saffron (Crocus sativus L.) improvement

    Directory of Open Access Journals (Sweden)

    Ghulam A. Parray

    2009-02-01

    Full Text Available Saffron (Crocus sativus L.) is a sterile triploid plant and belongs to the Iridaceae (Liliales, Monocots). Its genome is relatively large and poorly characterized. Bioinformatics can play an enormous technical role in the sequence-level structural characterization of saffron genomic DNA. Bioinformatics tools can also help in appreciating the extent of diversity of various geographic or genetic groups of cultivated saffron to infer relationships between groups and accessions. The characterization of the transcriptome of saffron stigmas is vital for throwing light on the molecular basis of flavor, color biogenesis, genomic organization and biology of the gynoecium of saffron. The information derived can be utilized for constructing biological pathways involved in the biosynthesis of principal components of saffron i.e., crocin, crocetin, safranal, picrocrocin and safchiA

  19. Discovery and Classification of Bioinformatics Web Services

    Energy Technology Data Exchange (ETDEWEB)

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  20. A Novel Approach for Bioinformatics Workflow Discovery

    Directory of Open Access Journals (Sweden)

    Walaa Nagy

    2014-11-01

    Full Text Available Workflow systems are typically a good fit for the explorative research of bioinformaticians. These systems can help bioinformaticians to design and run their experiments and to automatically capture and store the data generated at runtime. On the other hand, Web services are increasingly used as the preferred method for accessing and processing the information coming from the diverse life science sources. In this work we provide an efficient approach for creating bioinformatic workflows for all-service architecture systems (i.e., all system components are services). This architectural style simplifies user interaction with workflow systems and facilitates both the change of individual components and the addition of new components to adapt to other workflow tasks if required. We finally present a case study for the bioinformatics domain to elaborate the applicability of our proposed approach.

  1. Chapter 16: Text Mining for Translational Bioinformatics

    OpenAIRE

    Bretonnel Cohen, K; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. P...

  2. Genome bioinformatics of tomato and potato

    OpenAIRE

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  3. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    OpenAIRE

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of protein or DNA sequences. An accurate alignment can provide valuable information for experimentation on the newly found sequences. It is indispensable in basic research as well as in practical applic...
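
    To make the alignment task concrete, here is a minimal sketch of the classic Needleman-Wunsch dynamic programme for the optimal global alignment score. The scoring values (match +1, mismatch -1, linear gap -2) are arbitrary assumptions for illustration, and the sketch is a plain software version, not the hardware design discussed in the thesis.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of sequences a and b (linear gap penalty).

    Only two rows of the dynamic-programming table are kept in memory.
    """
    previous = [j * gap for j in range(len(b) + 1)]  # row 0: b aligned to gaps
    for i in range(1, len(a) + 1):
        current = [i * gap]  # column 0: a[:i] aligned to gaps
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(
                max(
                    previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                    previous[j] + gap,               # gap in b
                    current[j - 1] + gap,            # gap in a
                )
            )
        previous = current
    return previous[-1]
```

    Each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time; that independence is the property that parallel hardware implementations of sequence alignment typically exploit.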

  4. A Novel Approach for Bioinformatics Workflow Discovery

    OpenAIRE

    Walaa Nagy; Hoda M.O. Mokhtar

    2014-01-01

    Workflow systems are typically a good fit for the explorative research of bioinformaticians. These systems can help bioinformaticians to design and run their experiments and to automatically capture and store the data generated at runtime. On the other hand, Web services are increasingly used as the preferred method for accessing and processing the information coming from the diverse life science sources. In this work we provide an efficient approach for creating bioinformatic workflow for all-serv...

  5. Management and Marketing of Bioinformatics Tools

    OpenAIRE

    Sudhakar, R.; Gupta, Kuhu; Kumar, Sushant

    2013-01-01

    Bioinformatics can be defined as the conceptualization of biology, specifically molecular biology, and the application of techniques from multiple disciplines such as statistics, computer science and applied mathematics to analyze and understand the vast information related to molecular structures. Hence, its management becomes difficult. The manager must be thorough with the concepts of biology, genetic studies in particular, as well as information technology. Discussed below is the mana...

  6. Associations between Input and Outcome Variables in an Online High School Bioinformatics Instructional Program

    Science.gov (United States)

    Lownsbery, Douglas S.

    Quantitative data from a completed year of an innovative online high school bioinformatics instructional program were analyzed as part of a descriptive research study. The online instructional program provided the opportunity for high school students to develop content understandings of molecular genetics and to use sophisticated bioinformatics tools and methodologies to conduct authentic research. Quantitative data were analyzed to identify potential associations between independent program variables including implementation setting, gender, and student educational backgrounds and dependent variables indicating success in the program including completion rates for analyzing DNA clones and performance gains from pre-to-post assessments of bioinformatics knowledge. Study results indicate that understanding associations between student educational backgrounds and level of success may be useful for structuring collaborative learning groups and enhancing scaffolding and support during the program to promote higher levels of success for participating students.

  7. Bioinformatics-Driven Identification and Examination of Candidate Genes for Non-Alcoholic Fatty Liver Disease

    DEFF Research Database (Denmark)

    Banasik, Karina; Justesen, Johanne M.; Hornbak, Malene;

    2011-01-01

    Objective: Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes. Research Design and Methods: By integrating public database text mining, trans-organism protein...

  8. Why Choose This One? Factors in Scientists' Selection of Bioinformatics Tools

    Science.gov (United States)

    Bartlett, Joan C.; Ishimura, Yusuke; Kloda, Lorie A.

    2011-01-01

    Purpose: The objective was to identify and understand the factors involved in scientists' selection of preferred bioinformatics tools, such as databases of gene or protein sequence information (e.g., GenBank) or programs that manipulate and analyse biological data (e.g., BLAST). Methods: Eight scientists maintained research diaries for a two-week…

  9. Bioinformatics on the cloud computing platform Azure.

    Science.gov (United States)

    Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  10. Bioinformatic Comparison of Bacterial Secretomes

    Institute of Scientific and Technical Information of China (English)

    Catharine Song; Aseem Kumar; Mazen Saleh

    2009-01-01

    The rapidly increasing number of completed bacterial genomes provides a good opportunity to compare their proteomes. This study was undertaken to specifically compare and contrast their secretomes, the fraction of the proteome with predicted N-terminal signal sequences, both type Ⅰ and type Ⅱ. A total of 176 theoretical bacterial proteomes were examined using the ExProt program. Compared with the Gram-positives, the Gram-negative bacteria were found, on average, to contain a larger number of potential Sec-dependent sequences. In the Gram-negative bacteria but not in the others, there was a positive correlation between proteome size and secretome size, while there was no correlation between secretome size and pathogenicity. Within the Gram-negative bacteria, intracellular pathogens were found to have the smallest secretomes. However, the secretomes of certain bacteria did not fit into the observed pattern. Specifically, the secretome of Borrelia burgdorferi has an unusually large number of putative lipoproteins, and the signal peptides of mycoplasmas show closer sequence similarity to those of the Gram-negative bacteria. Our analysis also suggests that even for a theoretical minimal genome of 300 open reading frames, a fraction of this gene pool (up to a maximum of 20%) may code for proteins with Sec-dependent signal sequences.

  11. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu; Adebiyi, Marion; Ahmed, Azza E; Ahmed, Rehab I; Alearts, Maaike; Alibi, Mohamed; Aron, Shaun; Baichoo, Shakuntala; Bendou, Hocine; Botha, Gerrit; Brown, David; Chimusa, Emile; Christoffels, Alan; Cornick, Jennifer; Entfellner, Jean-Baka Domelevo; Fields, Chris; Fischer, Anne; Gamieldien, Junaid; Ghedira, Kais; Ghouila, Amel; Ho Sui, Shannan; Isewon, Itunuoluwa; Isokpehi, Raphael; Dashti, Mahjoubeh Jalali Sefid; Kamng'ona, Arox; Khetani, Radhika S; Kiran, Anmol; Kulohoma, Benard; Kumwenda, Benjamin; Lapine, Dan; Mainzer, Liudmila Sergeevna; Maslamoney, Suresh; Mbiyavanga, Mamana; Meintjes, Ayton; Mlyango, Flora Elias; Mmbando, Bruno; Mohammed, Somia A; Mpangase, Phelelani; Msefula, Chisomo; Mtatiro, Siana Nkya; Mugutso, Dunfunk; Mungloo-Dilmohammud, Zahra; Musicha, Patrick; Nembaware, Victoria; Osamor, Victor Chukwudi; Oyelade, Jelili; Rendon, Gloria; Salazar, Gustavo A; Salifu, Samson Pandam; Sangeda, Raphael; Souiai, Oussema; Van Heusden, Peter; Wele, Mamadou

    2016-02-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  12. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    Science.gov (United States)

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  13. Molecular Cloning, Expression Character and Bioinformatics Analysis of OsLEA19a from Rice

    Institute of Scientific and Technical Information of China (English)

    胡廷章; 吴应梅; 陈再刚; 黄小云

    2011-01-01

    Semi-quantitative RT-PCR analysis revealed that OsLEA19a, a late embryogenesis abundant (LEA) protein gene, was induced in rice seedlings by water deficit and salt stress, which suggested a role for OsLEA19a protein in water deficit and salt stress protection. The cDNA sequence of OsLEA19a was cloned from rice by RT-PCR. Bioinformatics analysis showed that OsLEA19a encoded a protein of 200 amino acids with a calculated molecular mass of 20.48 kDa and a theoretical pI of 5.89. OsLEA19a protein contains a Pfam:LEA_4 domain at positions 5-48 and three α-helical domains, with no β-sheet domain. In silico predictions showed OsLEA19a protein was strongly hydrophilic. The phylogenetic relationship between related group 3 LEA proteins from different plants was analyzed, which showed that OsLEA19a was more closely related to group 3 LEA proteins from monocots than from dicots. The amino acid sequence of OsLEA19a shows 48%-57% sequence identity with other members of group 3 LEA proteins from monocots.

  14. Identifying Innovative Interventions to Promote Healthy Eating Using Consumption-Oriented Food Supply Chain Analysis

    OpenAIRE

    Hawkes, Corinna, ed.

    2009-01-01

    The mapping and analysis of supply chains is a technique increasingly used to address problems in the food system. Yet such supply chain management has not yet been applied as a means of encouraging healthier diets. Moreover, most policies recommended to promote healthy eating focus on the consumer end of the chain. This article proposes a consumption-oriented food supply chain analysis to identify the changes needed in the food supply chain to create a healthier food environment, measured in...

  15. Bioinformatics Analysis on Gelsolin Gene of Taenia Solium Imago

    Institute of Scientific and Technical Information of China (English)

    申萍香; 王宇; 黄江

    2011-01-01

    Objective: To analyze the structure of the gelsolin gene from Taenia solium adults (Ts GEL) and the structure and function of the encoded protein. Methods: The gelsolin gene was identified among expressed sequence tags (ESTs) from a T. solium adult full-length cDNA plasmid library using bioinformatics analysis tools from the National Center for Biotechnology Information (NCBI) and the ExPASy server of the Swiss Institute of Bioinformatics, together with other analysis software. Its structure was analyzed, and the structural and functional characteristics of the encoded protein were predicted. Results: The identity and similarity of Ts GEL with the Acephalocystis granulosus gelsolin gene were 88% and 93%, respectively. The full length of Ts GEL was 1,514 bp. Its coding region spanned positions 167-1,263 and encoded 365 amino acids. The encoded protein did not contain any subcellular localization sequence but contained several potential phosphorylation sites. It was stable in solution. Conclusions: The full-length cDNA sequence of the gelsolin gene was screened from the cDNA plasmid library of T. solium adults by a bioinformatics method, and the structure and function of the gene and its encoded protein were predicted.

  16. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP) to explore the research foundations of Integrative Medical Insights (IMI) on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existential wellness. We anticipate that the IPCBP will require a unique international collaboration of academic institutions, researchers, and clinical practitioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  17. Identifying Effective Spelling Interventions Using a Brief Experimental Analysis and Extended Analysis

    Science.gov (United States)

    McCurdy, Merilee; Clure, Lynne F.; Bleck, Amanda A.; Schmitz, Stephanie L.

    2016-01-01

    Spelling is an important skill that is crucial to effective written communication. In this study, brief experimental analysis procedures were used to examine spelling instruction strategies (e.g., whole word correction; word study strategy; positive practice; and cover, copy, and compare) for four students. In addition, an extended analysis was…

  18. Web services at the European Bioinformatics Institute-2009.

    Science.gov (United States)

    McWilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-07-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, the APIs free the user from web page constraints and are ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use; they require no prior knowledge of the technology and no more than basic experience in programming. In the following we report new and updated services and briefly describe planned developments to be made available during the course of 2009-2010. PMID:19435877

  19. Using Latent Class Analysis to Identify Academic and Behavioral Risk Status in Elementary Students

    Science.gov (United States)

    King, Kathleen R.; Lembke, Erica S.; Reinke, Wendy M.

    2016-01-01

    Identifying classes of children on the basis of academic and behavior risk may have important implications for the allocation of intervention resources within Response to Intervention (RTI) and Multi-Tiered System of Support (MTSS) models. Latent class analysis (LCA) was conducted with a sample of 517 third grade students. Fall screening scores in…

  20. Identifying Skill Requirements for GIS Positions: A Content Analysis of Job Advertisements

    Science.gov (United States)

    Hong, Jung Eun

    2016-01-01

    This study identifies the skill requirements for geographic information system (GIS) positions, including GIS analysts, programmers/developers/engineers, specialists, and technicians, through a content analysis of 946 GIS job advertisements from 2007-2014. The results indicated that GIS job applicants need to possess high levels of GIS analysis…

  1. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis

    DEFF Research Database (Denmark)

    Voight, Benjamin F; Scott, Laura J; Steinthorsdottir, Valgerdur;

    2010-01-01

    By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combi...

  2. Identifying sustainability issues using participatory SWOT analysis - A case study of egg production in the Netherlands

    NARCIS (Netherlands)

    Mollenhorst, H.; Boer, de I.J.M.

    2004-01-01

    The aim of this paper was to demonstrate how participatory strengths, weaknesses, opportunities and threats (SWOT) analysis can be used to identify relevant economic, ecological and societal (EES) issues for the assessment of sustainable development. This is illustrated by the case of egg production

  3. Application of Genome-Wide Expression Analysis To Identify Molecular Markers Useful in Monitoring Industrial Fermentations

    OpenAIRE

    Higgins, Vincent J.; Rogers, Peter J.; Dawes, Ian W.

    2003-01-01

    Genome-wide expression analysis of an industrial strain of Saccharomyces cerevisiae identified the YOR387c and YGL258w homologues as highly inducible in zinc-depleted conditions. Induction was specific for zinc deficiency and was dependent on Zap1p. The results indicate that these sequences may be valuable molecular markers for detecting zinc deficiency in industrial fermentations.

  4. Genome-wide association scan meta-analysis identifies three loci influencing adiposity and fat distribution

    NARCIS (Netherlands)

    C.M. Lindgren (Cecilia); I.M. Heid (Iris); J.C. Randall (Joshua); C. Lamina (Claudia); V. Steinthorsdottir (Valgerdur); L. Qi (Lu); E.K. Speliotes (Elizabeth); G. Thorleifsson (Gudmar); C.J. Willer (Cristen); B.M. Herrera (Blanca); A.U. Jackson (Anne); N. Lim (Noha); P. Scheet (Paul); N. Soranzo (Nicole); N. Amin (Najaf); Y.S. Aulchenko (Yurii); J.C. Chambers (John); A. Drong (Alexander); J. Luan; H.N. Lyon (Helen); F. Rivadeneira Ramirez (Fernando); S. Sanna (Serena); N. Timpson (Nicholas); M.C. Zillikens (Carola); H.Z. Jing; P. Almgren (Peter); S. Bandinelli (Stefania); A.J. Bennett (Amanda); R.N. Bergman (Richard); L.L. Bonnycastle (Lori); S. Bumpstead (Suzannah); S.J. Chanock (Stephen); L. Cherkas (Lynn); P.S. Chines (Peter); L. Coin (Lachlan); C. Cooper (Charles); G. Crawford (Gabe); A. Doering (Angela); A. Dominiczak (Anna); A.S.F. Doney (Alex); S. Ebrahim (Shanil); P. Elliott (Paul); M.R. Erdos (Michael); K. Estrada Gil (Karol); L. Ferrucci (Luigi); G. Fischer (Guido); N.G. Forouhi (Nita); C. Gieger (Christian); H. Grallert (Harald); C.J. Groves (Christopher); S.M. Grundy (Scott); C. Guiducci (Candace); D. Hadley (David); A. Hamsten (Anders); A.S. Havulinna (Aki); A. Hofman (Albert); R. Holle (Rolf); J.W. Holloway (John); T. Illig (Thomas); B. Isomaa (Bo); L.C. Jacobs (Leonie); K. Jameson (Karen); P. Jousilahti (Pekka); F. Karpe (Fredrik); J. Kuusisto (Johanna); J. Laitinen (Jaana); G.M. Lathrop (Mark); D.A. Lawlor (Debbie); M. Mangino (Massimo); W.L. McArdle (Wendy); T. Meitinger (Thomas); M.A. Morken (Mario); A.P. Morris (Andrew); P. Munroe (Patricia); N. Narisu (Narisu); A. Nordström (Anna); B.A. Oostra (Ben); C.N.A. Palmer (Colin); F. Payne (Felicity); J. Peden (John); I. Prokopenko (Inga); F. Renström (Frida); A. Ruokonen (Aimo); V. Salomaa (Veikko); M.S. Sandhu (Manjinder); L.J. Scott (Laura); A. Scuteri (Angelo); K. Silander (Kaisa); K. Song (Kijoung); X. Yuan (Xin); H.M. Stringham (Heather); A.J. Swift (Amy); T. Tuomi (Tiinamaija); M. Uda (Manuela); P. Vollenweider (Peter); G. Waeber (Gérard); C. Wallace (Chris); G.B. Walters (Bragi); M.N. Weedon (Michael); J.C.M. Witteman (Jacqueline); C. Zhang (Cuilin); M. Caulfield (Mark); F.S. Collins (Francis); G.D. Smith; I.N.M. Day (Ian); P.W. Franks (Paul); A.T. Hattersley (Andrew); F.B. Hu (Frank); M.R. Jarvelin; A. Kong (Augustine); J.S. Kooner (Jaspal); M. Laakso (Markku); E. Lakatta (Edward); V. Mooser (Vincent); L. Peltonen (Leena Johanna); N.J. Samani (Nilesh); T.D. Spector (Timothy); D.P. Strachan (David); T. Tanaka (Toshiko); J. Tuomilehto (Jaakko); A.G. Uitterlinden (André); P. Tikka-Kleemola (Päivi); N.J. Wareham (Nick); H. Watkins (Hugh); D. Waterworth (Dawn); M. Boehnke (Michael); P. Deloukas (Panagiotis); L. Groop (Leif); D.J. Hunter (David); U. Thorsteinsdottir (Unnur); D. Schlessinger (David); H.E. Wichmann (Erich); T.M. Frayling (Timothy); G.R. Abecasis (Gonçalo); J.N. Hirschhorn (Joel); R.J.F. Loos (Ruth); J-A. Zwart (John-Anker); K.L. Mohlke (Karen); I. Barroso (Inês); M.I. McCarthy (Mark)

    2009-01-01

    To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evid

  5. Identifying Contingency Requirements using Obstacle Analysis on an Unpiloted Aerial Vehicle

    Science.gov (United States)

    Lutz, Robyn R.; Nelson, Stacy; Patterson-Hine, Ann; Frost, Chad R.; Tal, Doron

    2005-01-01

    This paper describes experience using Obstacle Analysis to identify contingency requirements on an unpiloted aerial vehicle. A contingency is an operational anomaly, and may or may not involve component failure. The challenges to this effort were: (1) rapid evolution of the system while operational, (2) incremental autonomy as capabilities were transferred from ground control to software control and (3) the eventual safety-criticality of such systems as they begin to fly over populated areas. The results reported here are preliminary but show that Obstacle Analysis helped (1) identify new contingencies that appeared as autonomy increased; (2) identify new alternatives for handling both previously known and new contingencies; and (3) investigate the continued validity of existing software requirements for contingency handling. Since many mobile, intelligent systems are built using a development process that poses the same challenges, the results appear to have applicability to other similar systems.

  6. Identifying the "Right Stuff": An Exploration-Focused Astronaut Job Analysis

    Science.gov (United States)

    Barrett, J. D.; Holland, A. W.; Vessey, W. B.

    2015-01-01

    Industrial and organizational (I/O) psychologists play a key role in NASA astronaut candidate selection through the identification of the competencies necessary to successfully engage in the astronaut job. A set of psychosocial competencies, developed by I/O psychologists during a prior job analysis conducted in 1996 and updated in 2003, were identified as necessary for individuals working and living in the space shuttle and on the International Space Station (ISS). This set of competencies applied to the space shuttle and applies to current ISS missions, but may not apply to longer-duration or long-distance exploration missions. With the 2015 launch of the first 12-month ISS mission and the shift in the 2020s to missions beyond low earth orbit, the type of missions that astronauts will conduct and the environment in which they do their work will change dramatically, leading to new challenges for these crews. To support future astronaut selection, training, and research, I/O psychologists in NASA's Behavioral Health and Performance (BHP) Operations and Research groups engaged in a joint effort to conduct an updated analysis of the astronaut job for current and future operations. This project will result in the identification of behavioral competencies critical to performing the astronaut job, along with relative weights for each of the identified competencies, through the application of job analysis techniques. While this job analysis is being conducted according to job analysis best practices, the project poses a number of novel challenges. These challenges include the need to identify competencies for multiple mission types simultaneously, to evaluate jobs that have no incumbents as they have never before been conducted, and working with a very limited population of subject matter experts. Given these challenges, under the guidance of job analysis experts, we used the following methods to conduct the job analysis and identify the key competencies for current and

  7. Research fronts analysis : A bibliometric to identify emerging fields of research

    Science.gov (United States)

    Miwa, Sayaka; Ando, Satoko

    Research fronts analysis identifies emerging areas of research through observing co-clustering in highly-cited papers. This article introduces the concept of research fronts analysis, explains its methodology and provides case examples. It also demonstrates developing research fronts in Japan by looking at the past winners of Thomson Reuters Research Fronts Awards. Research front analysis is currently being used by the Japanese government to determine new trends in science and technology. Information professionals can also utilize this bibliometric as a research evaluation tool.

  8. NCI60 cancer cell line panel data and RNAi analysis help identify EAF2 as a modulator of simvastatin and lovastatin response in HCT-116 cells.

    Directory of Open Access Journals (Sweden)

    Sevtap Savas

    Simvastatin and lovastatin are statins traditionally used for lowering serum cholesterol levels. However, there exists evidence indicating their potential chemotherapeutic characteristics in cancer. In this study, we used bioinformatic analysis of publicly available data in order to systematically identify the genes involved in resistance to cytotoxic effects of these two drugs in the NCI60 cell line panel. We used the pharmacological data available for all the NCI60 cell lines to classify simvastatin or lovastatin resistant and sensitive cell lines, respectively. Next, we performed whole-genome single marker case-control association tests for the lovastatin and simvastatin resistant and sensitive cells using their publicly available Affymetrix 125K SNP genomic data. The results were then evaluated using RNAi methodology. After correction of the p-values for multiple testing using False Discovery Rate, our results identified three genes (NRP1, COL13A1, MRPS31) and six genes (EAF2, ANK2, AKAP7, STEAP2, LPIN2, PARVB) associated with resistance to simvastatin and lovastatin, respectively. Functional validation using RNAi confirmed that silencing of EAF2 expression modulated the response of HCT-116 colon cancer cells to both statins. In summary, we have successfully utilized the publicly available data on the NCI60 cell lines to perform whole-genome association studies for simvastatin and lovastatin. Our results indicated genes involved in the cellular response to these statins and siRNA studies confirmed the role of the EAF2 in response to these drugs in HCT-116 colon cancer cells.
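
    The abstract above mentions correcting p-values for multiple testing using the False Discovery Rate but does not say which procedure was applied. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, the adjustment can be written as below. The function name `benjamini_hochberg` and the example p-values are hypothetical illustrations, not values from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: this is one common FDR procedure; the record above does
    not state which FDR method the authors actually used.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-marker p-values, for illustration only.
    example = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example))
```

    A marker would then be reported as significant when its adjusted value falls below the chosen FDR threshold (for example 0.05, itself an assumption here).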

  9. Metabolites production improvement by identifying minimal genomes and essential genes using flux balance analysis.

    Science.gov (United States)

    Salleh, Abdul Hakim Mohamed; Mohamad, Mohd Saberi; Deris, Safaai; Illias, Rosli Md

    2015-01-01

    With the advancement in metabolic engineering technologies, reconstruction of the genome of host organisms to achieve desired phenotypes can be made. However, due to the complexity and size of the genome scale metabolic network, significant components tend to be invisible. We proposed an approach to improve metabolite production that consists of two steps. First, we find the essential genes and identify the minimal genome by a single gene deletion process using Flux Balance Analysis (FBA) and second by identifying the significant pathway for the metabolite production using gene expression data. A genome scale model of Saccharomyces cerevisiae for production of vanillin and acetate is used to test this approach. The result has shown the reliability of this approach to find essential genes, reduce genome size and identify production pathway that can further optimise the production yield. The identified genes and pathways can be extendable to other applications especially in strain optimisation. PMID:26489144

  10. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam;

    2011-01-01

    comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one...

  11. Gene expression meta-analysis identifies metastatic pathways and transcription factors in breast cancer

    International Nuclear Information System (INIS)

    Metastasis is believed to progress in several steps including different pathways but the determination and understanding of these mechanisms is still fragmentary. Microarray analysis of gene expression patterns in breast tumors has been used to predict outcome in recent studies. Besides classification of outcome, these global expression patterns may reflect biological mechanisms involved in metastasis of breast cancer. Our purpose has been to investigate pathways and transcription factors involved in metastasis by use of gene expression data sets. We have analyzed 8 publicly available gene expression data sets. A global approach, 'gene set enrichment analysis' as well as an approach focusing on a subset of significantly differently regulated genes, GenMAPP, has been applied to rank pathway gene sets according to differential regulation in metastasizing tumors compared to non-metastasizing tumors. Meta-analysis has been used to determine overrepresentation of pathways and transcription factors targets, concordant deregulated in metastasizing breast tumors, in several data sets. The major findings are up-regulation of cell cycle pathways and a metabolic shift towards glucose metabolism reflected in several pathways in metastasizing tumors. Growth factor pathways seem to play dual roles; EGF and PDGF pathways are decreased, while VEGF and sex-hormone pathways are increased in tumors that metastasize. Furthermore, migration, proteasome, immune system, angiogenesis, DNA repair and several signal transduction pathways are associated to metastasis. Finally several transcription factors e.g. E2F, NFY, and YY1 are identified as being involved in metastasis. By pathway meta-analysis many biological mechanisms beyond major characteristics such as proliferation are identified. Transcription factor analysis identifies a number of key factors that support central pathways. Several previously proposed treatment targets are identified and several new pathways that may
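
    The abstract above refers to determining over-representation of pathways and transcription factor targets. The study itself used gene set enrichment analysis and GenMAPP; as a simpler stand-in, and only as a sketch, over-representation of a gene set in a list of differentially regulated genes is often scored with a one-sided hypergeometric test. The function name and all counts below are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric variable X.

    population: total number of genes measured
    successes:  genes in the population that belong to the pathway
    draws:      number of differentially regulated genes
    observed:   differentially regulated genes that are in the pathway

    Assumption: a plain over-representation test, not the GSEA or
    GenMAPP statistics named in the record above.
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 10 genes, 5 in the pathway, 5 selected, all 5 hits.
    print(hypergeom_upper_tail(10, 5, 5, 5))
```

    Small tail probabilities suggest that the pathway is over-represented among the selected genes; across many pathways, such p-values would normally be corrected for multiple testing.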

  12. A cross-species genetic analysis identifies candidate genes for mouse anxiety and human bipolar disorder

    Directory of Open Access Journals (Sweden)

    David G Ashbrook

    2015-07-01

    Full Text Available Bipolar disorder (BD) is a significant neuropsychiatric disorder with a lifetime prevalence of ~1%. To identify genetic variants underlying BD genome-wide association studies (GWAS) have been carried out. While many variants of small effect associated with BD have been identified few have yet been confirmed, partly because of the low power of GWAS due to multiple comparisons being made. Complementary mapping studies using murine models have identified genetic variants for behavioral traits linked to BD, often with high power, but these identified regions often contain too many genes for clear identification of candidate genes. In the current study we have aligned human BD GWAS results and mouse linkage studies to help define and evaluate candidate genes linked to BD, seeking to use the power of the mouse mapping with the precision of GWAS. We use quantitative trait mapping for open field test and elevated zero maze data in the largest mammalian model system, the BXD recombinant inbred mouse population, to identify genomic regions associated with these BD-like phenotypes. We then investigate these regions in whole genome data from the Psychiatric Genomics Consortium’s bipolar disorder GWAS to identify candidate genes associated with BD. Finally we establish the biological relevance and pathways of these genes in a comprehensive systems genetics analysis. We identify four genes associated with both mouse anxiety and human BD. While TNR is a novel candidate for BD, we can confirm previously suggested associations with CMYA5, MCTP1 and RXRG. A cross-species, systems genetics analysis shows that MCTP1, RXRG and TNR coexpress with genes linked to psychiatric disorders and identify the striatum as a potential site of action. CMYA5, MCTP1, RXRG and TNR are associated with mouse anxiety and human BD. We hypothesize that MCTP1, RXRG and TNR influence intercellular signaling in the striatum.

  13. A cross-species genetic analysis identifies candidate genes for mouse anxiety and human bipolar disorder.

    Science.gov (United States)

    Ashbrook, David G; Williams, Robert W; Lu, Lu; Hager, Reinmar

    2015-01-01

    Bipolar disorder (BD) is a significant neuropsychiatric disorder with a lifetime prevalence of ~1%. To identify genetic variants underlying BD genome-wide association studies (GWAS) have been carried out. While many variants of small effect associated with BD have been identified few have yet been confirmed, partly because of the low power of GWAS due to multiple comparisons being made. Complementary mapping studies using murine models have identified genetic variants for behavioral traits linked to BD, often with high power, but these identified regions often contain too many genes for clear identification of candidate genes. In the current study we have aligned human BD GWAS results and mouse linkage studies to help define and evaluate candidate genes linked to BD, seeking to use the power of the mouse mapping with the precision of GWAS. We use quantitative trait mapping for open field test and elevated zero maze data in the largest mammalian model system, the BXD recombinant inbred mouse population, to identify genomic regions associated with these BD-like phenotypes. We then investigate these regions in whole genome data from the Psychiatric Genomics Consortium's bipolar disorder GWAS to identify candidate genes associated with BD. Finally we establish the biological relevance and pathways of these genes in a comprehensive systems genetics analysis. We identify four genes associated with both mouse anxiety and human BD. While TNR is a novel candidate for BD, we can confirm previously suggested associations with CMYA5, MCTP1, and RXRG. A cross-species, systems genetics analysis shows that MCTP1, RXRG, and TNR coexpress with genes linked to psychiatric disorders and identify the striatum as a potential site of action. CMYA5, MCTP1, RXRG, and TNR are associated with mouse anxiety and human BD. We hypothesize that MCTP1, RXRG, and TNR influence intercellular signaling in the striatum. PMID:26190982

  14. A Survey of Bioinformatics Database and Software Usage through Mining the Literature

    Science.gov (United States)

    Duck, Geraint; Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L.; Stevens, Robert

    2016-01-01

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371. PMID:27331905
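The two concentration figures quoted in this abstract (the share of usage held by the top 5% of names, and the fraction of names mentioned only once) are simple to compute from a list of extracted mentions. The Python sketch below shows one way to do so; the function names and the toy mention list are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter

def top_share(mentions, fraction=0.05):
    """Share of all mentions accounted for by the most-used `fraction` of names."""
    counts = sorted(Counter(mentions).values(), reverse=True)
    k = max(1, math.ceil(len(counts) * fraction))  # at least one name
    return sum(counts[:k]) / sum(counts)

def singleton_fraction(mentions):
    """Fraction of distinct names that are mentioned exactly once."""
    counts = Counter(mentions).values()
    return sum(1 for c in counts if c == 1) / len(counts)

# Hypothetical toy data: four distinct resource names, ten mentions in total.
toy = ["BLAST"] * 6 + ["R"] * 2 + ["GO", "SWISS-PROT"]
print(top_share(toy, fraction=0.25), singleton_fraction(toy))
```

On real text-mined data the result would also depend on how name variants are normalised before counting, which this sketch leaves out.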

  15. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Science.gov (United States)

    Duck, Geraint; Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L; Stevens, Robert

    2016-01-01

    Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371. PMID:27331905

  16. Analysis of promoter regions of co-expressed genes identified by microarray analysis

    Directory of Open Access Journals (Sweden)

    Höglund Mattias

    2006-08-01

    Full Text Available Abstract Background The use of global gene expression profiling to identify sets of genes with similar expression patterns is rapidly becoming a widespread approach for understanding biological processes. A logical and systematic approach to study co-expressed genes is to analyze their promoter sequences to identify transcription factors that may be involved in establishing specific profiles and that may be experimentally investigated. Results We introduce promoter clustering, i.e. grouping of promoters with respect to their high scoring motif content, and show that this approach greatly enhances the identification of common and significant transcription factor binding sites (TFBS) in co-expressed genes. We apply this method to two different datasets, one consisting of microarray data from 108 leukemias (AMLs) and a second from a time series experiment, and show that biologically relevant promoter patterns may be obtained using phylogenetic foot-printing methodology. In addition, we also found that 15% of the analyzed promoter regions contained transcription factors start sites for additional genes transcribed in the opposite direction. Conclusion Promoter clustering based on global promoter features greatly improves the identification of shared TFBS in co-expressed genes. We believe that the outlined approach may be a useful first step to identify transcription factors that contribute to specific features of gene expression profiles.

  17. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    Science.gov (United States)

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  18. NMR structure improvement: A structural bioinformatics & visualization approach

    Science.gov (United States)

    Block, Jeremy N.

    The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available. The resulting better descriptions of macromolecular structure and its dynamic variation enhances the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.

  19. Integrative content-driven concepts for bioinformatics "beyond the cell"

    Indian Academy of Sciences (India)

    Edgar Wingender; Torsten Crass; Jennifer D Hogan; Alexander E Kel; Olga V Kel-Margoulis; Anatolij P Potapov

    2007-01-01

    Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global ('omics') approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks, thus laying the foundation for comprehensive description, analysis and manipulation of whole living systems in modern "systems biology". The next step which is necessary for developing a systems biology that deals with systemic phenomena is to expand the existing and develop new methodologies that are appropriate to characterize intercellular processes and interactions without omitting the causal underlying molecular mechanisms. Modelling the processes on the different levels of complexity involved requires a comprehensive integration of information on gene regulatory events, signal transduction pathways, protein interaction and metabolic networks as well as cellular functions in the respective tissues/organs.

  20. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  1. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what's going on in the biosciences. What's bioinformatics and why is all this fuss being made about it? What's this revolution triggered by the human genome project? Are there any results yet? What are the problems? What new avenues of research have been opened up? What about the technology? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similarities and contrasts will stimulate new curiosity and provoke new thoughts.

  2. PlantPAN: Plant promoter analysis navigator, for identifying combinatorial cis-regulatory elements with distance constraint in plant gene groups

    Directory of Open Access Journals (Sweden)

    Huang Hsien-Da

    2008-11-01

    Full Text Available Abstract Background The elucidation of transcriptional regulation in plant genes is an important area of research for plant scientists, following the mapping of various plant genomes, such as A. thaliana, O. sativa and Z. mays. A variety of bioinformatic servers or databases of plant promoters have been established, although most have been focused only on annotating transcription factor binding sites in a single gene and have neglected some important regulatory elements (tandem repeats and CpG/CpNpG islands) in promoter regions. Additionally, the combinatorial interaction of transcription factors (TFs) is important in regulating the gene group that is associated with the same expression pattern. Therefore, a tool for detecting the co-regulation of transcription factors in a group of gene promoters is required. Results This study develops a database-assisted system, PlantPAN (Plant Promoter Analysis Navigator), for recognizing combinatorial cis-regulatory elements with a distance constraint in sets of plant genes. The system collects the plant transcription factor binding profiles from the PLACE, TRANSFAC (public release 7.0), AGRIS, and JASPAR databases and allows users to input a group of gene IDs or promoter sequences, enabling the co-occurrence of combinatorial transcription factor binding sites (TFBSs) within a defined distance (20 bp to 200 bp) to be identified. Furthermore, the new resource enables other regulatory features in a plant promoter, such as CpG/CpNpG islands and tandem repeats, to be displayed. The regulatory elements in the conserved regions of the promoters across homologous genes are detected and presented. Conclusion In addition to providing a user-friendly input/output interface, PlantPAN has numerous advantages in the analysis of a plant promoter. Several case studies have established the effectiveness of PlantPAN. This novel analytical resource is now freely available at http://PlantPAN.mbc.nctu.edu.tw.
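The distance constraint described in this abstract can be pictured with a short Python sketch that pairs predicted binding sites of two transcription factors when their separation falls in a given window. This is only an illustration under simple assumptions (sites reduced to single integer positions on one promoter); it is not PlantPAN's implementation, and the function name and positions are hypothetical.

```python
from itertools import product

def cooccurring_sites(sites_a, sites_b, min_dist=20, max_dist=200):
    """Return (pos_a, pos_b) pairs whose separation in bp lies in
    [min_dist, max_dist]; positions are promoter coordinates."""
    return [
        (a, b)
        for a, b in product(sorted(sites_a), sorted(sites_b))
        if min_dist <= abs(a - b) <= max_dist
    ]

# Hypothetical site positions for two factors on one promoter.
print(cooccurring_sites([100, 500], [130, 110, 900]))
```

The pairwise scan is quadratic in the number of sites, which is usually acceptable for single promoters; a sorted two-pointer sweep would be a natural alternative for longer sequences.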

  3. Bioinformatics in crosslinking chemistry of collagen with selective cross linkers

    Directory of Open Access Journals (Sweden)

    Gopal Ramesh

    2011-10-01

    Full Text Available Abstract Background Identifying molecular interactions with bioinformatics tools before venturing into wet lab studies saves considerable energy and time. The present study summarizes molecular interactions and binding energy calculations made for the major structural protein, collagen of Type I and Type III, with the chosen cross-linkers, namely, coenzyme Q10, dopaquinone, embelin, embelin complex-1 & 2, idebenone, 5-O-methyl embelin, potassium embelate and vilangin. Results Molecular descriptive analyses suggest that dopaquinone, embelin, idebenone, 5-O-methyl embelin, and potassium embelate display nil violations. Docking analyses revealed that the best affinity for Type I (-4.74 kcal/mol) and Type III (-4.94 kcal/mol) collagen was with dopaquinone. Conclusions Among the selected cross-linkers, dopaquinone, embelin, potassium embelate and 5-O-methyl embelin were the suitable cross-linkers for both Type I and Type III collagen and stabilize the collagen at the expected level.

  4. Probabilistic approach to identify sensitive parameter distributions in multimedia pathway analysis.

    Energy Technology Data Exchange (ETDEWEB)

    Kamboj, S.; Gnanapragasam, E.; LePoire, D.; Biwer, B. M.; Cheng, J.; Arnish, J.; Yu, C.; Chen, S. Y.; Mo, T.; Abu-Eid, R.; Thaggard, M.; Environmental Assessment; NRC

    2002-01-01

    Sensitive parameter distributions were identified with the use of probabilistic analysis in the RESRAD computer code. RESRAD is a multimedia pathway analysis code designed to evaluate radiological exposures resulting from radiological contamination in soil. The dose distribution was obtained by using a set of default parameter distribution/values. Most of the variations in the output dose distribution could be attributed to uncertainty in a small set of input parameters that could be considered as sensitive parameter distributions. The identification of the sensitive parameters is a first step in the prioritization of future research and information gathering. When site-specific parameter distribution/values are available for an actual site, the same process should be used with these site-specific data. Regression analysis used to identify sensitive parameters indicated that the dominant pathways depended on the radionuclide and source configurations. However, two parameter distributions were sensitive for many radionuclides: the external shielding factor when external exposure was the dominant pathway and the plant transfer factor when plant ingestion was the dominant pathway. No single correlation or regression coefficient can be used alone to identify sensitive parameters in all the cases. The coefficients are useful guides, but they have to be used in conjunction with other aids, such as scatter plots, and should undergo further analysis.
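One of the coefficients that such a screening can use is the plain correlation between each sampled input and the output dose. The Python sketch below ranks inputs by the magnitude of that correlation. It is a generic illustration, not the RESRAD procedure; the parameter names and sample values are hypothetical, and, as the abstract cautions, a single coefficient should not be relied on alone.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0

def rank_parameters(samples, dose):
    """Sort input parameters by |correlation| with the output dose, largest first."""
    scores = {name: pearson(values, dose) for name, values in samples.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical sampled inputs and the corresponding computed doses.
samples = {"noise": [1, -1, -1, 1], "shielding_factor": [1, 2, 3, 4]}
print(rank_parameters(samples, dose=[2, 4, 6, 8]))
```

Linear correlation misses non-monotonic effects, which is one reason to pair it with scatter plots as the abstract recommends.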

  5. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    Science.gov (United States)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation performed on the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We use Fgenesh, GenScan and GeneMark for automated structural annotation; the results have been compared to the reference annotation.

  6. Bioinformatic and Expression Analysis of Rice Ubiquitin-conjugating Enzyme Gene Family

    Institute of Scientific and Technical Information of China (English)

    刘鑫; 张恒; 阚虎飞; 周立帅; 黄昊; 宋林林; 翟焕趁; 张君; 鲁国东

    2016-01-01

    The ubiquitin/proteasome system plays an important role in plant growth and development, morphogenesis and disease resistance. Recent studies have shown that some pathogens can mimic the host plant ubiquitin/proteasome system components to achieve their own purposes. Ubiquitin-conjugating enzyme is the second enzyme in the ubiquitination process and is indispensable for the plant ubiquitin/proteasome system. Previous studies showed that there are 48 predicted ubiquitin-conjugating enzyme genes in the rice genome. In order to preliminarily elucidate the functions of rice ubiquitin-conjugating enzyme genes in plant disease resistance, bioinformatic, RNA-seq and qRT-PCR methods were used to analyze characteristics and expression patterns of the rice ubiquitin-conjugating enzyme gene family. Phylogenetic tree analyses indicate that the 48 rice ubiquitin-conjugating enzyme genes can be divided into 3 groups, 7 sub-groups in total. Protein domain analysis showed that ubiquitin-conjugating enzyme genes mainly consist of a big ubiquitin-conjugating enzyme catalytic domain. Expression analysis in silico suggested that most of the rice ubiquitin-conjugating enzymes can be induced by blast fungus infection. Plant cis-acting elements analysis indicated that four pathogen resistance cis-acting elements and one hypersensitivity reaction cis-acting element have high distribution in the promoter region of the 48 rice ubiquitin-conjugating enzyme genes. RNA-seq data from compatible and incompatible monogenic rice after rice blast fungus infection showed that 44 rice ubiquitin-conjugating enzyme genes were expressed at 36 hours after treatment, among which more than 50% were highly expressed genes. qRT-PCR analysis showed that expression of some ubiquitin-conjugating enzyme genes can be induced by the inoculation of rice blast fungus both in compatible and incompatible monogenic rice. However, in incompatible rice the expression of rice ubiquitin

  7. Transcriptome analysis of recurrently deregulated genes across multiple cancers identifies new pan-cancer biomarkers

    DEFF Research Database (Denmark)

    Kaczkowski, Bogumil; Tanaka, Yuji; Kawaji, Hideya; Sandelin, Albin; Andersson, Robin; Itoh, Masayoshi; Lassmann, Timo; Hayashizaki, Yoshihide; Carninci, Piero; Forrest, Alistair R

    2015-01-01

    Genes that are commonly deregulated in cancer are clinically attractive as candidate pan-diagnostic markers and therapeutic targets. To globally identify such targets, we compared Cap Analysis of Gene Expression (CAGE) profiles from 225 different cancer cell lines and 339 corresponding primary cell...... samples to identify transcripts that are deregulated recurrently in a broad range of cancer types. Comparing RNA-seq data from 4,055 tumors and 563 normal tissues profiled in the TCGA and FANTOM5 datasets, we identified a core transcript set with theranostic potential. Our analyses also revealed enhancer...... RNAs which are upregulated in cancer, defining promoters which overlap with repetitive elements (especially SINE/Alu and LTR/ERV1 elements) that are often upregulated in cancer. Lastly, we documented for the first time upregulation of multiple copies of the REP522 interspersed repeat in cancer. Overall...

  8. Identifying Population Groups with Low Palliative Care Program Enrolment Using Classification and Regression Tree Analysis

    Science.gov (United States)

    Gao, Jun; Lavergne, M. Ruth; McIntyre, Paul

    2013-01-01

    Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer between 2000 and 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors. PMID:21805944
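Recursive partitioning, as used by CART, repeatedly chooses the cut-point on a predictor that best separates the outcome classes. The Python sketch below shows a single such step for one numeric predictor, using weighted Gini impurity as the splitting criterion. The criterion, the function names and the toy ages and enrolment labels are assumptions made for illustration; they are not data or settings from the study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the split `value <= threshold`
    that minimises impurity; threshold is None if no split improves it."""
    best = (None, gini(labels))
    n = len(values)
    for t in sorted(set(values))[:-1]:  # the largest value cannot split
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical ages and enrolment flags (1 = enrolled).
print(best_split([60, 70, 80, 85, 90, 95], [1, 1, 1, 0, 0, 0]))
```

A full tree would apply the same search to every predictor and then recurse into each resulting subgroup until a stopping rule is met.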

  9. Emergent team roles in organizational meetings: Identifying communication patterns via cluster analysis.

    OpenAIRE

    Lehmann-Willenbrock, N.K.; Beck, S.J.; Kauffeld, S.

    2016-01-01

    Previous team role taxonomies have largely relied on self-report data, focused on functional roles, and described individual predispositions or personality traits. Instead, this study takes a communicative approach and proposes that team roles are produced, shaped, and sustained in communicative behaviors. To identify team roles communicatively, 59 regular organizational meetings were videotaped and analyzed. Cluster analysis revealed five emergent roles: the solution seeker, the problem anal...

  10. Identifying Gender-Preferred Communication Styles within Online Cancer Communities: A Retrospective, Longitudinal Analysis

    OpenAIRE

    Durant, Kathleen T.; McCray, Alexa T.; Safran, Charles

    2012-01-01

    BACKGROUND: The goal of this research is to determine if different gender-preferred social styles can be observed within the user interactions at an online cancer community. To achieve this goal, we identify and measure variables that pertain to each gender-specific social style. METHODS AND FINDINGS: We perform social network and statistical analysis on the communication flow of 8,388 members at six different cancer forums over eight years. Kruskal-Wallis tests were conducted to measure the ...

  11. Network analysis identifies protein clusters of functional importance in juvenile idiopathic arthritis

    OpenAIRE

    Stevens, Adam; Meyer, Stefan; Hanson, Daniel; Clayton, Peter; Donn, Rachelle

    2014-01-01

    Introduction Our objective was to utilise network analysis to identify protein clusters of greatest potential functional relevance in the pathogenesis of oligoarticular and rheumatoid factor negative (RF-ve) polyarticular juvenile idiopathic arthritis (JIA). Methods JIA genetic association data were used to build an interactome network model in BioGRID 3.2.99. The top 10% of this protein:protein JIA Interactome was used to generate a minimal essential network (MEN). Reactome FI Cytoscape 2.83...

  12. Integrative Omics Analysis of Rheumatoid Arthritis Identifies Non-Obvious Therapeutic Targets

    OpenAIRE

    Whitaker, John W.; Boyle, David L.; Bartok, Beatrix; Ball, Scott T.; Gay, Steffen; Wang, Wei; Firestein, Gary S.

    2015-01-01

    Identifying novel therapeutic targets for the treatment of disease is challenging. To this end, we developed a genome-wide approach of candidate gene prioritization. We independently collocated sets of genes that were implicated in rheumatoid arthritis (RA) pathogenicity through three genome-wide assays: (i) genome-wide association studies (GWAS), (ii) differential expression in RA fibroblast-like synoviocytes (FLS), and (iii) differential methylation in RA FLS. Integrated analysis of the...

  13. Integrative omics analysis of rheumatoid arthritis identifies non-obvious therapeutic targets

    OpenAIRE

    Whitaker, John W.; Boyle, David L.; Bartok, Beatrix; Ball, Scott T.; Gay, Steffen; Wang, Wei; Firestein, Gary S.

    2015-01-01

    Identifying novel therapeutic targets for the treatment of disease is challenging. To this end, we developed a genome-wide approach of candidate gene prioritization. We independently collocated sets of genes that were implicated in rheumatoid arthritis (RA) pathogenicity through three genome-wide assays: (i) genome-wide association studies (GWAS), (ii) differential expression in RA fibroblast-like synoviocytes (FLS), and (iii) differential methylation in RA FLS. Integrated analysis of the...

  14. Robust Microarray Meta-Analysis Identifies Differentially Expressed Genes for Clinical Prediction

    OpenAIRE

    Phan, John H.; Young, Andrew N.; Wang, May D.

    2012-01-01

    Combining multiple microarray datasets increases sample size and leads to improved reproducibility in identification of informative genes and subsequent clinical prediction. Although microarrays have increased the rate of genomic data collection, sample size is still a major issue when identifying informative genetic biomarkers. Because of this, feature selection methods often suffer from false discoveries, resulting in poorly performing predictive models. We develop a simple meta-analysis-ba...

  15. Identifying patterns in treatment response profiles in acute bipolar mania: a cluster analysis approach

    OpenAIRE

    Houston John P; Lipkovich Ilya A; Ahl Jonna

    2008-01-01

    Abstract Background Patients with acute mania respond differentially to treatment and, in many cases, fail to obtain or sustain symptom remission. The objective of this exploratory analysis was to characterize response in bipolar disorder by identifying groups of patients with similar manic symptom response profiles. Methods Patients (n = 222) were selected from a randomized, double-blind study of treatment with olanzapine or divalproex in bipolar I disorder, manic or mixed episode, with or w...

  16. Automated Source Code Analysis to Identify and Remove Software Security Vulnerabilities: Case Studies on Java Programs

    OpenAIRE

    2013-01-01

    The high-level contribution of this paper is to illustrate the development of generic solution strategies to remove software security vulnerabilities that could be identified using automated tools for source code analysis on software programs (developed in Java). We use the Source Code Analyzer and Audit Workbench automated tools, developed by HP Fortify Inc., for our testing purposes. We present case studies involving a file writer program embedded with features for password validation, and ...

  17. System reliability analysis using dominant failure modes identified by selective searching technique

    International Nuclear Information System (INIS)

    The failure of a redundant structural system is often described by innumerable system failure modes such as combinations or sequences of local failures. An efficient approach is proposed to identify dominant failure modes in the space of random variables, and then perform system reliability analysis to compute the system failure probability. To identify dominant failure modes in the decreasing order of their contributions to the system failure probability, a new simulation-based selective searching technique is developed using a genetic algorithm. The system failure probability is computed by a multi-scale matrix-based system reliability (MSR) method. Lower-scale MSR analyses evaluate the probabilities of the identified failure modes and their statistical dependence. A higher-scale MSR analysis evaluates the system failure probability based on the results of the lower-scale analyses. Three illustrative examples demonstrate the efficiency and accuracy of the approach through comparison with existing methods and Monte Carlo simulations. The results show that the proposed method skillfully identifies the dominant failure modes, including those neglected by existing approaches. The multi-scale MSR method accurately evaluates the system failure probability with statistical dependence fully considered. The decoupling between the failure mode identification and the system reliability evaluation allows for effective applications to larger structural systems.

  18. Empowered genome community: leveraging a bioinformatics platform as a citizen-scientist collaboration tool.

    Science.gov (United States)

    Wendelsdorf, Katherine; Shah, Sohela

    2015-09-01

    There is an ongoing effort in the biomedical research community to leverage Next Generation Sequencing (NGS) technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals, either sick or healthy, to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype, as well as to annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC) is a new program in which QIAGEN is making this online tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen-scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public. PMID:27054071

  19. Empowered genome community: leveraging a bioinformatics platform as a citizen–scientist collaboration tool

    Directory of Open Access Journals (Sweden)

    Katherine Wendelsdorf

    2015-09-01

    Full Text Available There is an ongoing effort in the biomedical research community to leverage Next Generation Sequencing (NGS) technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals, either sick or healthy, to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype, as well as to annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC) is a new program in which QIAGEN is making this online tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen–scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public.

  20. Hot spot analysis applied to identify ecosystem services potential in Lithuania

    Science.gov (United States)

    Pereira, Paulo; Depellegrin, Daniel; Misiune, Ieva

    2016-04-01

    Hot spot analysis is very useful for identifying areas with similar characteristics. This is important for sustainable use of the territory, since it identifies areas that need to be protected or restored. This is a great advantage for land use planning and management, since it helps to allocate resources, reduce economic costs and intervene more effectively in the landscape. Ecosystem services (ES) differ according to land use. Since the landscape is very heterogeneous, it is of major importance to understand their spatial pattern and to locate the areas that provide more ES and those that provide fewer. The objective of this work is to use hot spot analysis to identify areas with the most valuable ES in Lithuania. The CORINE land cover (CLC) of 2006 was used as the main spatial information. This classification uses a grid of 100 m resolution, from which a total of 31 land use types were extracted. ES ranking was carried out based on expert knowledge: experts were asked to evaluate the ES potential of each CLC class from 0 (no potential) to 5 (very high potential). Hot spots were evaluated using the Getis-Ord test, a cluster analysis tool available in the ArcGIS toolbox. This tool identifies areas with significantly low values and significantly high values at a p level of 0.05. In this work we used hot spot analysis to assess the distribution of providing, regulating, cultural and total (sum of the previous 3) ES. The Z value calculated from Getis-Ord was used in statistical analysis to assess the clusters of providing, regulating, cultural and total ES. ES with a high Z value have a high number of cluster areas with high ES potential. The results showed that the Z-score was significantly different among services (Kruskal-Wallis ANOVA = 834.607, p [...] cultural (0.080±1.979) and regulating (0.076±1.961). These results suggested that providing services are more clustered than the remaining. Ecosystem Services Z score were
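
    The Getis-Ord hot spot statistic mentioned in this abstract can be illustrated with a minimal sketch. It computes the standard Gi* z-score on a small grid; the 5 x 5 grid of expert scores, the queen-contiguity binary weights and the function names are hypothetical choices for illustration, not the study's data or its ArcGIS configuration.

```python
import math


def queen_weights(rows, cols):
    """Binary weights: each cell is a neighbour of itself and of its 8 adjacent cells."""
    n = rows * cols
    weights = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        weights[i][rr * cols + cc] = 1.0
    return weights


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score of every location (0.0 where it is undefined)."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of all values, as used in the Gi* formula.
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for w in weights:
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


# Hypothetical ES potential scores (0 to 5): a high-value block in the top-left corner.
ROWS, COLS = 5, 5
grid = [5 if r < 3 and c < 3 else 1 for r in range(ROWS) for c in range(COLS)]
z = getis_ord_gi_star(grid, queen_weights(ROWS, COLS))

# Cells with z > 1.96 are hot spots at the 0.05 level (two-sided).
hot_spots = [i for i, score in enumerate(z) if score > 1.96]
```

    In this toy grid the centre of the high-value block (index 6) is flagged as a hot spot, while cells surrounded by low values receive negative z-scores.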

  1. Identification of novel genes and pathways in carotid atheroma using integrated bioinformatic methods.

    Science.gov (United States)

    Nai, Wenqing; Threapleton, Diane; Lu, Jingbo; Zhang, Kewei; Wu, Hongyuan; Fu, You; Wang, Yuanyuan; Ou, Zejin; Shan, Lanlan; Ding, Yan; Yu, Yanlin; Dai, Meng

    2016-01-01

    Atherosclerosis is the primary cause of cardiovascular events and its molecular mechanism urgently needs to be clarified. In our study, atheromatous plaques (ATH) and macroscopically intact tissue (MIT) sampled from 32 patients were compared and an integrated series of bioinformatic microarray analyses were used to identify altered genes and pathways. Our work showed 816 genes were differentially expressed between ATH and MIT, including 443 that were up-regulated and 373 that were down-regulated in ATH tissues. GO functional-enrichment analysis for differentially expressed genes (DEGs) indicated that genes related to the "immune response" and "muscle contraction" were altered in ATHs. KEGG pathway-enrichment analysis showed that up-regulated DEGs were significantly enriched in the "FcεRI-mediated signaling pathway", while down-regulated genes were significantly enriched in the "transforming growth factor-β signaling pathway". Protein-protein interaction network and module analysis demonstrated that VAV1, SYK, LYN and PTPN6 may play critical roles in the network. Additionally, similar observations were seen in a validation study where SYK, LYN and PTPN6 were markedly elevated in ATH. All in all, identification of these genes and pathways not only provides new insights into the pathogenesis of atherosclerosis, but may also aid in the development of prognostic and therapeutic biomarkers for advanced atheroma. PMID:26742467
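
    Functional-enrichment analyses of this kind are commonly scored with a one-sided hypergeometric test. The sketch below shows that calculation under stated assumptions: only the figure of 816 differentially expressed genes comes from the abstract, while the background size, the term size and the overlap are hypothetical placeholders, and the abstract does not say which test the authors used.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# 816 DEGs is taken from the abstract; the other three numbers are hypothetical.
p = enrichment_p_value(population=20000, annotated=1500, selected=816, overlap=120)
```

    A small p-value indicates that the term is over-represented among the selected genes; in practice the p-values for all tested terms would also be corrected for multiple testing.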

  2. Identifying Innovative Interventions to Promote Healthy Eating Using Consumption-Oriented Food Supply Chain Analysis.

    Science.gov (United States)

    Hawkes, Corinna

    2009-07-01

    The mapping and analysis of supply chains is a technique increasingly used to address problems in the food system. Yet such supply chain management has not yet been applied as a means of encouraging healthier diets. Moreover, most policies recommended to promote healthy eating focus on the consumer end of the chain. This article proposes a consumption-oriented food supply chain analysis to identify the changes needed in the food supply chain to create a healthier food environment, measured in terms of food availability, prices, and marketing. Along with established forms of supply chain analysis, the method is informed by a historical overview of how food supply chains have changed over time. The method posits that the actors and actions in the chain are affected by organizational, financial, technological, and policy incentives and disincentives, which can in turn be levered for change. It presents a preliminary example of the supply of Coca-Cola beverages into school vending machines and identifies further potential applications. These include fruit and vegetable supply chains, local food chains, supply chains for health-promoting versions of food products, and identifying financial incentives in supply chains for healthier eating. PMID:23144674

  3. Bioinformatics and the Politics of Innovation in the Life Sciences

    Science.gov (United States)

    Zhou, Yinhua; Datta, Saheli; Salter, Charlotte

    2016-01-01

    The governments of China, India, and the United Kingdom are unanimous in their belief that bioinformatics should supply the link between basic life sciences research and its translation into health benefits for the population and the economy. Yet at the same time, as ambitious states vying for position in the future global bioeconomy they differ considerably in the strategies adopted in pursuit of this goal. At the heart of these differences lies the interaction between epistemic change within the scientific community itself and the apparatus of the state. Drawing on desk-based research and thirty-two interviews with scientists and policy makers in the three countries, this article analyzes the politics that shape this interaction. From this analysis emerges an understanding of the variable capacities of different kinds of states and political systems to work with science in harnessing the potential of new epistemic territories in global life sciences innovation.

  4. Technical Perspectives on Knowledge Management in Bioinformatics Workflow Systems

    Directory of Open Access Journals (Sweden)

    Walaa N. Ismail

    2015-01-01

    Full Text Available Workflow systems by their nature can help bioinformaticians plan their experiments and store, capture and analyze the data generated at runtime. On the other hand, life science research produces new knowledge at an increasing speed; dealing with knowledge such as papers, databases and other systems is a complex task that requires much effort and time. The management of knowledge is therefore an important issue for life scientists. Approaches are being developed to organize biological knowledge sources and to record the provenance knowledge of an experiment as a readily usable resource. This article focuses on the knowledge management of in silico experimentation in bioinformatics workflow systems.

  5. Gene expression signature analysis identifies vorinostat as a candidate therapy for gastric cancer.

    Directory of Open Access Journals (Sweden)

    Sofie Claerhout

    Full Text Available BACKGROUND: Gastric cancer continues to be one of the deadliest cancers in the world, and identification of new drugs targeting this type of cancer is therefore of significant importance. The purpose of this study was to identify and validate a therapeutic agent which might improve the outcomes for gastric cancer patients in the future. METHODOLOGY/PRINCIPAL FINDINGS: Using microarray technology, we generated a gene expression profile of human gastric cancer-specific genes from human gastric cancer tissue samples. We used this profile in the Broad Institute's Connectivity Map analysis to identify candidate therapeutic compounds for gastric cancer. We found the histone deacetylase inhibitor vorinostat as the lead compound and thus a potential therapeutic drug for gastric cancer. Vorinostat induced both apoptosis and autophagy in gastric cancer cell lines. Pharmacological and genetic inhibition of autophagy, however, increased the therapeutic efficacy of vorinostat, indicating that a combination of vorinostat with autophagy inhibitors may be therapeutically more beneficial. Moreover, gene expression analysis of gastric cancer identified a collection of genes (ITGB5, TYMS, MYB, APOC1, CBX5, PLA2G2A, and KIF20A) whose expression was elevated in gastric tumor tissue and downregulated more than 2-fold by vorinostat treatment in gastric cancer cell lines. In contrast, SCGB2A1, TCN1, CFD, APLP1, and NQO1 manifested a reversed pattern. CONCLUSIONS/SIGNIFICANCE: We showed that analysis of gene expression signature may represent an emerging approach to discover therapeutic agents for gastric cancer, such as vorinostat. The observation of altered gene expression after vorinostat treatment may provide the clue to identify the molecular mechanism of vorinostat and those patients likely to benefit from vorinostat treatment.

  6. Metagenomic analysis of viruses associated with field-grown and retail lettuce identifies human and animal viruses.

    Science.gov (United States)

    Aw, Tiong Gim; Wengert, Samantha; Rose, Joan B

    2016-04-16

    The emergence of culture- and sequence-independent metagenomic methods has not only provided great insight into the microbial community structure in a wide range of clinical and environmental samples; these methods have also proven to be powerful tools for pathogen detection. Recent studies of the food microbiome have revealed the vast genetic diversity of bacteria associated with fresh produce. However, no work has been done to apply metagenomic methods to tackle viruses associated with fresh produce for addressing food safety. Thus, there is little knowledge about the presence and diversity of viruses associated with fresh produce from farm to fork. To address this knowledge gap, we assessed viruses on commercial romaine and iceberg lettuces in fields and a produce distribution center using shotgun metagenomic sequencing targeting both RNA and DNA viruses. Commercial lettuce harbors an immense assemblage of viruses that infect a wide range of hosts. As expected, plant pathogenic viruses dominated these communities. Sequences of rotaviruses and picobirnaviruses were also identified in both field-harvest and retail lettuce samples, suggesting an emerging foodborne transmission threat that has yet to be fully recognized. The identification of human and animal viruses in lettuce samples in the field emphasizes the importance of preventing viral contamination on leafy greens starting at the field. Although there are still some inherent experimental and bioinformatics challenges in applying viral metagenomic approaches for food safety testing, this work will facilitate further application of this unprecedented deep sequencing method to food samples. PMID:26894328

  7. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    Energy Technology Data Exchange (ETDEWEB)

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot and mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive "virulence detection chip" by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  8. Development of Bioinformatic and Experimental Technologies for Identification of Prokaryotic Regulatory Networks

    Energy Technology Data Exchange (ETDEWEB)

    Lawrence, Charles E; McCue, Lee Ann

    2008-07-31

    The transcription regulatory network is arguably the most important foundation of cellular function, since it exerts the most fundamental control over the abundance of virtually all of a cell’s functional macromolecules. The two major components of a prokaryotic cell’s transcription regulation network are the transcription factors (TFs) and the transcription factor binding sites (TFBS); these components are connected by the binding of TFs to their cognate TFBS under appropriate environmental conditions. Comparative genomics has proven to be a powerful bioinformatics method with which to study transcription regulation on a genome-wide level. We have further extended comparative genomics technologies that we introduced over the last several years. Specifically, we developed and applied statistical approaches to the analysis of correlated sequence data (i.e., sequences from closely related species). We also combined these technologies with functional genomic, proteomic and sequence data from multiple species, and developed computational technologies that provide inferences on the regulatory network connections, identifying the cognate transcription factor for predicted regulatory sites. Arguably the most important contribution of this work emerged in the course of the project. Specifically, the development of novel procedures of estimation and prediction in discrete high-D settings has broad implications for biology, genomics and well beyond. We showed that these procedures enjoy advantages over existing technologies in the identification of TFBS. These efforts are aimed toward identifying a cell’s complete transcription regulatory network and underlying molecular mechanisms.

  9. A bioinformatics approach to the determination of genes involved in endophytic behavior in Burkholderia spp.

    Science.gov (United States)

    Ali, Shimaila; Duan, Jin; Charles, Trevor C; Glick, Bernard R

    2014-02-21

    The vast majority of plants harbor endophytic bacteria that colonize a portion of the plant's interior tissues without harming the plant. Like plant pathogens, endophytes gain entry into their plant hosts through various mechanisms. Bacterial endophytes display a broad range of symbiotic interactions with their host plants. The molecular bases of these plant-endophyte interactions are currently not fully understood. In the present study, a set of genes possibly responsible for endophytic behavior in the genus Burkholderia was predicted and then compared and contrasted across nine endophytes from different genera by comparative genome analysis. The nine endophytes included Burkholderia phytofirmans PsJN, Burkholderia spp. strain JK006, Azospirillum lipoferum 4B, Enterobacter cloacae ENHKU01, Klebsiella pneumoniae 342, Pseudomonas putida W619, Enterobacter spp. 638, Azoarcus spp. BH72, and Serratia proteamaculans 568. From the genomes of the analyzed bacterial strains, a set of bacterial gene orthologs was identified that are predicted to be involved in determining the endophytic behavior of Burkholderia spp. The genes and their possible functions were then investigated to establish a potential connection between their presence and the role they play in bacterial endophytic behavior. Nearly all of the genes identified by this bioinformatics procedure encode functions previously suggested in other studies to be involved in endophytic behavior. PMID:24513137

  10. Shortest-path network analysis is a useful approach toward identifying genetic determinants of longevity.

    Directory of Open Access Journals (Sweden)

    J R Managbanag

    Full Text Available BACKGROUND: Identification of genes that modulate longevity is a major focus of aging-related research and an area of intense public interest. In addition to facilitating an improved understanding of the basic mechanisms of aging, such genes represent potential targets for therapeutic intervention in multiple age-associated diseases, including cancer, heart disease, diabetes, and neurodegenerative disorders. To date, however, targeted efforts at identifying longevity-associated genes have been limited by a lack of predictive power, and useful algorithms for candidate gene-identification have also been lacking. METHODOLOGY/PRINCIPAL FINDINGS: We have utilized a shortest-path network analysis to identify novel genes that modulate longevity in Saccharomyces cerevisiae. Based on a set of previously reported genes associated with increased life span, we applied a shortest-path network algorithm to a pre-existing protein-protein interaction dataset in order to construct a shortest-path longevity network. To validate this network, the replicative aging potential of 88 single-gene deletion strains corresponding to predicted components of the shortest-path longevity network was determined. Here we report that the single-gene deletion strains identified by our shortest-path longevity analysis are significantly enriched for mutations conferring either increased or decreased replicative life span, relative to a randomly selected set of 564 single-gene deletion strains or to the current data set available for the entire haploid deletion collection. Further, we report the identification of previously unknown longevity genes, several of which function in a conserved longevity pathway believed to mediate life span extension in response to dietary restriction. CONCLUSIONS/SIGNIFICANCE: This work demonstrates that shortest-path network analysis is a useful approach toward identifying genetic determinants of longevity and represents the first application of
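
    The idea of building a network from shortest paths between known longevity genes can be sketched as follows. This is a minimal illustration, assuming an unweighted, undirected interaction graph and taking as candidates every non-seed node that lies on some shortest path between two seed genes; the abstract does not specify the authors' exact algorithm, and the gene labels (g1 to g6) and edges are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) interaction pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distances(graph, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def shortest_path_candidates(graph, seeds):
    """Non-seed nodes lying on at least one shortest path between two seeds."""
    seeds = set(seeds)
    dist = {s: bfs_distances(graph, s) for s in seeds}
    candidates = set()
    for s, t in combinations(sorted(seeds), 2):
        if t not in dist[s]:
            continue  # the two seeds are not connected
        d_st = dist[s][t]
        for node, d_sn in dist[s].items():
            if node in seeds:
                continue
            # A node is on a shortest s-t path iff its distances to s and t add up to d(s, t).
            if d_sn + dist[t].get(node, float("inf")) == d_st:
                candidates.add(node)
    return candidates


# Hypothetical toy network: g1 and g4 play the role of known longevity genes.
ppi = build_graph([("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g5"), ("g5", "g6")])
predicted = shortest_path_candidates(ppi, {"g1", "g4"})
```

    Here g2 and g3 are returned because they connect the two seed genes, while g5 and g6 sit on a side branch and are excluded. Candidates found this way would still need experimental validation, as the study did with deletion strains.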

  11. Messina: a novel analysis tool to identify biologically relevant molecules in disease.

    Directory of Open Access Journals (Sweden)

    Mark Pinese

    Full Text Available BACKGROUND: Morphologically similar cancers display heterogeneous patterns of molecular aberrations and follow substantially different clinical courses. This diversity has become the basis for the definition of molecular phenotypes, with significant implications for therapy. Microarray or proteomic expression profiling is conventionally employed to identify disease-associated genes; however, traditional approaches for the analysis of profiling experiments may miss molecular aberrations which define biologically relevant subtypes. METHODOLOGY/PRINCIPAL FINDINGS: Here we present Messina, a method that can identify those genes that only sometimes show aberrant expression in cancer. We demonstrate with simulated data that Messina is highly sensitive and specific when used to identify genes which are aberrantly expressed in only a proportion of cancers, and compare Messina to contemporary analysis techniques. We illustrate Messina by using it to detect the aberrant expression of a gene that may play an important role in pancreatic cancer. CONCLUSIONS/SIGNIFICANCE: Messina allows the detection of genes with profiles typical of markers of molecular subtype, and complements existing methods to assist the identification of such markers. Messina is applicable to any global expression profiling data and, to allow its easy application, has been packaged into a freely available stand-alone software package.

  12. Design Analysis Rules to Identify Proper Noun from Bengali Sentence for Universal Networking language

    Directory of Open Access Journals (Sweden)

    Md. Syeful Islam

    2014-08-01

    Full Text Available Nowadays hundreds of millions of people, of almost all levels of education and attitudes and from different countries, communicate with each other for different purposes and perform their jobs on the internet or other communication media using various languages. Not all people know all languages; it is therefore very difficult to communicate or work across languages. To address this, computer scientists have introduced various inter-language translation programs (machine translation). UNL is one such inter-language translation program. One of the major problems in UNL is identifying a name in a sentence, which is relatively simple in English because such entities start with a capital letter. Bangla has no concept of small or capital letters, so it is difficult to determine whether a word is a proper noun or not. Here we propose analysis rules to identify proper nouns in a sentence and establish a post converter that translates the named entity from Bangla to UNL. The goal is to make Bangla sentence conversion to UNL, and vice versa, possible. Theoretical analysis with the UNL system shows that our proposed system is able to identify proper nouns in Bangla sentences and produce the corresponding Universal Words for UNL.

  13. Space-Time Analysis to Identify Areas at Risk of Mortality from Cardiovascular Disease

    Directory of Open Access Journals (Sweden)

    Poliany C. O. Rodrigues

    2015-01-01

    Full Text Available This study aimed at identifying areas that were at risk of mortality due to cardiovascular disease in residents aged 45 years or older of the cities of Cuiabá and Várzea Grande between 2009 and 2011. We conducted an ecological study of mortality rates related to cardiovascular disease. Mortality rates were calculated for each census tract by the Local Empirical Bayes estimator. High- and low-risk clusters were identified by retrospective space-time scans for each year using the Poisson probability model. We defined the year and month as the temporal analysis unit and the census tracts as the spatial analysis units adjusted by age and sex. The Mann-Whitney U test was used to compare the socioeconomic and environmental variables by risk classification. High-risk clusters showed higher income ratios than low-risk clusters, as did temperature range and atmospheric particulate matter. Low-risk clusters showed higher humidity than high-risk clusters. The Eastern region of Várzea Grande and the central region of Cuiabá were identified as areas at risk of mortality due to cardiovascular disease in individuals aged 45 years or older. High mortality risk was associated with socioeconomic and environmental factors. More high-risk clusters were observed at the end of the dry season.

  14. Sequencing and bioinformatic analysis of mRNA from Echinococcus granulosus protoscolex

    Institute of Scientific and Technical Information of China (English)

    朱明星; 王娅娜; 巨艳; 王志昇; 朱佳佳; 赵巍

    2014-01-01

    Objective: To reveal the transcriptomic information and biological characteristics of the Echinococcus granulosus protoscolex using the RNA-Seq technique and bioinformatic analysis of the sequencing data. Methods: Total RNA was isolated from Echinococcus granulosus protoscoleces using TRIzol. An mRNA kit with Oligo(dT) magnetic beads was used to isolate poly(A) mRNA, and an Illumina HiSeq 2000 was used for sequencing. The sequencing data were assembled into unigenes using a mapping-first approach based on the genome published by the Wellcome Trust Sanger Institute. Unigenes were aligned against protein databases, including the NCBI non-redundant protein (Nr) database, the UniProt database, the Gene Ontology (GO) database and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Results: A total of 132 007 609 clean reads and 91 342 unigenes were generated. The average unigene length was 419 bp. GO analysis mapped 26 552 unigenes, which fell into 48 categories across biological process, molecular function and cellular component. A total of 6 664 unigenes were mapped to 227 KEGG pathways, covering 6 first-level categories and 34 second-level categories. Conclusion: Extensive transcriptome data were obtained by sequencing the Echinococcus granulosus protoscolex. The information provides a database for further study of hydatid disease.

  15. Bioinformatics Prediction and Analysis on GH Gene of Cervus elaphus

    Institute of Scientific and Technical Information of China (English)

    宋兴超; 杨福合; 刘汇涛; 徐超; 魏海军; 邢秀梅

    2012-01-01

    In order to study the structure and function of the GH gene of wapiti, coding sequences (CDS) of the GH gene in wapiti, sika deer, chevrotain, cattle, goat, sheep, pig, human, chimpanzee, Norway rat, house mouse, arctic fox, dog, chicken and zebrafish were downloaded from GenBank as experimental materials. Bioinformatics analysis was performed on the basic information of the gene, and the structure of the encoded protein, physicochemical properties, signal peptide, transmembrane structure, generic phosphorylation sites, secondary structure and subcellular localization were predicted by means of biological software and online tools. In addition, the similarity of the GH gene CDS and amino acid sequences between wapiti and the other 14 species was analyzed, and a phylogenetic tree of the homologous genes was constructed based on the GH amino acid sequences. The results showed that the wapiti GH gene was 2 100 bp long, comprising 5 exons, 4 introns and partial 5'UTR and 3'UTR, and contained an open reading frame of 654 bp encoding 217 amino acids. The estimated molecular weight of the GH protein was 24.5884 kDa, with an isoelectric point of 7.62 and a stability index of 31.04, indicating a stable, basic, hydrophobic protein. The GH protein had two obvious strong transmembrane regions and eight phosphorylation sites. The secondary structure of the GH protein was mainly α-helix and random coil. The protein contained one signal peptide and was probably a secreted extracellular protein. The similarity comparison and phylogenetic tree indicated that the wapiti GH gene was evolutionarily closest to those of sika deer, chevrotain, cattle, goat and sheep. The research provides detailed bioinformatics information for further study of the GH gene of wapiti.

  16. The 2015 Bioinformatics Open Source Conference (BOSC 2015)

    Directory of Open Access Journals (Sweden)

    Nomi L Harris

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science"; "Standards and Interoperability"; "Open Science and Reproducibility"; "Translational Bioinformatics"; "Visualization"; and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  17. The bioinformatics of next generation sequencing: a meeting report

    Institute of Scientific and Technical Information of China (English)

    Ravi Shankar

    2011-01-01

    The Studio of Computational Biology & Bioinformatics (SCBB), IHBT, CSIR, Palampur, India organized one of the very first national workshops, funded by DBT, Govt. of India, on the bioinformatics issues associated with next generation sequencing approaches. The course structure was designed by SCBB, IHBT. The workshop took place on the IHBT premises on 17 and 18 June 2010.

  18. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  19. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  20. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…