predicted metabolizable protein: Topics by WorldWideScience.org

Sample records for predicted metabolizable protein

Meta-analysis to predict the effects of metabolizable amino acids on dairy cattle performance.

Science.gov (United States)

Lean, I J; de Ondarza, M B; Sniffen, C J; Santos, J E P; Griswold, K E

2018-01-01

Meta-analytic methods were used to determine statistical relationships between metabolizable AA supplies and milk protein yield, milk protein percentage, and milk yield in lactating dairy cows. Sixty-three research publications (258 treatment means) were identified through a search of published literature using 3 search engines and met the criteria for inclusion in this meta-analysis. The Cornell Net Carbohydrate and Protein System (CNCPS) version 6.5 was used to determine dietary nutrient parameters including metabolizable AA. Two approaches were used to analyze the data. First, mixed models were fitted to determine whether explanatory variables predicted responses. Each mixed model contained a global intercept, a random intercept for each experiment, and data were weighted by the inverse of the SEM squared. The second analysis approach used classical effect size meta-analytical evaluation of responses to treatment weighted by the inverse of the treatment variance and with a random effect of treatment nested within experiment. Regardless of the analytical approach, CNCPS-predicted metabolizable Met (g/d) was associated with milk protein percentage and yield. Milk yield was positively associated with CNCPS-predicted metabolizable His, Leu, Trp, Thr, and nonessential AA (g/d). Milk true protein yield was also associated with CNCPS-predicted metabolizable Leu (g/d). Predicted metabolizable Lysine (g/d) did not increase responses in production outcomes. However, mean metabolizable Lys supply was less than typically recommended and the change with treatment was minimal (157 vs. 162 g; 6.36 vs. 6.38% metabolizable protein). Experiments based solely on Lys or Met interventions were excluded from the study database. It is possible that the inclusion of these experiments may have provided additional insight into the effect of these AA on responses. This meta-analysis supports other research indicating a positive effect of Met and His as co-limiting AA in dairy cows and
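The weighting step mentioned in this abstract (each treatment mean weighted by the inverse of its squared SEM) can be illustrated with a minimal sketch. It computes a plain inverse-variance weighted mean, which is a simplification and not the authors' mixed model with a random intercept per experiment. The function name and the example numbers are hypothetical.

```python
import math


def inverse_variance_mean(means, sems):
    """Pool means using weights w_i = 1 / SEM_i**2 (fixed-effect simplification)."""
    weights = [1.0 / (s * s) for s in sems]
    total = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total
    pooled_se = math.sqrt(1.0 / total)  # standard error of the pooled mean
    return pooled, pooled_se


# Hypothetical treatment responses and their SEMs (illustrative values only)
pooled, pooled_se = inverse_variance_mean([10.0, 30.0], [10.0, 5.0])
print(round(pooled, 2), round(pooled_se, 2))
```

The more precise estimate (smaller SEM) pulls the pooled value toward itself, which is the practical effect of inverse-variance weighting.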
Truly Absorbed Microbial Protein Synthesis, Rumen Bypass Protein, Endogenous Protein, and Total Metabolizable Protein from Starchy and Protein-Rich Raw Materials

NARCIS (Netherlands)

Parand, Ehsan; Vakili, Alireza; Mesgaran, Mohsen Danesh; Duinkerken, Van Gert; Yu, Peiqiang

2015-01-01

This study was carried out to measure truly absorbed microbial protein synthesis, rumen bypass protein, and endogenous protein loss, as well as total metabolizable protein, from starchy and protein-rich raw feed materials with model comparisons. Predictions by the DVE2010 system as a more
Adjustment of equations to predict the metabolizable energy of corn for meat type quails

Directory of Open Access Journals (Sweden)

Tiago Junior Pasquetti

2015-08-01

Determining the metabolizable energy (ME) of foods used in quail diets through metabolism assays takes time, infrastructure and financial resources, which makes the development of prediction equations based on the proximate composition of foods to estimate ME values of particular interest. The objective of this study was to adjust prediction equations for the metabolizable energy (ME) of corn for quail. The chemical compositions of 12 maize varieties were determined and a metabolism assay was carried out to determine the apparent metabolizable energy (AME) and nitrogen-corrected apparent metabolizable energy (AMEn) of these corn varieties. The values of chemical composition, AME and AMEn, converted to a dry matter basis, were used to adjust the prediction equations. The initial adjustment of simple and multiple linear regressions of AME and AMEn was performed using the values of crude protein (CP), ether extract (EE), neutral (NDF) and acid (ADF) detergent fiber, mineral matter (MM), calcium (Ca) and phosphorus (P) as regressors (full model). The prediction equations were adjusted by simple and multiple linear regression with backward elimination. Ten prediction equations were adjusted, five for AME and five for AMEn, with R² values ranging from 0.20 to 0.75 and from 0.21 to 0.78, respectively. For all adjusted equations, negative correlations for MM were observed, which may be related to its dilutive effect on the gross energy contained in corn. In conclusion, the equations that showed the best adjustment were AME = 5605.46 - 385.074CP + 111.648EE + 48.1133NDF + 303.924ADF - 929.931MM (R² = 0.75) and AMEn = 5878.16 - 403.937CP + 81.9618EE + 41.8954NDF + 303.506ADF - 901.621MM (R² = 0.78).
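As a rough sketch, the two best-fitting equations reported in this abstract can be written as functions. The abstract does not state units, so the code assumes composition values as a percentage of dry matter and energy in kcal/kg of dry matter. The acid detergent fiber term of the AMEn equation is taken to be ADF, and the function names and sample composition are hypothetical.

```python
def predict_ame(cp, ee, ndf, adf, mm):
    """AME of corn for quail from the reported equation (R² = 0.75).

    Assumption: inputs are % of dry matter, output is kcal/kg dry matter.
    """
    return (5605.46 - 385.074 * cp + 111.648 * ee
            + 48.1133 * ndf + 303.924 * adf - 929.931 * mm)


def predict_amen(cp, ee, ndf, adf, mm):
    """AMEn of corn for quail from the reported equation (R² = 0.78)."""
    return (5878.16 - 403.937 * cp + 81.9618 * ee
            + 41.8954 * ndf + 303.506 * adf - 901.621 * mm)


# Hypothetical corn sample (illustrative values, not from the study)
sample = dict(cp=9.0, ee=4.0, ndf=12.0, adf=3.0, mm=1.3)
print(round(predict_ame(**sample), 1), round(predict_amen(**sample), 1))
```

Because both equations are linear, each extra percentage point of mineral matter lowers the predicted AME by about 930 units, which matches the dilution effect described in the abstract.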
Metabolizable protein systems in ruminant nutrition: A review

Directory of Open Access Journals (Sweden)

Lalatendu Keshary Das

2014-08-01

Protein available to ruminants is supplied by both microbial and dietary sources. Metabolizable protein (MP) is the true protein that is absorbed by the intestine and is supplied by both microbial protein and protein that escapes degradation in the rumen; it is the protein available to the animal for maintenance, growth, fetal growth during gestation, and milk production. Thus, balancing ruminant rations based only on dietary crude protein (CP) content seems erroneous. In India, ruminant rations are still balanced for digestible CP and total digestible nutrients for protein and energy requirements, respectively. Traditional feed analysis methods such as proximate analysis and detergent analysis consider feed protein as a single unit and do not take into account the degradation processes that occur in the rumen or the passage rates of feed fractions from rumen to intestine. Therefore, the protein requirement of ruminants should include not only the dietary protein source but also the microbial CP from the rumen. MP systems consider both factors and thus predict protein availability more accurately and precisely. These systems are designed to represent the extent of protein degradation in the rumen and the synthesis of microbial protein as variable functions. Feed protein fractions, i.e., rumen degradable protein and rumen undegradable protein, play vital roles in meeting the protein requirements of rumen microbes and the host animal, respectively. With the advent of sophisticated nutrition models such as the Cornell Net Carbohydrate and Protein System, National Research Council, Agricultural Research Council, Cornell Penn Miner Dairy and Amino Cow, ration formulation has moved from balancing diets for CP to balancing for MP, a concept that describes the protein requirements of ruminants at the intestinal level and the protein available to animals for useful purposes.
The evaluation of metabolizable protein content of some indigenous feedstuffs used in ruminant nutrition

Directory of Open Access Journals (Sweden)

Lalatendu Keshary Das

2014-04-01

Aim: To determine the metabolizable protein (MP) content of common indigenous feedstuffs used in ruminant nutrition using the in situ method. Materials and Methods: Nine ruminant feeds, namely maize grain (MG), groundnut cake (GNC), mustard oilcake (MOC), cottonseed cake (CSC), deoiled rice bran (DORB), wheat bran (WB), berseem fodder (BF), maize fodder (MF) and sorghum fodder (SF), were included in this study. Each test feed was dried, ground and chemically analysed for proximate principles (DM, CP, EE, OM, total ash), fiber fractions (NDF, ADF, cellulose, hemicellulose, lignin), NDICP and ADICP. Two adult fistulated bulls were used to evaluate the protein degradation characteristics of each test feed using the nylon bag method. The metabolizable energy (ME) content of the test feeds was predicted from their chemical composition data using the summative approach of the NRC (2001) model. The equations of AFRC (1992) were used to predict the rumen degradable protein (RDP), digestible microbial protein (DMP), digestible undegraded feed protein (DUP) and MP content of the test feeds. Results: The MP content of MG, GNC, MOC, CSC, DORB, WB, BF, MF and SF was found to be 95.26, 156.41, 135.21, 125.06, 101.68, 107.11, 136.81, 72.01 and 76.65 g/kg DM, respectively. The corresponding ME (MJ/kg DM) content of the test feeds was 13.66, 13.12, 13.65, 10.68, 9.08, 11.56, 9.64, 8.33 and 8.03, respectively. Among the test feeds, GNC contained the highest and MF the lowest MP per kg DM. Conclusion: It was concluded that the degradability of crude protein (CP) of the test feeds can be used in MP determination and diet formulation. Feed CP content is not available as such at the intestinal level in ruminants, as a definite part of it undergoes extensive microbial degradation in the rumen. The pattern and extent of such degradation influence the amount of protein presented to the lower digestive tract (MP) for absorption and utilization in ruminants. It was also found that the MP content of a feed is
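The MP and ME values listed in the Results of this abstract can be tabulated to check the stated ranking. The short sketch below uses only the reported numbers. The MP-per-MJ ratio it also computes is an illustrative derived quantity and is not reported in the abstract.

```python
# Reported MP (g/kg DM) and ME (MJ/kg DM) for the nine test feeds
mp = {"MG": 95.26, "GNC": 156.41, "MOC": 135.21, "CSC": 125.06, "DORB": 101.68,
      "WB": 107.11, "BF": 136.81, "MF": 72.01, "SF": 76.65}
me = {"MG": 13.66, "GNC": 13.12, "MOC": 13.65, "CSC": 10.68, "DORB": 9.08,
      "WB": 11.56, "BF": 9.64, "MF": 8.33, "SF": 8.03}

# Rank feeds from highest to lowest MP content
ranked = sorted(mp, key=mp.get, reverse=True)

# Illustrative ratio (an assumption of this sketch): g of MP per MJ of ME
mp_per_mj = {feed: mp[feed] / me[feed] for feed in mp}

print("Highest MP:", ranked[0], "Lowest MP:", ranked[-1])
for feed in ranked:
    print(f"{feed:5s} MP={mp[feed]:7.2f}  ME={me[feed]:5.2f}  MP/ME={mp_per_mj[feed]:5.2f}")
```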
Application of Artificial Neural Network and Support Vector Machines in Predicting Metabolizable Energy in Compound Feeds for Pigs.

Science.gov (United States)

Ahmadi, Hamed; Rodehutscord, Markus

2017-01-01

In the nutrition literature, there are several reports on the use of artificial neural network (ANN) and multiple linear regression (MLR) approaches for predicting feed composition and nutritive value, while the use of support vector machines (SVM) as a new alternative to MLR and ANN models has not yet been fully investigated. MLR, ANN, and SVM models were developed to predict the metabolizable energy (ME) content of compound feeds for pigs, based on the German energy evaluation system, from analyzed contents of crude protein (CP), ether extract (EE), crude fiber (CF), and starch. A total of 290 datasets from standardized digestibility studies with compound feeds were provided by several institutions and published papers, and ME was calculated from them. The accuracy and precision of the developed models were evaluated from their predicted values. The results revealed that the developed ANN [R² = 0.95; root mean square error (RMSE) = 0.19 MJ/kg of dry matter] and SVM (R² = 0.95; RMSE = 0.21 MJ/kg of dry matter) models produced better predictions of ME in compound feed than conventional MLR (R² = 0.89; RMSE = 0.27 MJ/kg of dry matter); however, there were no obvious differences between the performance of the ANN and SVM models. Thus, the SVM model may also be considered a promising tool for modeling the relationship between chemical composition and ME of compound feeds for pigs. To provide readers and nutritionists with an easy and rapid tool, an Excel® calculator, named SVM_ME_pig, was created to predict the metabolizable energy values of compound feeds for pigs using the developed support vector machine model.
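The two criteria used in this abstract to compare models, R² and RMSE, can be computed from paired observed and predicted values as in the sketch below. It assumes R² is the coefficient of determination (1 minus residual over total sum of squares); the abstract does not say whether the authors used this form or the squared correlation. The function names and data are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical ME values in MJ/kg of dry matter (illustrative only)
observed = [13.1, 14.0, 15.2, 16.0]
predicted = [13.3, 13.9, 15.0, 16.1]
print(round(r_squared(observed, predicted), 3), round(rmse(observed, predicted), 3))
```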
Metabolizable energy levels for meat quails from 15 to 35 days of age

Directory of Open Access Journals (Sweden)

Jorge Cunha Lima Muniz

ABSTRACT: This trial was carried out to evaluate the effects of dietary metabolizable energy levels on the performance and carcass traits of meat quails from 15 to 35 days old. Five hundred sixty 15-d-old meat quails were randomly assigned to five treatments (2,850; 2,950; 3,050; 3,150 and 3,250 kcal of ME/kg of diet), with eight replicates and fourteen birds per experimental unit. Feed intake, protein and lysine intake and feed conversion decreased linearly as the metabolizable energy content of diets increased (P0.05 by the treatments. Diets did not influence (P>0.05) carcass traits such as dry matter, moisture and protein content of the carcass. However, a quadratic effect (P<0.04) was observed on carcass fat content. Based on these results, the adequate metabolizable energy level to ensure better growth of meat quails is 3,250 kcal of ME/kg of diet, which corresponds to a metabolizable energy:crude protein ratio of 139.24.
The correlationship between the metabolizable energy content, chemical composition and color score in different sources of corn DDGS.

Science.gov (United States)

Jie, Yong-Z; Zhang, Jian-Y; Zhao, Li-H; Ma, Qiu-G; Ji, Cheng

2013-09-25

This study was conducted to evaluate the apparent metabolizable energy (AME) and true metabolizable energy (TME) contents of 30 sources of corn distillers dried grains with solubles (DDGS) in adult roosters, and to establish prediction equations to estimate the AME and TME values based on chemical composition and color score. Twenty-eight sources of corn DDGS came from several processing plants in 11 provinces of China, and the others were imported from the United States. The DDGS were analyzed for their metabolizable energy (ME) contents and measured for color score and chemical composition (crude protein, crude fat, ash, neutral detergent fiber, acid detergent fiber) to develop prediction equations for ME in DDGS. A precision-fed rooster assay was used, in which each DDGS sample was tube fed (50 g) to adult roosters. The experiment was conducted as a randomized incomplete block design with 3 periods. Ninety-five adult roosters were used in each period, with 90 being fed the DDGS samples and 5 being fasted to estimate basal endogenous energy losses. Results showed that the AME ranged from 5.93 to 12.19 MJ/kg and the TME ranged from 7.28 to 13.54 MJ/kg. Correlations were found between ME and ash content (-0.64, P sources energy digestibility and metabolizable energy content.
Relationship between protein molecular structural makeup and metabolizable protein supply to dairy cattle from new cool-season forage corn cultivars

Science.gov (United States)

Abeysekara, Saman; Khan, Nazir A.; Yu, Peiqiang

2018-02-01

Protein solubility, ruminal degradation and intestinal digestibility are strongly related to the protein's inherent molecular makeup. This study was designed to quantitatively evaluate protein digestion in the rumen and intestine of dairy cattle, and to estimate the content of truly metabolizable protein (MP) in newly developed cool-season forage corn cultivars. The second objective was to quantify inherent protein molecular structural characteristics using an advanced molecular spectroscopic technique (FT/IR-ATR) and correlate them with protein metabolic characteristics. Six new cool-season corn cultivars, including 3 Pioneer (PNR) and 3 Hyland (HL), coded as PNR-7443R, PNR-P7213R, PNR-7535R, HL-SR06, HL-SR22, HL-BAXXOS-RR, were evaluated in the present study. The metabolic characteristics, MP supply to dairy cattle, and energy synchronization properties were modeled by two protein evaluation models, namely, the Dutch DVE/OEB system and the NRC-2001 model. Both models estimated significant (P contents of microbial protein (MCP) synthesis and truly absorbable rumen undegraded protein (ARUP) among the cultivars. The NRC-2001 model estimated significant (P content and degraded protein balance (DPB) among the cultivars. The contents of MCP, ARUP and MP were higher (P < 0.05) for cultivar HL-SR06, resulting in the lowest (P < 0.05) DPB. However, none of the cultivars reached the optimal target hourly effective degradability ratio [25 g N g/kg organic matter (OM)], demonstrating N deficiency in the rumen. There were non-significant differences among the cultivars in molecular-spectral intensities of protein. The amide I/II ratio had a significant correlation with ARUP (r = -0.469; P < 0.001) and absorbable endogenous protein (AECPNRC) (P < 0.001; r = 0.612). Similarly, amide-II area had a weak but significant correlation (r = 0.299; P < 0.001) with RUP and ARUP, and with AECPNRC (P < 0.001; r = 0.411). Except for total digestible nutrients and AECPNRC, the amide-I area did not show significant
Metabolizable energy, nitrogen balance, and ileal digestibility of amino acids in quality protein maize for pigs

Science.gov (United States)

2014-01-01

Background To compare the nutritional value and digestibility of five quality protein maize (QPM) hybrids to that of white and yellow maize, two experiments were carried out in growing pigs. In experiment 1, the energy metabolizability and the nitrogen balance of growing pigs fed one of five QPM hybrid diets were compared against those of pigs fed white or yellow maize. In experiment 2, the apparent and standardized ileal digestibility (AID and SID, respectively) of proteins and amino acids from the five QPM hybrids were compared against those obtained from pigs fed white and yellow maize. In both experiments, the comparisons were conducted using contrasts. Results The dry matter and nitrogen intakes were higher in the pigs fed the QPM hybrids (P digestibility (P digestible lysine than normal maize. PMID:25045520
Food processing and structure impact the metabolizable energy of almonds

Science.gov (United States)

The measured metabolizable energy (ME) of whole almonds has been shown to be less than predicted by Atwater factors. However, data are lacking on the effects of processing (roasting, chopping or grinding) on the ME of almonds. A 5-period randomized, crossover study in healthy individuals (n=18) was ...
Effect of thermal processing on estimated metabolizable protein supply to dairy cattle from camelina seeds: relationship with protein molecular structural changes.

Science.gov (United States)

Peng, Quanhui; Khan, Nazir A; Wang, Zhisheng; Zhang, Xuewei; Yu, Peiqiang

2014-08-20

This study evaluated the effect of thermal processing on the estimated metabolizable protein (MP) supply to dairy cattle from camelina seeds (Camelina sativa L. Crantz) and determined the relationship between heat-induced changes in protein molecular structural characteristics and the MP supply. Seeds from two camelina varieties were sampled in two consecutive years and were either kept raw or were heated in an autoclave (moist heating) or in an air-draft oven (dry heating) at 120 °C for 1 h. The MP supply to dairy cattle was modeled by three commonly used protein evaluation systems. The protein molecular structures were analyzed by Fourier transform/infrared-attenuated total reflectance molecular spectroscopy. The results showed that both the dry and moist heating increased the contents of truly absorbable rumen-undegraded protein (ARUP) and total MP and decreased the degraded protein balance (DPB). However, the moist-heated camelina seeds had a significantly higher (P seeds. The regression equations showed that intensities of the protein molecular structural bands can be used to estimate the contents of ARUP, MP, and DPB with high accuracy (R² > 0.70). These results show that protein molecular structural characteristics can be used to rapidly assess the MP supply to dairy cattle from raw and heat-treated camelina seeds.
Dietary crude protein and metabolizable energy levels on laying quails performance

Directory of Open Access Journals (Sweden)

Almir Chalegre de Freitas

2005-06-01

The objective of this experiment was to evaluate the effect of different levels of crude protein (CP) and metabolizable energy (ME) on the performance of laying quails. A total of 672 Japanese quails (Coturnix coturnix japonica), starting at 42 days of age, were used over 168 days of production, divided into six periods of 28 days each. The birds were assigned to a completely randomized design in a 4 x 4 factorial arrangement (protein x energy), with six replicates of seven birds per experimental unit. The levels evaluated were 16, 18, 20 and 22% crude protein and 2,585, 2,685, 2,785 and 2,885 kcal of metabolizable energy/kg of feed. There was no significant effect of treatments on energy intake or egg production. However, increasing the dietary energy level linearly reduced feed intake, daily crude protein intake, egg weight and egg mass, whereas increasing the protein level linearly increased daily crude protein intake, egg mass, feed conversion and body weight gain and had a quadratic effect on egg weight, with 21.16% being the crude protein level estimated to give maximum egg weight. It can be concluded that Japanese quails regulate feed intake according to the dietary energy level. To obtain higher egg production and better feed conversion, the laying diet should contain 18% crude protein and 2,585 kcal of ME/kg. However, if the goal is to obtain heavier eggs, the dietary crude protein level should increase to 21.16%.
Productive performance and biometrics of French quail viscera, fed on different levels of metabolizable energy and crude protein - DOI: 10.4025/actascianimsci.v26i3.1810

Directory of Open Access Journals (Sweden)

Concepta McManus

2004-04-01

The aim of the present experiment was to evaluate the performance and visceral biometrics of French quails in the initial phase (0 to 14 days). A total of 3,768 one-day-old quails were fed diets with different levels of crude protein and metabolizable energy. The design was completely randomized in a 2x4 factorial, with two levels of metabolizable energy (2,900 and 3,000 kcal ME/kg), four levels of crude protein (20.5, 21.5, 22.5 and 23.5%) and three replicates of 157 quails per experimental unit. At seven days, no significant differences were found in weight gain, feed intake or feed conversion. At 14 days, however, feed intake, feed conversion and mortality were influenced by metabolizable energy. In the biometric study, pancreas weight and gizzard weight differed significantly at seven days, and at 14 days only relative liver weight was influenced by crude protein levels.
Nutritional value and metabolizable energy of wheat by-products used for feeding growing pigs

Directory of Open Access Journals (Sweden)

William Rui Wesendonck

2013-02-01

The objective of this work was to evaluate the nutritional and energy values of wheat by-products in diets for growing pigs, and to obtain prediction equations for metabolizable energy. Thirty-six castrated male pigs were housed in individual metabolic cages. Total collection of feces and urine was carried out in two periods of ten days: five days for adaptation and five days for collection. A randomized complete block design was used, with the collection period as the block, six treatments and six replicates. The reference diet was replaced at 30% by one of the tested by-products: farinheta, fine bran, wheat bran, coarse bran and ground coarse bran; the last was used to evaluate the influence of particle size on digestibility. Crude fiber was the variable that gave the best estimate of metabolizable energy. Fine bran was higher in digestible and metabolizable energy than ground coarse bran. Ground coarse bran had the lowest digestibility coefficients, and reducing its geometric mean diameter did not increase the digestibility of nutrients or energy. Among the by-products evaluated, farinheta had the highest digestible energy, metabolizable energy and digestible protein, which indicates high potential for use in diets for growing pigs.
Evaluating models to predict the metabolizable energy of maize for swine

Directory of Open Access Journals (Sweden)

R.N. Pelizzeri

2013-04-01

The objective of this study was to validate prediction models for metabolizable energy (ME) by first-degree linear regression of the estimated ME values on the observed ME values of maize obtained in biological assays with swine. Seventy-four records of the chemical and energetic composition of maize were obtained from the literature and used to estimate ME with 41 prediction models based on chemical composition. The significance of the regression parameters (β0 and β1) was evaluated by the partial t test, and a model's prediction was considered validated when the joint null hypothesis β0 = 0 and β1 = 1 was accepted. The models ME7 = 1.099 + 0.740GE - 5.5MM - 3.7NDF; ME9 = 16.13 - 9.5NDF + 16EE + 23CP*NDF - 138MM*NDF; and ME13 = 5.42 - 17.2NDF - 19.4MM + 0.709GE are the most adequate to estimate the metabolizable energy of maize and can be used as a tool to formulate diets for swine.
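The validation criterion in this abstract (regress estimated ME on observed ME, then test the joint null hypothesis β0 = 0 and β1 = 1) can be sketched as follows. The code fits the first-degree regression by ordinary least squares and returns an F statistic that compares it with the identity line. It is a minimal illustration, not the authors' procedure, and the function name and data are hypothetical. The statistic would be compared with an F distribution with 2 and n - 2 degrees of freedom.

```python
def joint_f_statistic(observed, predicted):
    """Fit predicted = b0 + b1 * observed and test H0: b0 = 0 and b1 = 1."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Residual sum of squares of the fitted line (unrestricted model)
    sse_full = sum((y - b0 - b1 * x) ** 2 for x, y in zip(observed, predicted))
    # Residual sum of squares under H0, i.e. the identity line predicted = observed
    sse_null = sum((y - x) ** 2 for x, y in zip(observed, predicted))
    f_stat = ((sse_null - sse_full) / 2) / (sse_full / (n - 2))
    return b0, b1, f_stat


# Hypothetical observed ME values and two sets of model estimates
observed = [3.0, 3.2, 3.4, 3.6, 3.8, 4.0]
noise = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01]
unbiased = [x + e for x, e in zip(observed, noise)]
biased = [x + 0.5 + e for x, e in zip(observed, noise)]  # constant overestimate

print(joint_f_statistic(observed, unbiased))  # small F: H0 is not rejected
print(joint_f_statistic(observed, biased))    # large F: H0 is rejected
```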
Developing a computer-controlled simulated digestion system to predict the concentration of metabolizable energy of feedstuffs for rooster.

Science.gov (United States)

Zhao, F; Ren, L Q; Mi, B M; Tan, H Z; Zhao, J T; Li, H; Zhang, H F; Zhang, Z Y

2014-04-01

Four experiments were conducted to evaluate the effectiveness of a computer-controlled simulated digestion system (CCSDS) for predicting apparent metabolizable energy (AME) and true metabolizable energy (TME) using in vitro digestible energy (IVDE) content of feeds for roosters. In Exp. 1, the repeatability of the IVDE assay was tested in corn, wheat, rapeseed meal, and cottonseed meal with 3 assays of each sample and each with 5 replicates of the same sample. In Exp. 2, the additivity of IVDE concentration in corn, soybean meal, and cottonseed meal was tested by comparing determined IVDE values of the complete diet with values predicted from measurements on individual ingredients. In Exp. 3, linear models to predict AME and TME based on IVDE were developed with 16 calibration samples. In Exp. 4, the accuracy of prediction models was tested by the differences between predicted and determined values for AME or TME of 6 ingredients and 4 diets. In Exp. 1, the mean CV of IVDE was 0.88% (range = 0.20 to 2.14%) for corn, wheat, rapeseed meal, and cottonseed meal. No difference in IVDE was observed between 3 assays of an ingredient, indicating that the IVDE assay is repeatable under these conditions. In Exp. 2, minimal differences (<21 kcal/kg) were observed between determined and calculated IVDE of 3 complete diets formulated with corn, soybean meal, and cottonseed meal, demonstrating that the IVDE values are additive in a complete diet. In Exp. 3, linear relationships between AME and IVDE and between TME and IVDE were observed in 16 calibration samples: AME = 1.062 × IVDE - 530 (R² = 0.97, residual standard deviation [RSD] = 146 kcal/kg, P < 0.001) and TME = 1.050 × IVDE - 16 (R² = 0.97, RSD = 148 kcal/kg, P < 0.001). Differences of less than 100 kcal/kg were observed between determined and predicted values in 10 and 9 of the 16 calibration samples for AME and TME, respectively. In Exp. 4, differences of less than 100 kcal/kg between determined and predicted
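The two calibration equations from Exp. 3 can be applied directly, as in this small sketch. Coefficients are taken from the abstract, with all values in kcal/kg. The function names and the example IVDE value are hypothetical.

```python
def ame_from_ivde(ivde):
    """AME (kcal/kg) from in vitro digestible energy: AME = 1.062 * IVDE - 530."""
    return 1.062 * ivde - 530.0


def tme_from_ivde(ivde):
    """TME (kcal/kg) from in vitro digestible energy: TME = 1.050 * IVDE - 16."""
    return 1.050 * ivde - 16.0


# Hypothetical IVDE measurement for a feed sample (kcal/kg)
ivde = 3500.0
print(round(ame_from_ivde(ivde)), round(tme_from_ivde(ivde)))
```

With reported residual standard deviations of 146 and 148 kcal/kg, a single prediction should be read as an estimate with an uncertainty of that order rather than as an exact value.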
Net protein and metabolizable protein requirements for maintenance and growth of early-weaned Dorper crossbred male lambs

Institute of Scientific and Technical Information of China (English)

Tao Ma; Kaidong Deng; Yan Tu; Naifeng Zhang; Bingwen Si; Guishan Xu; Qiyu Diao

2017-01-01

Background: Dorper is an important breed for meat purpose and widely used in the livestock industry of the world. However, the protein requirement of Dorper crossbred has not been investigated. The current paper reports the net protein (NP) and metabolizable protein (MP) requirements of Dorper crossbred ram lambs from 20 to 35 kg BW. Methods: Thirty-five Dorper × thin-tailed Han crossbred lambs weaned at approximately 50 d of age (20.3 ± 2.15 kg of BW) were used. Seven lambs of 25 kg BW were slaughtered as the baseline animals at the start of the trial. An intermediate group of seven randomly selected lambs fed ad libitum was slaughtered at 28.6 kg BW. The remaining 21 lambs were randomly divided into three levels of dry matter intake: ad libitum or 70% or 40% of ad libitum intake. Those lambs were slaughtered when the lambs fed ad libitum reached 35 kg BW. Total body N and N retention were measured. Results: The daily NP and MP requirements for maintenance were 1.89 and 4.52 g/kg metabolic shrunk BW (SBW^0.75). The partial efficiency of MP utilization for maintenance was 0.42. The NP requirement for growth ranged from 12.1 to 43.5 g/d, for the lambs gaining 100 to 350 g/d, and the partial efficiency of MP utilization for growth was 0.86. Conclusions: The NP and MP requirements for the maintenance and growth of Dorper crossbred male lambs were lower than the recommendations of American and British nutritional systems.
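The coefficients reported in this abstract can be combined in a short factorial calculation, sketched below. It assumes the usual factorial relationship in which the MP needed for growth equals the NP for growth divided by the partial efficiency (0.86). That step is an assumption of this sketch and is not spelled out in the abstract. Shrunk body weight (SBW) is taken as an input because the abstract gives no BW-to-SBW conversion, and the function names are hypothetical.

```python
NP_MAINTENANCE = 1.89   # g of NP per kg SBW^0.75 per day (reported)
MP_MAINTENANCE = 4.52   # g of MP per kg SBW^0.75 per day (reported)
K_GROWTH = 0.86         # partial efficiency of MP use for growth (reported)


def maintenance_requirements(sbw_kg):
    """Return (NP, MP) maintenance requirements in g/d for a given shrunk BW."""
    metabolic_size = sbw_kg ** 0.75
    return NP_MAINTENANCE * metabolic_size, MP_MAINTENANCE * metabolic_size


def mp_for_growth(np_growth_g_per_day):
    """MP for growth (g/d), assuming MP_growth = NP_growth / K_GROWTH."""
    return np_growth_g_per_day / K_GROWTH


# The ratio of the two maintenance coefficients reproduces the reported 0.42
print(round(NP_MAINTENANCE / MP_MAINTENANCE, 2))

# Reported NP for growth at 100 g/d of gain is 12.1 g/d
print(round(mp_for_growth(12.1), 1))
```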
Food processing and structure impact the metabolizable energy of almonds

OpenAIRE

Gebauer, SK; Novotny, JA; Bornhorst, GM; Baer, DJ

2016-01-01

© 2016 The Royal Society of Chemistry. The measured metabolizable energy (ME) of whole almonds has been shown to be less than predicted by Atwater factors. However, data are lacking on the effects of processing (roasting, chopping or grinding) on the ME of almonds. A 5-period randomized, crossover study in healthy individuals (n = 18) was conducted to measure the ME of different forms of almonds (42 g per day), as part of a controlled diet: whole, natural almonds; whole, roasted almonds; chop...
Protein docking prediction using predicted protein-protein interface

Directory of Open Access Journals (Sweden)

Li Bin

2012-01-01

Background: Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results: We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pair wise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion: We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

Protein docking prediction using predicted protein-protein interface.

Science.gov (United States)

Li, Bin; Kihara, Daisuke

2012-01-10

Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pair wise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Digestibility Is Similar between Commercial Diets That Provide Ingredients with Different Perceived Glycemic Responses and the Inaccuracy of Using the Modified Atwater Calculation to Calculate Metabolizable Energy

Science.gov (United States)

Asaro, Natalie J.; Guevara, Marcial A.; Berendt, Kimberley; Zijlstra, Ruurd; Shoveller, Anna K.

2017-01-01

Dietary starch is required for a dry, extruded kibble; the most common diet type for domesticated felines in North America. However, the amount and source of dietary starch may affect digestibility and metabolism of other macronutrients. The objectives of this study were to evaluate the effects of 3 commercial cat diets on in vivo and in vitro energy and macronutrient digestibility, and to analyze the accuracy of the modified Atwater equation. Dietary treatments differed in their perceived glycemic response (PGR) based on ingredient composition and carbohydrate content (34.1, 29.5, and 23.6% nitrogen-free extract for High, Medium, and LowPGR, respectively). A replicated 3 × 3 Latin square design was used, with 3 diets and 3 periods. In vivo apparent protein, fat, and organic matter digestibility differed among diets, while apparent dry matter digestibility did not. Cats were able to efficiently digest and absorb macronutrients from all diets. Furthermore, the modified Atwater equation underestimated measured metabolizable energy by approximately 12%. Thus, the modified Atwater equation does not accurately determine the metabolizable energy of high quality feline diets. Further research should focus on understanding carbohydrate metabolism in cats, and establishing an equation that accurately predicts the metabolizable energy of feline diets. PMID:29117110
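For readers unfamiliar with the calculation being evaluated, the sketch below shows a modified Atwater estimate and the kind of percentage comparison the abstract reports. The factors (3.5, 8.5 and 3.5 kcal/g for protein, fat and nitrogen-free extract) are the commonly cited modified Atwater values and are an assumption here, since the abstract does not list them; the function names are hypothetical.

```python
# Modified Atwater factors in kcal per g. These commonly cited values are
# an assumption; the abstract above does not state them.
KCAL_PER_G = {"protein": 3.5, "fat": 8.5, "nfe": 3.5}


def modified_atwater_me(protein_pct: float, fat_pct: float, nfe_pct: float) -> float:
    """Predicted ME in kcal per kg of diet from percent composition."""
    kcal_per_100_g = (KCAL_PER_G["protein"] * protein_pct
                      + KCAL_PER_G["fat"] * fat_pct
                      + KCAL_PER_G["nfe"] * nfe_pct)
    return kcal_per_100_g * 10.0  # convert kcal/100 g to kcal/kg


def percent_underestimate(predicted: float, measured: float) -> float:
    """Positive when the prediction falls below the measured value."""
    return 100.0 * (measured - predicted) / measured
```

Passing a diet's predicted and measured ME to `percent_underestimate` gives a figure directly comparable to the roughly 12% gap reported above.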
Digestibility Is Similar between Commercial Diets That Provide Ingredients with Different Perceived Glycemic Responses and the Inaccuracy of Using the Modified Atwater Calculation to Calculate Metabolizable Energy

Directory of Open Access Journals (Sweden)

Natalie J. Asaro

2017-11-01

Dietary starch is required for a dry, extruded kibble; the most common diet type for domesticated felines in North America. However, the amount and source of dietary starch may affect digestibility and metabolism of other macronutrients. The objectives of this study were to evaluate the effects of 3 commercial cat diets on in vivo and in vitro energy and macronutrient digestibility, and to analyze the accuracy of the modified Atwater equation. Dietary treatments differed in their perceived glycemic response (PGR) based on ingredient composition and carbohydrate content (34.1, 29.5, and 23.6% nitrogen-free extract for High, Medium, and LowPGR, respectively). A replicated 3 × 3 Latin square design was used, with 3 diets and 3 periods. In vivo apparent protein, fat, and organic matter digestibility differed among diets, while apparent dry matter digestibility did not. Cats were able to efficiently digest and absorb macronutrients from all diets. Furthermore, the modified Atwater equation underestimated measured metabolizable energy by approximately 12%. Thus, the modified Atwater equation does not accurately determine the metabolizable energy of high quality feline diets. Further research should focus on understanding carbohydrate metabolism in cats, and establishing an equation that accurately predicts the metabolizable energy of feline diets.
Efficiency of metabolizable energy utilization for maintenance and gain and evaluation of Small Ruminant Nutrition System model in Santa Ines sheep

Directory of Open Access Journals (Sweden)

José Gilson Louzada Regadas Filho

2011-11-01

This study was carried out to estimate the efficiencies of utilization of metabolizable energy for maintenance (k_m) and weight gain (k_g) and to evaluate the Small Ruminant Nutrition System (SRNS) model in predicting dry matter intake and average daily gain of growing Santa Ines sheep. Twenty-four non-castrated Santa Ines sheep, at 50 days of age and with an average body weight of 13.00 ± 0.56 kg, were used. After a 10-day adaptation period, four animals were slaughtered to be used as reference for estimating initial empty body weight and body composition of the other animals. The remaining animals were distributed in a randomized block design, with the treatments consisting of diets containing different levels of metabolizable energy (2.08, 2.28, 2.47 and 2.69 Mcal/kg of DM), with five replicates. The metabolizable energy use efficiencies for maintenance and for weight gain were calculated from the relationship between the dietary net energy for maintenance and gain and the ME concentration in the diets. Evaluation of the SRNS model was performed by fitting a simple linear regression model between the predicted (independent variable) and observed (dependent variable) values. The estimated energy use efficiency for maintenance (k_m) was 0.70, and that for weight gain (k_g) was inversely proportional to the increase of metabolizable energy concentration in the diet. The dry matter intake predicted by the SRNS model did not statistically differ from that observed, but the model overestimated the average daily gain by 5.18%. Those results can contribute to the construction of a database, which could be condensed into several others in a predictive model of performance and feed planning for sheep reared in Brazil.
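The model evaluation described in this abstract, regressing observed values on predicted values, can be sketched as follows. This is a minimal ordinary least squares illustration with hypothetical function names; it is not the authors' procedure, and any data passed to it would have to come from the study itself.

```python
from typing import Sequence, Tuple


def fit_observed_on_predicted(predicted: Sequence[float],
                              observed: Sequence[float]) -> Tuple[float, float]:
    """Least squares fit of observed = intercept + slope * predicted."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_bias_percent(predicted: Sequence[float],
                      observed: Sequence[float]) -> float:
    """Positive when the model overestimates the observed mean."""
    mean_pred = sum(predicted) / len(predicted)
    mean_obs = sum(observed) / len(observed)
    return 100.0 * (mean_pred - mean_obs) / mean_obs
```

A model without systematic bias gives an intercept near 0 and a slope near 1, while `mean_bias_percent` expresses an overestimate in the same form as the 5.18% figure reported for average daily gain.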
Prediction of Digestible and Metabolizable Energy Content of Rice Bran Fed to Growing Pigs

Directory of Open Access Journals (Sweden)

C. X. Shi

2015-05-01

Two experiments were conducted to determine the digestible energy (DE) and metabolizable energy (ME) content of 19 rice bran samples and to develop prediction equations for DE and ME based on their chemical composition. The 19 rice bran samples came from different rice varieties, processing methods and regions. The basal diet was formulated using corn and soybean meal (74.43% corn, 22.91% soybean meal and 2.66% vitamins and minerals). The 19 experimental diets were based on a mixture of corn, soybean meal and 29.2% of each source of rice bran, respectively. In Exp. 1, 108 growing barrows (32.1 ± 4.2 kg) were allotted to 1 of 18 treatments according to a completely randomized design with 6 pigs per treatment. Treatment 1 was the control group, which was fed the basal diet. Treatments 2 to 18 were fed the experimental diets. In Exp. 2, two additional rice bran samples were measured to verify the prediction equations developed in Exp. 1. A control diet and two rice bran diets were fed to 18 growing barrows (34.6 ± 3.5 kg). The control and experimental diet formulations were the same as the diets in Exp. 1. The results showed that the DE ranged from 14.48 to 16.85 (mean 15.84) MJ/kg of dry matter while the ME ranged from 12.49 to 15.84 (mean 14.31) MJ/kg of dry matter. The predicted values of DE and ME of the two additional samples in Exp. 2 were very close to the measured values.
Níveis de energia metabolizável para frangos de corte de 1 a 21 dias de idade mantidos em ambiente de alta temperatura Metabolizable energy levels for broiler chicks from 1 to 21 days of age under high environmental temperature

Directory of Open Access Journals (Sweden)

Rita Flávia Miranda de Oliveira

2000-06-01

, metabolizable energy intake, and protein and fat depositions in the carcass increased while the feed:gain ratio of the chicks linearly decreased with the treatments. The carcass yield of the birds was not influenced by the dietary ME levels. The dietary ME levels affected the carcass composition and increased the abdominal fat weight of the broiler chicks. Broiler chicks from 1 to 21 days of age, kept under high environmental temperature, require an energy:protein ratio of at least 13.6:1 for better performance and protein deposition in the carcass.
Eficiência de utilização da energia metabolizável para ganho de peso e exigências de energia metabolizável e nutrientes digestíveis totais de bovinos F1 Simental x Nelore Efficiency of metabolizable energy utilization for weight gain and requirements of metabolizable energy and total digestible nutrients in F1 Simental x Nellore bulls

Directory of Open Access Journals (Sweden)

Marcelo de Andrade Ferreira

1999-04-01

The objective of this work was to estimate the efficiency of metabolizable energy (ME) utilization for weight gain, and the requirements of metabolizable energy and total digestible nutrients, in non-castrated F1 Simental x Nellore bulls fed diets containing different concentrate levels. Twenty-nine animals averaging 17 months of age and 354 kg of initial live weight were used. Five animals were slaughtered at the beginning of the experiment as a reference, and the remainder were fed ad libitum and allotted to treatments in a completely randomized design according to the dietary concentrate level: 25, 37.5, 50, 62.5, and 75%. The animals were slaughtered when they reached 500 kg. The net energy concentrations of the diets were calculated, and the efficiencies of ME utilization for weight gain were estimated by regression of net energy for gain on the ME of the diets. The ME requirements for one kilogram of empty body weight gain increased as the animals' body weight increased and, at the same live weight, decreased as the dietary concentrate level increased. The estimated efficiencies of ME utilization for weight gain were 0.27, 0.26, 0.36, 0.39, and 0.42. The concentrate level improved the efficiency of ME utilization for weight gain.
Improved prediction of meat and bone meal metabolizable energy content for ducks through in vitro methods.

Science.gov (United States)

Garcia, R A; Phillips, J G; Adeola, O

2012-08-01

Apparent metabolizable energy (AME) of meat and bone meal (MBM) for poultry is highly variable, but impractical to measure routinely. Previous efforts at developing an in vitro method for predicting AME have had limited success. The present study uses data from a previous publication on the AME of 12 MBM samples, determined using 288 White Pekin ducks, as well as composition data on these samples. Here, we investigate the hypothesis that 2 noncompositional attributes of MBM, particle size and protease resistance, will have utility in improving predictions of AME based on in vitro measurements. Using the same MBM samples as the previous study, 2 measurements of particle size were recorded and protease resistance was determined using a modified pepsin digestibility assay. Analysis of the results using a stepwise construction of multiple linear regression models revealed that the measurements of particle size were useful in building models for AME, but the measure of protease resistance was not. Relatively simple (4-term) and complex (7-term) models for both AME and nitrogen-corrected AME were constructed, with R-squared values ranging from 0.959 to 0.996. The rather minor analytical effort required to conduct the measurements involved is discussed. Although the generality of the results are limited by the number of samples involved and the species used, they suggest that AME for poultry can be accurately predicted through simple and inexpensive in vitro methods.
Nutrient Digestibility and Metabolizable Energy Content of Mucuna pruriens Whole Pods Fed to Growing Pelibuey Lambs.

Science.gov (United States)

Loyra-Tzab, Enrique; Sarmiento-Franco, Luis Armando; Sandoval-Castro, Carlos Alfredo; Santos-Ricalde, Ronald Herve

2013-07-01

The nutrient digestibility, nitrogen balance and in vivo metabolizable energy supply of Mucuna pruriens whole pods fed to growing Pelibuey lambs were investigated. Eight Pelibuey sheep housed in metabolic crates were fed increasing levels of Mucuna pruriens pods: 0 (control), 100 (Mucuna100), 200 (Mucuna200) and 300 (Mucuna300) g/kg dry matter. A quadratic (pMucuna100 and Mucuna200 treatments. Increasing M. pruriens in the diets had no effect (p>0.05) on DM and GE apparent digestibility (pmucuna pod level. This effect was accompanied by a quadratic effect (pMucuna100 and Mucuna200 treatments. Urine-N excretion, GE retention and dietary estimated nutrient supply (metabolizable protein and metabolizable energy) were not affected (p>0.05). The DM, N and GE apparent digestibility coefficients of M. pruriens whole pods obtained through multiple regression equations were 0.692, 0.457 and 0.654, respectively. The in vivo DE and ME contents of mucuna whole pod were estimated at 11.0 and 9.7 MJ/kg DM. It was concluded that whole pods from M. pruriens did not affect nutrient utilization when included in a mixed diet up to 200 g/kg DM. This is the first in vivo estimation of the mucuna whole pod ME value for ruminants.
Digestibility, Determination of Metabolizable Energy and Bone Mineralization of Broilers Fed with Nutritionally Valued Phytase

Directory of Open Access Journals (Sweden)

FH Litz

The objective of this study was to evaluate the effect of using exogenous phytase in broiler diets on nutrient digestibility, feed energy and tibia bone mineralization. A completely randomized design was used, with the following treatments: sorghum with dicalcium phosphate (SDP), corn with dicalcium phosphate (CDP), sorghum with meat and bone meal (SMBM), sorghum with nutritionally valued phytase (SVP) and sorghum with phytase without nutritional valuation (SPWV). For the digestibility analysis, eighty 15-day-old broilers, from a total of 1400 male Hubbard Flex chickens, were submitted to total excreta collection to obtain the digestibility percentages of feed, crude protein, ether extract, apparent metabolizable energy, calcium and phosphorus. For tibia mineralization, six birds per treatment were used, in which mineral matter, calcium and phosphorus were determined. Metabolizable energy (ME) and apparent metabolizable energy corrected for nitrogen (AMEn) of the feed were also calculated. Data were subjected to analysis of variance and the means were compared by Tukey's test at the 5% level. There was no difference between treatments for digestibility at 15-20 days of age or for the feed energy values, but the diets with phytase had higher phosphorus percentages in tibia bone mineralization, demonstrating that the exogenous phytase enzyme is able to hydrolyze phytate originating from plants and release the phosphorus for assimilation by the animals, acting as a substitute for phosphorus sources.
Relationship between ruminal ammonia and non-protein nitrogen utilization by ruminants

International Nuclear Information System (INIS)

Satter, L.D.; Roffler, R.E.

1976-01-01

Non-protein nitrogen (NPN) may be utilized as well as plant protein when ruminal ammonia nitrogen concentration is low ( 3 -N at 5 mg/100 ml will provide considerably less metabolizable protein, and the amount of metabolizable protein will be directly proportional to the amount of protein that escapes degradation. A simplified scheme for estimating metabolizable protein is presented. It has the flexibility needed for accommodating different feedstuffs, yet is easy to apply. The proposed scheme is based upon ruminal ammonia concentration, which in turn reflects protein intake, ration fermentability and protein degradation, the major determinants of protein supply to the lower intestine. It has the potential of more accurately describing the nutritional value of dietary crude protein, particularly if both protein and NPN are in the diet. (author)
Alternative prediction methods of protein and energy evaluation of pig feeds.

Science.gov (United States)

Święch, Ewa

2017-01-01

Precise knowledge of the actual nutritional value of individual feedstuffs and complete diets for pigs is important for efficient livestock production. Methods of assessment of protein and energy values in pig feeds have been briefly described. In vivo determination of protein and energy values of feeds in pigs are time-consuming, expensive and very often require the use of surgically-modified animals. There is a need for more simple, rapid, inexpensive and reproducible methods for routine feed evaluation. Protein and energy values of pig feeds can be estimated using the following alternative methods: 1) prediction equations based on chemical composition; 2) animal models as rats, cockerels and growing pigs for adult animals; 3) rapid methods, such as the mobile nylon bag technique and in vitro methods. Alternative methods developed for predicting the total tract and ileal digestibility of nutrients including amino acids in feedstuffs and diets for pigs have been reviewed. This article focuses on two in vitro methods that can be used for the routine evaluation of amino acid ileal digestibility and energy value of pig feeds and on factors affecting digestibility determined in vivo in pigs and by alternative methods. Validation of alternative methods has been carried out by comparing the results obtained using these methods with those acquired in vivo in pigs. In conclusion, energy and protein values of pig feeds may be estimated with satisfactory precision in rats and by the two- or three-step in vitro methods providing equations for the calculation of standardized ileal digestibility of amino acids and metabolizable energy content. The use of alternative methods of feed evaluation is an important way for reduction of stressful animal experiments.
Evaluation of the effects of rations with different levels of metabolizable energy on performance of laying hens

Directory of Open Access Journals (Sweden)

2008-05-01

This study was conducted to evaluate the effects of rations with different levels of metabolizable energy on performance of laying hens. The experiment was conducted on 25 laying hens of the commercial high line W-36 strain with 4 treatments and 4 replicates (16 laying hens in each replicate) in a completely randomized design. Treatments included: (1) diet with the amount of metabolizable energy recommended by NRC in 1994 (as control group), (2) diet with a 10% higher level of metabolizable energy than that recommended by NRC in 1994, (3) diet with a 10% lower level of metabolizable energy than that recommended by NRC in 1994 and (4) diet with a 15% lower level of metabolizable energy than that recommended by NRC in 1994, which were fed for 10 weeks (from the age of 41 to 51 weeks) to the laying hens. The results demonstrated that the amount of feed intake was significantly different among treatments (p
Genetic background in partitioning of metabolizable energy efficiency in dairy cows.

Science.gov (United States)

Mehtiö, T; Negussie, E; Mäntysaari, P; Mäntysaari, E A; Lidauer, M H

2018-05-01

The main objective of this study was to assess the genetic differences in metabolizable energy efficiency and efficiency in partitioning metabolizable energy in different pathways: maintenance, milk production, and growth in primiparous dairy cows. Repeatability models for residual energy intake (REI) and metabolizable energy intake (MEI) were compared and the genetic and permanent environmental variations in MEI were partitioned into its energy sinks using random regression models. We proposed 2 new feed efficiency traits: metabolizable energy efficiency (MEE), which is formed by modeling MEI fitting regressions on energy sinks [metabolic body weight (BW^0.75), energy-corrected milk, body weight gain, and body weight loss] directly; and partial MEE (pMEE), where the model for MEE is extended with regressions on energy sinks nested within additive genetic and permanent environmental effects. The data used were collected from Luke's experimental farms Rehtijärvi and Minkiö between 1998 and 2014. There were altogether 12,350 weekly MEI records on 495 primiparous Nordic Red dairy cows from wk 2 to 40 of lactation. Heritability estimates for REI and MEE were moderate, 0.33 and 0.26, respectively. The estimate of the residual variance was smaller for MEE than for REI, indicating that analyzing weekly MEI observations simultaneously with energy sinks is preferable. Model validation based on Akaike's information criterion showed that pMEE models fitted the data even better and also resulted in smaller residual variance estimates. However, models that included random regression on BW^0.75 converged slowly. The resulting genetic standard deviation estimate from the pMEE coefficient for milk production was 0.75 MJ of MEI/kg of energy-corrected milk. The derived partial heritabilities for energy efficiency in maintenance, milk production, and growth were 0.02, 0.06, and 0.04, respectively, indicating that some genetic variation may exist in the efficiency of using
Proper expression of metabolizable energy in avian energetics

Science.gov (United States)

Miller, M.R.; Reinecke, K.J.

1984-01-01

We review metabolizable energy (ME) concepts and present evidence suggesting that the form of ME used for analyses of avian energetics can affect interpretation of results. Apparent ME (AME) is the most widely used measure of food energy available to birds. True ME (TME) differs from AME in recognizing fecal and urinary energy of nonfood origin as metabolized energy. Only AME values obtained from test birds fed at maintenance levels should be used for energy analyses. A practical assay for TME has shown that TME estimates are less sensitive than AME to variation in food intake. The TME assay may be particularly useful in studies of natural foods that are difficult to obtain in quantities large enough to supply test birds with maintenance requirements. Energy budgets calculated from existence metabolism should be expressed as kJ of AME and converted to food requirements with estimates of metabolizability given in kJ AME/g.
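The distinction between AME and TME drawn in this abstract can be illustrated with a short calculation. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it assumes that the nonfood (endogenous) excreta energy is known separately, for example from fasted birds, which the abstract does not specify.

```python
def ame_kj_per_g(food_g: float, gross_energy_kj_per_g: float,
                 excreta_energy_kj: float) -> float:
    """Apparent ME: all excreta energy is charged against the food."""
    intake_kj = food_g * gross_energy_kj_per_g
    return (intake_kj - excreta_energy_kj) / food_g


def tme_kj_per_g(food_g: float, gross_energy_kj_per_g: float,
                 excreta_energy_kj: float, endogenous_kj: float) -> float:
    """True ME: excreta energy of nonfood origin is not charged to the food."""
    intake_kj = food_g * gross_energy_kj_per_g
    food_origin_excreta_kj = excreta_energy_kj - endogenous_kj
    return (intake_kj - food_origin_excreta_kj) / food_g
```

Under these definitions TME exceeds AME by the endogenous energy divided by food intake, so the gap widens as intake falls. That is consistent with the abstract's point that AME is the more intake-sensitive measure.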
Metabolizable energy levels for semi-heavy laying hens at the second production cycle Níveis de energia metabolizável para poedeiras semipesadas no segundo ciclo de produção

Directory of Open Access Journals (Sweden)

Fernando Guilherme Perazzo Costa

2009-05-01

This study was carried out to evaluate the energy levels in the diet to obtain better performance rates and quality of eggs from laying hens in the second production cycle. One hundred and eighty Bovans Goldline laying hens at 62 weeks of age were used during four 28-day periods. A completely randomized experimental design was used with five metabolizable energy levels (2,650, 2,725, 2,800, 2,875 and 2,950 kcal/kg), each with six replicates of six birds. The energy level of the diet did not affect the weight of the egg, yolk, albumen and eggshell, the percentages of yolk, albumen and eggshell, yolk color and egg specific gravity. Feed intake, egg production, egg mass and feed conversion per egg mass and per dozen eggs increased significantly with increasing levels of metabolizable energy. Feed intake decreased linearly as the energy level in the diet increased. The metabolizable energy levels showed a quadratic effect on egg production, egg mass and feed conversion per egg mass and per dozen eggs. The metabolizable energy level of 2,830 kcal/kg was the most appropriate to promote better performance and quality of eggs from laying hens in the second production cycle.
The influence of Aspergillus niger inoculum dosage on nutritive value and metabolizable energy of apu-apu meal (Pistia stratiotes L.) on broiler chicken

Science.gov (United States)

Gloria, J.; Tafsin, M.; Hanafi, N. D.; Daulay, A. H.

2018-02-01

Apu-apu grows in tropical and subtropical freshwater waterways. The utilization of apu-apu meal as feed is still limited. The obstacle to using apu-apu meal as a feed ingredient is its high crude fiber content, which requires treatment to reduce. This study aimed to determine the influence of Aspergillus niger inoculum dosage applied to apu-apu meal (Pistia stratiotes L.) on metabolizable energy in broiler chickens. The research used a completely randomized design (CRD). The treatments consisted of Aspergillus niger inoculum dosages: P0 (0), P1 (10^4 CFU/g), P2 (10^6 CFU/g), and P3 (10^8 CFU/g). The variables observed were apparent metabolizable energy (AME), true metabolizable energy (TME), nitrogen-corrected apparent metabolizable energy (AMEn) and nitrogen-corrected true metabolizable energy (TMEn). The results showed that the dosage of Aspergillus niger increased the nutritive value of apu-apu meal. The dosage of Aspergillus niger also influenced (P<0.05) the metabolizable energy of apu-apu meal. The dosage of 10^8 CFU/g gave significantly higher metabolizable energy than the other treatments. The conclusion of this research is that Aspergillus niger at a dosage of 10^8 CFU/g increased the nutritive value and metabolizable energy of apu-apu meal.
Parâmetros ruminais e síntese de proteína metabolizável em bovinos de corte sob suplementação com proteinados contendo diversos níveis de proteína bruta Ruminal fermentation characteristics and protein fraction effects on metabolizable protein synthesis of beef cattle fed different levels of crude protein

Directory of Open Access Journals (Sweden)

Luiz Orcirio Fialho de Oliveira

2009-12-01

The effects of the nitrogen level of protein supplements on the concentrations of ammonia nitrogen (N-NH3) and volatile fatty acids (VFA) and on pH were evaluated in beef cattle grazing marandu grass (Brachiaria brizantha cv. Marandu). Microbial synthesis, the supply of rumen-undegradable protein (RUP) and endogenous protein, and their contributions to the metabolizable protein (MP) pool were estimated. Four rumen-fistulated Nellore cattle weighing 395 ± 9 kg were used to measure ruminal parameters and to evaluate degradability, ruminal kinetics and estimates of microbial synthesis in a 4 × 4 Latin square design. Supplements with 30, 40 or 50% crude protein (CP) were supplied at 400 g/animal per day and compared with a control group without protein supplementation. The animals were kept on Brachiaria brizantha cv. Marandu pastures, distributed across four paddocks of 1.0 ha each, with the supplement offered and the orts removed daily. N-NH3 concentrations in animals receiving the 50% CP supplement were higher than those observed in animals supplemented with 40% CP and in the control group, but were similar to those in the group supplemented with 30% CP. VFA concentrations in the group supplemented with 30% CP were higher than those in the control group and similar to those obtained with the 40 and 50% CP supplements. pH did not differ among groups. The estimated supply of microbial protein and RUP was greater for animals receiving protein supplementation than for the control group.
Quantitative protein and fat metabolism in bull calves treated with beta-adrenergic agonist

DEFF Research Database (Denmark)

Chwalibog, André; Jensen, K; Thorbek, G

1996-01-01

Protein and energy utilization and quantitative retention of protein, fat and energy was investigated with 12 Red Danish bulls during two subsequent 6 weeks trials (Sections A and B) at a mean live weight of 195 and 335 kg respectively. Treatments were control (Group 1) and beta-agonist (L-644...... matter, metabolizable energy and digestible protein was of the same magnitude for all groups. The beta-agonist had no significant effect on protein digestibility and metabolizability of energy, but daily live weight gain was significantly higher in the treated bulls. The utilization of digested protein...
Evaluation of energy digestibility and prediction of digestible and metabolizable energy from chemical composition of different cottonseed meal sources fed to growing pigs.

Science.gov (United States)

Li, J T; Li, D F; Zang, J J; Yang, W J; Zhang, W J; Zhang, L Y

2012-10-01

The present experiment was conducted to determine the digestible energy (DE), metabolizable energy (ME) content, and the apparent total tract digestibility (ATTD) of energy in growing pigs fed diets containing one of ten cottonseed meals (CSM) collected from different provinces of China and to develop in vitro prediction equations for DE and ME content from chemical composition of the CSM samples. Twelve growing barrows with an initial body weight of 35.2±1.7 kg were allotted to two 6×6 Latin square designs, with six barrows and six periods and six diets for each. A corn-dehulled soybean meal diet was used as the basal diet, and the other ten diets were formulated with corn, dehulled soybean meal and 19.20% CSM. The DE, ME and ATTD of gross energy among different CSM sources varied largely and ranged from 1,856 to 2,730 kcal/kg dry matter (DM), 1,778 to 2,534 kcal/kg DM, and 42.08 to 60.47%, respectively. Several chemical parameters were identified to predict the DE and ME values of CSM, and the accuracy of prediction models were also tested. The best fit equations were: DE, kcal/kg DM = 670.14+31.12 CP+659.15 EE with R(2) = 0.82, RSD = 172.02, penergy varied substantially among different CSM sources, and that some prediction equations can be applied to predict DE and ME in CSM with an acceptable accuracy.
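
The DE equation quoted in the abstract above can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name is hypothetical, and it assumes that CP and EE are entered as percentages of dry matter, which the abstract does not state.

```python
def predict_de_csm(cp: float, ee: float) -> float:
    """Digestible energy of cottonseed meal, kcal/kg DM.

    Coefficients are taken from the abstract's best-fit equation:
    DE = 670.14 + 31.12 CP + 659.15 EE.
    ASSUMPTION: cp and ee are crude protein and ether extract in % of DM;
    the abstract does not give the units.
    """
    return 670.14 + 31.12 * cp + 659.15 * ee


# Hypothetical sample composition, chosen only for illustration.
de = predict_de_csm(cp=40.0, ee=1.0)
print(f"Predicted DE: {de:.2f} kcal/kg DM")
```

With the reported RSD of 172.02, any single prediction should be read as an estimate with that order of uncertainty.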

Corn and soybean meal metabolizable energy with the addition of exogenous enzymes for poultry

Directory of Open Access Journals (Sweden)

LRB Dourado

2009-03-01

Full Text Available Two metabolism assays were carried out to determine corn and soybean meal metabolizable energy when enzymes were added. In the first trial, 35 cockerels per studied feedstuff (corn and soybean meal) were distributed in a completely randomized experimental design with four treatments of seven replicates of one bird each. The evaluated treatments were: ingredient (corn or soybean meal) with no enzyme addition, or with the addition of an enzyme complex (xylanase, amylase, protease - XAP), xylanase, or phytase. The precise feeding method was used to determine true metabolizable energy corrected for nitrogen balance (TMEn). The use of enzymes did not result in any differences (p>0.05) in soybean meal TMEn, but phytase improved corn TMEn by 2.3% (p=0.004). In the second trial, 280 seven-day-old broiler chicks were distributed in a completely randomized experimental design with seven treatments of five replicates of eight birds each. Treatments consisted of corn with no enzyme addition or with the addition of amylase, xylanase, phytase, XAP complex, XAP+phytase combination, or xylanase/pectinase/β-glucanase complex (XPBG). Corn was supplemented with macro and trace minerals. Total excreta collection was used to determine apparent metabolizable energy corrected for nitrogen balance (AMEn). Differences were observed in AMEn (p=0.08) and in the dry matter metabolizability coefficient (p=0.03). The combination of the XAP complex with phytase promoted a 2.11% increase in corn AMEn values, and the remaining enzymes allowed increases between 0.86% and 1.66%.
Computational prediction of protein-protein interactions in Leishmania predicted proteomes.

Directory of Open Access Journals (Sweden)

Antonio M Rezende

Full Text Available The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and genome availability, an effective vaccine has not yet been developed, and chemotherapy is highly toxic. It is therefore clear that only interdisciplinary, integrated studies will succeed in finding new targets for vaccine and drug development. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. Thus, we modeled PPIs for trypanosomatids through computational methods, using sequence comparison against public databases of protein or domain interactions for interaction prediction (interolog mapping), and developed a dedicated combined scoring system to address the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold-standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum, respectively. For each predicted network, the top 20 proteins were ranked by the MCC topological index. In addition, information on immunological potential, the degree of protein sequence conservation among orthologs, and the degree of identity with proteins of potential parasite hosts was integrated. This integration improves the understanding and usefulness of the predicted networks, which can be valuable for selecting new potential biological targets for drug and vaccine development. Network modularity analysis, which is key when one is interested in destabilizing PPIs for drug or vaccine purposes, and multiple alignments of the predicted PPIs were performed, revealing patterns associated with protein turnover. In addition, around 50% of hypothetical protein present in the networks
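
The interolog mapping idea mentioned in the abstract above (transferring a known interaction between two template proteins to the target proteins that are homologous to them) can be illustrated with a small sketch. This is a generic illustration under stated assumptions, not the authors' pipeline: the function name, the protein identifiers, and the input structures are hypothetical, and the sequence-comparison step and combined score described in the abstract are omitted.

```python
from itertools import combinations


def interolog_map(homologs, template_ppis):
    """Predict target PPIs by transferring template interactions.

    homologs: dict mapping each target protein to the set of template
        proteins it is homologous to (ASSUMED to come from a prior
        sequence-comparison step that is not shown here).
    template_ppis: set of frozensets, each holding an interacting
        pair of template proteins.
    Returns a set of (target_a, target_b) tuples, sorted within each pair.
    """
    predicted = set()
    for a, b in combinations(sorted(homologs), 2):
        # Predict a-b if any homolog of a interacts with any homolog of b.
        if any(
            frozenset((x, y)) in template_ppis
            for x in homologs[a]
            for y in homologs[b]
        ):
            predicted.add((a, b))
    return predicted


# Hypothetical toy data: three target proteins and one template interaction.
homologs = {"LmA": {"T1"}, "LmB": {"T2"}, "LmC": {"T3"}}
template_ppis = {frozenset(("T1", "T2"))}
print(interolog_map(homologs, template_ppis))
```

A real pipeline would also weight each transferred interaction, for example by sequence similarity, before computing an AUC against gold-standard sets.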
Effects of processing on full fat soybean metabolizable energy determined by different methodologies and digestibility of amino acids for poultry

Directory of Open Access Journals (Sweden)

Ednardo Rodrigues Freitas

2005-12-01

Full Text Available This work was conducted to evaluate the nutritional value of whole soybean submitted to different processing methods. Four metabolism trials were conducted to determine the digestibility coefficients of dry matter (DCDM), ether extract (DCEE), and amino acids, and the apparent and true metabolizable energy values of whole soybean deactivated by steam heating (WDS), whole extruded soybean (WES), and mixtures of soybean meal with degummed soybean oil (SMO) or with soybean acid oil (SMAO). Trials one and two used the traditional total excreta collection method with chicks and roosters, respectively. The force-feeding method was used with intact adult roosters in trial three and with cecectomized roosters in trial four. The DCDM, DCEE, and metabolizable energy determined with roosters were higher than those determined with chicks. The highest DCDM, DCEE, and metabolizable energy values were obtained for SMO, followed by WES and SMAO, and the lowest for WDS. Extrusion gave better utilization of the fat in the soybean grain and, consequently, of its energy. However, amino acid digestibility was not influenced by processing. The different processing methods gave whole soybean distinct nutritional characteristics, mainly in metabolizable energy value, which also varied with the age of the birds.
Metabolizable energy levels in diets for broiler maintained in environment of high temperature

Directory of Open Access Journals (Sweden)

Firmino José Vieira Barbosa

2008-05-01

Full Text Available Four hundred birds with a mean weight of 675 g were distributed in a randomized block design, blocked by body weight, with five treatments and four replicates. The experimental diets consisted of five metabolizable energy levels (2,800, 2,900, 3,000, 3,100, and 3,200 kcal of ME/kg of diet), formulated to meet nutritional requirements except for metabolizable energy. The energy level of the diets was raised by adding soybean oil. Analyses of variance and regression were performed, relating the energy levels to the values of the variables studied. The birds were evaluated for performance (feed intake, weight gain, and feed conversion) and carcass characteristics in the periods from 22 to 35, 36 to 42, 43 to 49, and 22 to 49 days of age. Weight gain and feed conversion of Hubbard broilers kept in a high-temperature environment were not influenced by the metabolizable energy level of the diet. Dietary energy levels did not affect the yields of carcass, drumstick, thigh, wing, drumette, gizzard, heart, liver, proventriculus, and intestine. However, abdominal fat increased and breast yield decreased in proportion to the increase in dietary energy in a high-temperature environment.
Effect of caecectomy on true metabolizable energy and lysine ...

African Journals Online (AJOL)

True metabolizable energy, corrected for nitrogen retention (TMEn), and true lysine availability were determined for maize and sunflower oilcake meal, two samples of each, and in samples of fishmeal, soyabean oilcake meal and sorghum meal using intact and caecectomized roosters. The roosters were allowed ad libitum intake over a ...
Different metabolizable protein levels in sugar cane diets to lactating dairy cows

Directory of Open Access Journals (Sweden)

Hugo Imaizumi

2008-07-01

Full Text Available
The objective of this trial was to evaluate the effect of different metabolizable protein (MP) levels in the diets of lactating dairy cows fed sugarcane. Eighteen lactating cows, divided into two milk production groups (10 or 18 kg/d), were used, and the data from each group were analyzed separately. Three treatments were evaluated, varying the levels of MP and rumen degradable protein (RDP) in the diet through different inclusions of urea or soybean meal: (1) 1% of a urea and ammonium sulfate mixture in fresh sugarcane (control); (2) adequate RDP and MP levels; and (3) adequate RDP and excessive MP levels. The statistical design was a 3 × 3 Latin square with three replicates for each group. No statistical differences among treatments (P>0.05) were observed for dry matter intake, milk yield, milk fat and protein contents, or milk urea nitrogen (MUN) and plasma urea nitrogen (PUN) concentrations, regardless of the animals' production level. The recommendation to correct fresh sugarcane diets with 1% of the urea-ammonium sulfate mixture was adequate for cows producing either 10 or 18 kg of milk/d. There was no advantage in increasing the MP supply for these cows.

Keywords: Protein sources, milk production, rumen degradable protein, urea.
Cloud prediction of protein structure and function with PredictProtein for Debian.

Science.gov (United States)

Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

2013-01-01

We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.
Nutrient requirements of protein, energy and macrominerals of Nellore cattle of three genders

Directory of Open Access Journals (Sweden)

Marcos Inácio Marcondes

2009-08-01

Full Text Available The objective of this study was to determine the metabolizable energy requirement for maintenance, the net requirements of protein, energy and macrominerals for weight gain, and the efficiency of conversion of net protein requirements for gain to metabolizable protein requirements in Nellore cattle. Twenty-seven animals were used (nine steers, nine bulls and nine heifers). Three animals of each gender were slaughtered at the beginning of the experiment as a reference group. The remaining 18 animals received concentrate (1 or 1.25% of live weight) for 112 days and were slaughtered at the end to determine their body composition. Net requirements for weight gain were obtained by differentiating the prediction equation for the body content of each nutrient as a function of the logarithm of empty body weight. The metabolizable energy requirements for maintenance were estimated from the linear regression of retained energy on metabolizable energy intake, while the efficiency of use of metabolizable protein for weight gain was estimated from the equation relating retained crude protein to metabolizable protein intake. The net mineral requirements agree with values found in the literature. Net energy requirements for gain increase with live weight, and net protein requirements for gain decrease as weight increases. The efficiency of converting net protein requirements to metabolizable protein requirements is approximately 50%.
Rumen-protected lysine, methionine, and histidine increase milk protein yield in dairy cows fed a metabolizable protein-deficient diet.

Science.gov (United States)

Lee, C; Hristov, A N; Cassidy, T W; Heyler, K S; Lapierre, H; Varga, G A; de Veth, M J; Patton, R A; Parys, C

2012-10-01

The objective of this experiment was to evaluate the effect of supplementing a metabolizable protein (MP)-deficient diet with rumen-protected (RP) Lys, Met, and specifically His on dairy cow performance. The experiment was conducted for 12 wk with 48 Holstein cows. Following a 2-wk covariate period, cows were blocked by DIM and milk yield and randomly assigned to 1 of 4 diets, based on corn silage and alfalfa haylage: control, MP-adequate diet (ADMP; MP balance: +9 g/d); MP-deficient diet (DMP; MP balance: -317 g/d); DMP supplemented with RPLys (AminoShure-L, Balchem Corp., New Hampton, NY) and RPMet (Mepron; Evonik Industries AG, Hanau, Germany; DMPLM); and DMPLM supplemented with an experimental RPHis preparation (DMPLMH). The analyzed crude protein content of the ADMP and DMP diets was 15.7 and 13.5 to 13.6%, respectively. The apparent total-tract digestibility of all measured nutrients, plasma urea-N, and urinary N excretion were decreased by the DMP diets compared with ADMP. Milk N secretion as a proportion of N intake was greater for the DMP diets compared with ADMP. Compared with ADMP, dry matter intake (DMI) tended to be lower for DMP, but was similar for DMPLM and DMPLMH (24.5, 23.0, 23.7, and 24.3 kg/d, respectively). Milk yield was decreased by DMP (35.2 kg/d), but was similar to ADMP (38.8 kg/d) for DMPLM and DMPLMH (36.9 and 38.5 kg/d, respectively), paralleling the trend in DMI. The National Research Council 2001 model underpredicted milk yield of the DMP cows by an average (±SE) of 10.3 ± 0.75 kg/d. Milk fat and true protein content did not differ among treatments, but milk protein yield was increased by DMPLM and DMPLMH compared with DMP and was not different from ADMP. Plasma essential amino acids (AA), Lys, and His were lower for DMP compared with ADMP. Supplementation of the DMP diets with RP AA increased plasma Lys, Met, and His. In conclusion, MP deficiency, approximately 15% below the National Research Council requirements from 2001, decreased
Prediction of Protein-Protein Interactions Related to Protein Complexes Based on Protein Interaction Networks

Directory of Open Access Journals (Sweden)

Peng Liu

2015-01-01

Full Text Available A method for predicting protein-protein interactions based on detected protein complexes is proposed to repair deficient interactions derived from high-throughput biological experiments. Protein complexes are pruned and decomposed into small parts based on the adaptive k-cores method to predict protein-protein interactions associated with the complexes. The proposed method is adaptive to protein complexes with different structure, number, and size of nodes in a protein-protein interaction network. Based on different complex sets detected by various algorithms, we can obtain different prediction sets of protein-protein interactions. The reliability of the predicted interaction sets is proved by using estimations with statistical tests and direct confirmation of the biological data. In comparison with the approaches which predict the interactions based on the cliques, the overlap of the predictions is small. Similarly, the overlaps among the predicted sets of interactions derived from various complex sets are also small. Thus, every predicted set of interactions may complement and improve the quality of the original network data. Meanwhile, the predictions from the proposed method replenish protein-protein interactions associated with protein complexes using only the network topology.
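
The abstract above relies on k-cores to prune protein complexes. As a point of reference, a plain (non-adaptive) k-core computation can be sketched as follows. This is the textbook peeling procedure, not the adaptive variant the authors propose; the function name and the toy graph are hypothetical, and the adjacency dictionary is assumed to be symmetric (undirected).

```python
def k_core(adjacency, k):
    """Return the node set of the k-core of an undirected graph.

    adjacency: dict mapping each node to an iterable of its neighbors;
        ASSUMED symmetric (if v is listed under u, u is listed under v).
    The k-core is the largest subgraph in which every node has degree
    at least k; it is found by repeatedly deleting nodes of degree < k.
    """
    adj = {u: set(vs) for u, vs in adjacency.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        low_degree = [u for u, vs in adj.items() if len(vs) < k]
        for u in low_degree:
            for v in adj.pop(u):
                if v in adj:  # neighbor may already have been removed
                    adj[v].discard(u)
            changed = True
    return set(adj)


# Hypothetical complex: a triangle A-B-C with a pendant protein D on C.
complex_graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
print(sorted(k_core(complex_graph, 2)))
```

In this toy graph the pendant node is peeled away at k = 2, leaving the densely connected triangle, which is the kind of dense part a complex-based predictor would work from.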
Information assessment on predicting protein-protein interactions

Directory of Open Access Journals (Sweden)

Gerstein Mark

2004-10-01

Full Text Available Background: Identifying protein-protein interactions is fundamental for understanding the molecular machinery of the cell. Proteome-wide studies of protein-protein interactions are of significant value, but the high-throughput experimental technologies suffer from high rates of both false positive and false negative predictions. In addition to high-throughput experimental data, many diverse types of genomic data can help predict protein-protein interactions, such as mRNA expression, localization, essentiality, and functional annotation. Evaluating the information contributed by different kinds of evidence helps to establish more parsimonious models with comparable or better prediction accuracy, and to obtain biological insights into the relationships between protein-protein interactions and other genomic information. Results: Our assessment is based on the genomic features used in a Bayesian network approach to predict protein-protein interactions genome-wide in yeast. In the special case when no information is missing for any of the features, our analysis shows that there is a larger information contribution from the functional classification than from expression correlations or essentiality. We also show that in this case alternative models, such as logistic regression and random forest, may be more effective than Bayesian networks for predicting interactions. Conclusions: In the restricted problem posed by the complete-information subset, we identified the MIPS and Gene Ontology (GO) functional similarity datasets as the dominating information contributors for predicting protein-protein interactions under the framework proposed by Jansen et al. Random forests based on the MIPS and GO information alone can give highly accurate classifications. In this particular subset of complete information, adding other genomic data does little to improve predictions. We also found that the data discretizations used in the
Protein Structure Prediction by Protein Threading

Science.gov (United States)

Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.
Food processing and structure impact the metabolizable energy of almonds.

Science.gov (United States)

Gebauer, Sarah K; Novotny, Janet A; Bornhorst, Gail M; Baer, David J

2016-10-12

The measured metabolizable energy (ME) of whole almonds has been shown to be less than predicted by Atwater factors. However, data are lacking on the effects of processing (roasting, chopping or grinding) on the ME of almonds. A 5-period randomized, crossover study in healthy individuals (n = 18) was conducted to measure the ME of different forms of almonds (42 g per day), as part of a controlled diet: whole, natural almonds; whole, roasted almonds; chopped almonds; almond butter; and control (0 g per day). After 9 days of adaptation to each diet, participants collected all urine and fecal samples for 9 days. Diets, urine, and feces were analyzed to determine ME. Fracture force and fracture properties of whole and chopped almonds were measured. Measured ME (kcal g -1 ) of whole natural almonds (4.42), whole roasted almonds (4.86), and chopped almonds (5.04) was significantly lower than predicted with Atwater factors (P almond butter (6.53 kcal g -1 ) was similar to predicted (P = 0.08). The ME of whole roasted and chopped almonds was lower than almond butter (P almonds was lower than whole roasted almonds (P almonds (345 ± 1.6 N) (P almonds fracturing into fewer, larger particles, thus inhibiting the release of lipids. Atwater factors overestimate the ME of whole (natural and roasted) and chopped almonds. The amount of calories absorbed from almonds is dependent on the form in which they are consumed.
Effect of dietary metabolizable energy on energy metabolism and performance in broiler chickens

Directory of Open Access Journals (Sweden)

Nilva Kazue Sakomura

2004-12-01

requirements and the energy efficiency utilization. The comparative slaughter method was used. Twelve birds were sacrificed at the beginning of the trial to determine the initial body composition, and 144 birds were used to determine the AMEn of the diets according to ME levels and feeding levels. The dietary ME levels affected ME intake and, consequently, retained body energy (RE) and heat production (HP). The maintenance metabolizable energy requirements were 131, 141 and 132 kcal/kg W^0.75/day for 3,350, 3,200 and 3,050 ME/kg, respectively. The highest ME level promoted better bird performance; however, the medium ME level gave better efficiency values for protein and fat deposition and, consequently, lower body fat content, indicating better carcass quality.
Validation of protein evaluation systems by means of milk production experiments with dairy cows.

NARCIS (Netherlands)

Straalen, van W.M.; Salaün, C.; Veen, W.A.G.; Rypkema, Y.S.; Hof, G.; Boxem, T.J.

1994-01-01

Protein evaluation systems (crude protein (CP), digestible crude protein (DCP), protein digested in the intestine (PDI), amino acids truly absorbed in the small intestine (AAT), absorbed protein (AP), metabolizable protein (MP), crude protein flow at the duodenum (AAS) and digestible protein in
Energy requirements for maintenance and net efficiency of metabolizable energy utilization for maintenance and weight gain of Moxotó kids

Directory of Open Access Journals (Sweden)

Kaliandra Souza Alves

2008-08-01

Full Text Available The objective was to evaluate the net energy requirements for maintenance and the efficiencies of metabolizable energy utilization for maintenance (km) and weight gain (kf). Twenty-six Moxotó non-castrated male kids, averaging 15 kg of initial live weight (LW) and 7 to 8 months old, fed a diet containing 2.6 Mcal of metabolizable energy, were used. At the beginning of the experiment, six animals were slaughtered and served as a reference to estimate initial body composition and empty body weight (EBW). The remaining animals were then assigned completely at random to homogeneous groups of four animals, one for each intake level: ad libitum feeding and restricted feeding (85, 70, and 55% of the intake of the ad libitum group). When the LW of the animals fed ad libitum approached 25 kg, the group was slaughtered. Net energy requirements for maintenance were estimated from the logarithmic or exponential relationship between heat production (HP) and metabolizable energy intake (MEI). The km and kf were calculated as the ratio between the dietary net energy for gain or maintenance and the metabolizable energy concentration of the diets. The ME and TDN requirements were then estimated. The net energy requirement for maintenance, 55.11 kcal/kg EBW^0.75, was close to that predicted by the North American standards for this species. This value is considered low compared with those reported in the Brazilian literature reviewed. The estimated km was 0.57, and the kf values for concentrations of 2.99, 2.95, 2.56, and 2.5 Mcal/kg of DM were 0.22, 0.19, 0.28, and 0.36, respectively.
Metabolizable energy and oil intake in brown commercial layers

Directory of Open Access Journals (Sweden)

Amadeu Benedito Piozzi da Silva

2012-10-01

Full Text Available To establish the best metabolizable energy (ME) intake for layers and the best level of dietary vegetable oil addition to optimize egg production, an experiment was carried out with 432 30-week-old Hisex Brown layers. Birds were distributed into nine treatments with six replicates of eight birds each according to a 3 × 3 factorial arrangement, consisting of three daily metabolizable energy intakes (280, 300 or 320 kcal/bird/day) and three oil levels (0.00, 0.75 and 1.50 g/bird/day). Daily feed intake was limited to 115, 110 and 105 g/bird in order to obtain the desired energy and oil intake in each treatment. The following parameters were evaluated: initial weight, final weight, body weight change, egg production, egg mass, feed conversion ratio per dozen eggs and per egg mass, and energy conversion. The treatments had no influence on egg production (%) or egg mass (g/bird/day). Final weight and body weight change were significantly affected by increasing energy intake. Feed conversion ratio per egg mass, feed conversion ratio per dozen eggs and energy conversion significantly worsened as daily energy intake increased. An energy intake of 280 kcal/bird/day with no addition of dietary oil does not affect layer performance.
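
The 3 × 3 factorial arrangement described in the abstract above can be enumerated to check that the design accounts for all 432 birds. The sketch below only restates the numbers given in the abstract; the variable names are illustrative.

```python
from itertools import product

# Factor levels as reported in the abstract.
energy_levels = (280, 300, 320)   # kcal/bird/day
oil_levels = (0.00, 0.75, 1.50)   # g/bird/day
REPLICATES = 6
BIRDS_PER_REPLICATE = 8

# Every combination of one energy level with one oil level is a treatment.
treatments = list(product(energy_levels, oil_levels))
total_birds = len(treatments) * REPLICATES * BIRDS_PER_REPLICATE

for energy, oil in treatments:
    print(f"{energy} kcal/bird/day, {oil:.2f} g oil/bird/day")
print(f"{len(treatments)} treatments, {total_birds} birds in total")
```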
The bactericidal activity of β-lactam antibiotics is increased by metabolizable sugar species

DEFF Research Database (Denmark)

Thorsing, Mette; Bentin, Thomas; Givskov, Michael

2015-01-01

Here, the influence of metabolizable sugars on the susceptibility of Escherichia coli to β-lactam antibiotics was investigated. Notably, monitoring growth and survival of mono- and combination-treated planktonic cultures showed a 1000- to 10 000-fold higher antibacterial efficacy of carbenicillin...... and cefuroxime in the presence of certain sugars, whereas other metabolites had no effect on β-lactam sensitivity. This effect was unrelated to changes in growth rate. Light microscopy and flow cytometry profiling revealed that bacterial filaments, formed due to β-lactam-mediated inhibition of cell division......, rapidly appeared upon β-lactam mono-treatment and remained stable for up to 18 h. The presence of metabolizable sugars in the medium did not change the rate of filamentation, but led to lysis of the filaments within a few hours. No lysis occurred in E. coli mutants unable to metabolize the sugars, thus...
Extrusion enhances metabolizable energy and ileal amino acids digestibility of canola meal for broiler chickens

Directory of Open Access Journals (Sweden)

Aljuobori Ahmed

2014-01-01

Full Text Available The aim of the current study was to determine the effect of the extrusion process on the apparent metabolizable energy (AME) and the crude protein (CP) and amino acid (AA) digestibility of canola meal (CM) in broiler chickens. A total of 36 42-day-old broilers were randomly assigned to adaptation diets (no CM or 30% CM) with six replicates. After a 4-day adaptation period, on day 47, birds were allowed to consume the assay diets that contained CM or extruded canola meal (ECM) as the sole source of energy and protein. Four hours after feeding, the birds were killed and ileal contents were collected. ECM had a greater (P<0.001) AME than CM (10.87 vs 9.39 MJ/kg). Extrusion also significantly enhanced the apparent ileal digestibility of CP and of some AA, such as Asp, Glu, Ser, Thr and Trp. In conclusion, extrusion appeared to be a practical and effective approach to enhancing the AME and the digestibility of CP and some AA of CM in broiler chickens.
Determination and Prediction of Digestible and Metabolizable Energy from the Chemical Composition of Chinese Corn Gluten Feed Fed to Finishing Pigs

Directory of Open Access Journals (Sweden)

T. T. Wang

2014-06-01

Full Text Available Two experiments were conducted to determine the digestible energy (DE) and metabolizable energy (ME) contents of corn gluten feed (CGF) for finishing pigs, to develop equations predicting the DE and ME content from the chemical composition of the CGF samples, and to validate the accuracy of the prediction equations. In Exp. 1, ten CGF samples from seven provinces of China were collected and fed to 66 finishing barrows (Duroc×Landrace×Yorkshire) with an initial body weight (BW) of 51.9±5.5 kg. The pigs were assigned to 11 diets comprising one basal diet and 10 CGF test diets, with six pigs fed each diet. The basal diet contained corn (76%), dehulled soybean meal (21%) and premix (3%). The ten test diets were formulated by substituting 25% of the corn and dehulled soybean meal with CGF and contained corn (57%), dehulled soybean meal (15.75%), CGF (24.25%) and premix (3%). In Exp. 2, two additional CGF sources were collected as validation samples to test the accuracy of the prediction equations. In this experiment, 18 barrows (Duroc×Landrace×Yorkshire) with an initial BW of 61.1±4.0 kg were randomly allotted to be fed either the basal diet or one of two CGF-containing diets with a composition similar to that used in Exp. 1. The DE and ME of CGF ranged from 10.37 to 12.85 MJ/kg of dry matter (DM) and from 9.53 to 12.49 MJ/kg of DM, respectively. Through stepwise regression analysis, several prediction equations for DE and ME were generated. The best-fit equations were: DE (MJ/kg of DM) = 18.30 − 0.13 × neutral detergent fiber − 0.22 × ether extract, with R² = 0.95, residual standard deviation (RSD) = 0.21 and p<0.01; and ME (MJ/kg of DM) = 12.82 + 0.11 × starch − 0.26 × acid detergent fiber, with R² = 0.94, RSD = 0.20 and p<0.01. These results indicate that the DE and ME content of CGF varied substantially but that the DE and ME for finishing pigs can be accurately predicted from equations based on nutritional analysis.
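As a minimal sketch, the two best-fit equations quoted in this abstract can be written as plain functions. The coefficients come from the abstract; the assumption that each input is expressed as a percentage of dry matter is mine, and the sample composition below is hypothetical and chosen only to show the arithmetic.

```python
def predict_de(ndf_pct: float, ee_pct: float) -> float:
    """DE (MJ/kg DM) from neutral detergent fiber and ether extract.

    Inputs are assumed (not stated in the abstract) to be % of dry matter.
    """
    return 18.30 - 0.13 * ndf_pct - 0.22 * ee_pct


def predict_me(starch_pct: float, adf_pct: float) -> float:
    """ME (MJ/kg DM) from starch and acid detergent fiber (assumed % of DM)."""
    return 12.82 + 0.11 * starch_pct - 0.26 * adf_pct


# Hypothetical corn gluten feed sample, for illustration only.
de = predict_de(ndf_pct=40.0, ee_pct=3.0)
me = predict_me(starch_pct=15.0, adf_pct=10.0)
print(f"DE = {de:.2f} MJ/kg DM, ME = {me:.2f} MJ/kg DM")
```

For this made-up sample the functions give about 12.44 and 11.87 MJ/kg DM, which happen to fall within the DE and ME ranges reported in the abstract; that is a sanity check on the arithmetic, not a validation of the equations.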

Nitrogen-corrected True Metabolizable Energy and Amino Acid Digestibility of Chinese Corn Distillers Dried Grains with Solubles in Adult Cecectomized Roosters

Directory of Open Access Journals (Sweden)

F. Li

2013-06-01

Full Text Available This study was conducted to evaluate the chemical composition, nitrogen-corrected true metabolizable energy (TMEn) and true amino acid digestibility of corn distillers dried grains with solubles (DDGS) produced in China. Twenty-five sources of corn DDGS were collected from 8 provinces of China. A precision-fed rooster assay was used to determine TMEn and amino acid digestibility with 35 adult cecectomized roosters, in which each DDGS sample was tube fed (30 g). The average contents of ash, crude protein, total amino acids, ether extract, crude fiber and neutral detergent fiber were 4.81, 27.91, 22.51, 15.22, 6.35 and 37.58%, respectively. The TMEn of DDGS ranged from 1,779 to 3,071 kcal/kg and averaged 2,517 kcal/kg. The coefficients of variation for non-amino acid crude protein, ether extract, crude fiber and TMEn were 55.0, 15.7, 15.9 and 17.1%, respectively. The average true amino acid digestibility was 77.32%. Stepwise regression analysis obtained the following equation: TMEn, kcal/kg = −2,995.6 + 0.88 × gross energy + 49.63 × a* (BIC = 248.8; RMSE = 190.8; p0.05. These results suggest that corn DDGS produced in China varies widely in chemical composition, and that gross energy and the a* value can be used to generate a TMEn prediction equation.
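The regression equation quoted in this abstract is simple enough to evaluate directly. The sketch below assumes gross energy is given in kcal/kg and that a* is the colour (redness) reading of the sample; neither unit is spelled out in the abstract, and the input values are hypothetical.

```python
def predict_tmen(gross_energy_kcal_kg: float, a_star: float) -> float:
    """TMEn (kcal/kg) = -2995.6 + 0.88 * gross energy + 49.63 * a*.

    Coefficients are from the abstract; input units are assumptions.
    """
    return -2995.6 + 0.88 * gross_energy_kcal_kg + 49.63 * a_star


# Hypothetical DDGS sample, for illustration only.
tmen = predict_tmen(gross_energy_kcal_kg=5000.0, a_star=10.0)
print(f"Predicted TMEn: {tmen:.1f} kcal/kg")
```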
Protein complex prediction in large ontology attributed protein-protein interaction networks.

Science.gov (United States)

Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

2013-01-01

Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.
Metabolizable energy of protein feedstuffs determined by the total excreta collection method and by prediction equations

Directory of Open Access Journals (Sweden)

Márcia Cristina de Mello Zonta

2004-12-01

Full Text Available A metabolism assay was carried out with growing broilers (traditional method of total excreta collection) to determine the nitrogen-corrected apparent metabolizable energy (AMEn) of some feedstuffs, and to estimate the same energy values with prediction equations published in the national and international literature. AMEn was determined for eight feedstuffs: five soybean meal samples of different commercial brands and three processed full-fat soybean samples (extruded, toasted and micronized). The values estimated by the prediction equations were compared with the observed values using Spearman correlation and confidence intervals obtained from the AMEn values determined in the metabolism assay. The AMEn values of soybean meal samples 1, 2, 3, 4 and 5 and of extruded, toasted and micronized full-fat soybean were 2601, 2650, 2727, 2500, 2426, 3674, 3609 and 4296 kcal/kg DM, respectively. Among the equations studied, AMEn = −822.33 + 69.54 CP − 45.26 ADF + 90.81 EE and AMEn = 2723.05 − 50.52 ADF + 60.40 EE were the most strongly correlated (P...
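The two prediction equations quoted in this abstract can be sketched as functions, reading the Portuguese abbreviations PB, FDA and EE as crude protein, acid detergent fiber and ether extract. The coefficients are from the abstract; expressing the inputs as percentages of dry matter is my assumption, and the sample composition is hypothetical.

```python
def amen_three_terms(cp_pct: float, adf_pct: float, ee_pct: float) -> float:
    """AMEn (kcal/kg DM) = -822.33 + 69.54*CP - 45.26*ADF + 90.81*EE."""
    return -822.33 + 69.54 * cp_pct - 45.26 * adf_pct + 90.81 * ee_pct


def amen_two_terms(adf_pct: float, ee_pct: float) -> float:
    """AMEn (kcal/kg DM) = 2723.05 - 50.52*ADF + 60.40*EE."""
    return 2723.05 - 50.52 * adf_pct + 60.40 * ee_pct


# Hypothetical soybean meal composition (% of DM), for illustration only.
cp, adf, ee = 48.0, 8.0, 2.0
print(round(amen_three_terms(cp, adf, ee), 2))
print(round(amen_two_terms(adf, ee), 2))
```

Comparing such estimates with assay values, as the authors did with Spearman correlation, would need the measured composition of each sample, which the abstract does not give.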
Chemical composition, metabolizable energy and digestible amino acid values of rice by-products for broilers

Directory of Open Access Journals (Sweden)

Otto Mack Junqueira

2009-11-01

eight birds each. In the second trial, the forced-feeding method with cecectomized cockerels was used to determine the digestibility coefficients of the amino acids. The design was completely randomized, with two feeds and one fasting treatment and six replicates of one bird each. The values of DM, CP, EE, CF, AME and AMEn were, respectively, 88.6%, 11.8%, 15.3%, 10.2%, 2968 kcal/kg and 2804 kcal/kg for WRM and 93.5%, 9.1%, 0.73%, 0.45%, 3338 kcal/kg and 3239 kcal/kg for BR. The average digestibility coefficients of essential and non-essential amino acids were, respectively, 75.9% and 73.9% for WRM and 77.9% and 76.5% for BR. WRM and BR can replace corn in broiler diets; however, they showed lower metabolizable energy levels and higher levels of crude protein and digestible amino acids.
Different protein-protein interface patterns predicted by different machine learning methods.

Science.gov (United States)

Wang, Wei; Yang, Yongxiao; Yin, Jianxin; Gong, Xinqi

2017-11-22

Different types of protein-protein interactions produce different protein-protein interface patterns, and different machine learning methods suit different types of data. Are different interface patterns therefore better predicted by different machine learning methods? Here, four machine learning methods were employed to predict protein-protein interface residue pairs for different interface patterns. The performance of the methods differed across protein types, which suggests that different machine learning methods tend to predict different protein-protein interface patterns. We used ANOVA and variable selection to support this result. Our proposed methods, which combine the advantages of the individual methods, also gave good prediction results compared with single methods. Beyond the prediction of protein-protein interactions, this idea can be extended to other research areas such as protein structure prediction and design.
Diets of different energetic densities, keeping constant the metabolizable energy:nutrients ratio, for laying Japanese quails

Directory of Open Access Journals (Sweden)

Guilherme de Souza Moura

2008-09-01

, the diets did not influence the intakes of energy, crude protein, lysine, methionine+cystine or threonine, egg production, commercial egg production, egg mass, energy efficiency per egg mass, energy efficiency per egg dozen, weight gain or quail viability. For laying Japanese quails, diets with 2,900 and 2,800 kcal ME/kg provided the best feed conversion per egg mass and per egg dozen, respectively, when the metabolizable energy:nutrient ratio was kept constant.
Metabolizable energy values of feedstuffs obtained from poultry at different ages

Directory of Open Access Journals (Sweden)

Heloisa Helena de Carvalho Mello

2009-05-01

Full Text Available Four metabolism assays were carried out to determine the apparent metabolizable energy (AME) and the nitrogen-corrected apparent metabolizable energy (AMEn) of ten feedstuffs in poultry of different ages. The feedstuffs tested were corn, soybean meal, sorghum, wheat bran, whole rice bran, two feather meals, two poultry viscera meals and spray-dried blood plasma. The total excreta collection method was used in a completely randomized design with 11 treatments (ten feedstuffs and one reference diet) and six replicates. The first assay used 528 male broiler chicks from 10 to 17 days of age, with eight birds per replicate; the second, 396 male broilers from 26 to 33 days of age, with six birds per replicate; the third, 264 male broilers from 40 to 47 days of age, with four birds per replicate; and the fourth, 132 roosters, with two birds per replicate. Bird age influenced the AME and AMEn values of soybean meal, sorghum, whole rice bran, the feather meals and blood plasma, whereas for wheat bran it affected only AMEn.
Prediction of Protein Configurational Entropy (Popcoen).

Science.gov (United States)

Goethe, Martin; Gleixner, Jan; Fita, Ignacio; Rubi, J Miguel

2018-03-13

A knowledge-based method for configurational entropy prediction of proteins is presented; this methodology is extremely fast compared to previous approaches because it does not involve any type of configurational sampling. Instead, the configurational entropy of a query fold is estimated by evaluating an artificial neural network, which was trained on molecular-dynamics simulations of ∼1000 proteins. The predicted entropy can be incorporated into a large class of protein software based on cost-function minimization/evaluation, in which configurational entropy is currently neglected for performance reasons. Software of this type is used for all major protein tasks such as structure prediction, protein design, NMR and X-ray refinement, docking, and mutation effect prediction. Integrating the predicted entropy can yield a significant accuracy increase, as we show by way of example for native-state identification with the prominent protein software FoldX. The method has been termed Popcoen for Prediction of Protein Configurational Entropy. An implementation is freely available at http://fmc.ub.edu/popcoen/.
Metabolizable energy levels in diets formulated according to the ideal protein concept and supplemented with phytase for piglets from 15 to 35 kg

Directory of Open Access Journals (Sweden)

Marcelo José Milagres de Almeida

2008-05-01

15 to 35 kg to evaluate the effect of different dietary levels of metabolizable energy (ME) and crude protein (CP), in diets formulated according to the ideal protein concept and supplemented with phytase, on performance and carcass characteristics. The experiment was analyzed as a randomized complete block design in a 3 × 2 + 1 factorial arrangement of treatments, with three ME levels (3,080, 3,230 and 3,380 kcal/kg) and two CP levels (14% and 16%), supplemented with synthetic amino acids and 1,000 FTU/kg of phytase, with the calcium level reduced by 25% and phosphorus by 30%. An additional control treatment was formulated with 18% CP and without phytase to meet the nutrient requirements of pigs according to the Brazilian Feedstuffs Tables recommendations. Therefore, seven treatments, six replicates and two animals per experimental unit were used. No treatment effect on daily weight gain, feed conversion, daily ME intake or organ weight (absolute and relative) was observed. Feed intake was higher and fat deposition rate lower in pigs fed the diet with 3,380 kcal ME/kg. Lower nitrogen intake, higher efficiency of nitrogen utilization for gain and lower blood urea content were observed in pigs fed the 14% CP diets than in those fed 18% CP. Reducing ME, CP, available phosphorus and calcium to 3,080 kcal/kg, 14%, 0.54% and 0.28%, respectively, in diets for pigs formulated on the ideal protein concept and supplemented with phytase does not affect the performance of pigs from 15 to 35 kg.
Estimation of the total efficiency of metabolizable energy utilization for maintenance and growth by cattle in tropical conditions

Directory of Open Access Journals (Sweden)

Douglas Sampaio Henrique

2005-06-01

Full Text Available Data from 320 animals were obtained from eight comparative slaughter studies performed under tropical conditions and used to estimate the total efficiency of utilization of the metabolizable energy intake (MEI), which varied from 77 to 419 kcal kg^-0.75 d^-1. The data also contained direct measures of the recovered energy (RE), which allowed the heat production (HE) to be calculated by difference. RE was regressed on MEI, and deviations from linearity were evaluated using the F-test. The estimates of the fasting heat production and of the intercept and slope of the relationship between RE and MEI were 73 kcal kg^-0.75 d^-1, 42 kcal kg^-0.75 d^-1 and 0.37, respectively. The total efficiency was then estimated by dividing the net energy for maintenance and growth by the metabolizable energy intake. The estimated total efficiency of ME utilization and analogous estimates based on the beef cattle NRC model were used in an additional study to evaluate their predictive power in terms of mean squared deviations for both temperate and tropical conditions. The two approaches had similar predictive power, but the proposed one had a 22% lower mean squared deviation despite its simpler structure.
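The core calculation in this abstract, heat production by difference and a straight-line regression of RE on MEI, can be sketched with ordinary least squares. The data points below are invented purely to show the mechanics (they are not the study's data), and the helper name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented illustrative values, in kcal per kg^0.75 per day.
mei = [100.0, 200.0, 300.0, 400.0]   # metabolizable energy intake
re = [0.0, 40.0, 80.0, 120.0]        # recovered energy

# Heat production is obtained by difference: HE = MEI - RE.
he = [m - r for m, r in zip(mei, re)]

intercept, slope = fit_line(mei, re)
print(he)
print(intercept, slope)
```

With these made-up points the fitted slope is 0.4 and the intercept is -40; the study itself reports a slope of 0.37 for its own data.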
Neural Networks for protein Structure Prediction

DEFF Research Database (Denmark)

Bohr, Henrik

1998-01-01

This is a review of neural network applications in bioinformatics, especially applications to protein structure prediction, e.g. prediction of secondary structures, prediction of surface structure, fold class recognition and prediction of the 3-dimensional structure of protein backbones...
Return of reproductive cyclic activity in Morada Nova sheep at different metabolizable energy levels

Directory of Open Access Journals (Sweden)

Maria Teresa Jansem de Almeida Catanho

2008-09-01

Full Text Available The objective of this work was to evaluate the effect of the metabolizable energy supplied on the return of reproductive cyclic activity during lactation in Morada Nova sheep fed different energy levels from the last third of pregnancy. Thirty-nine ewes, distributed among three treatments during the rainy and dry seasons, were used. Experimental diets were formulated to provide a daily intake of 2.2, 2.8 and 3.4 Mcal of metabolizable energy (ME) and 150 g of crude protein, and serum progesterone and cortisol levels, as well as reproductive performance, were evaluated. A completely randomized design in a split-plot scheme was used for the hormone data, and generalized linear models with Poisson and binomial distributions were used for the non-parametric data. There was a period × treatment interaction (P<0.05) for serum cortisol and progesterone, as well as for reproductive performance. The animals experienced stress, which interfered with the return of reproductive cyclic activity and extended the post-partum anoestrus period during the dry season. Supplying 3.4 Mcal of ME/day is recommended for Morada Nova sheep from the last third of pregnancy because it gave the best performance rates and reproductive performance.
Determining the amount of rumen-protected methionine supplement that corresponds to the optimal levels of methionine in metabolizable protein for maximizing milk protein production and profit on dairy farms.

Science.gov (United States)

Cho, J; Overton, T R; Schwab, C G; Tauer, L W

2007-10-01

The profitability of feeding rumen-protected Met (RPMet) sources to produce milk protein was estimated using a 2-step procedure. First, the effect of Met in metabolizable protein (MP) on milk protein production was estimated by using a quadratic Box-Cox functional form. Then, using these estimation results, the amounts of RPMet supplement that corresponded to the optimal levels of Met in MP for maximizing milk protein production and profit on dairy farms were determined. The data used in this study were modified from data used to determine the optimal level of Met in MP for lactating cows in the Nutrient Requirements of Dairy Cattle (NRC, 2001). The data used in this study differ from those in the NRC (2001) in 2 ways. First, because dairy feed generally contains 1.80 to 1.90% Met in MP, this study adjusts the reference production value (RPV) from 2.06 to 1.80 or 1.90%. Consequently, the milk protein production response is also modified to an RPV of 1.80 or 1.90% Met in MP. Second, because this study is especially interested in how much additional Met, beyond the 1.80 or 1.90% already contained in the basal diet, is required to maximize farm profits, the data used are limited to concentrations of Met in MP above 1.80 or 1.90%. This allowed us to calculate any additional cost to farmers based solely on the price of an RPMet supplement and eliminated the need to estimate the dollar value of each gram of Met already contained in the basal diet. Results indicated that the optimal level of Met in MP for maximizing milk protein production was 2.40 and 2.42%, where the RPV was 1.80 and 1.90%, respectively. These optimal levels were almost identical to the recommended level of Met in MP of 2.40% in the NRC (2001). The amounts of RPMet required to increase the percentage of Met in MP from each RPV to 2.40 and 2.42% were 21.6 and 18.5 g/d, respectively. On the other hand, the optimal levels of Met in MP for maximizing profit were 2.32 and 2.34%, respectively. The amounts
Protein function prediction using neighbor relativity in protein-protein interaction network.

Science.gov (United States)

Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir

2013-04-01

There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies use protein interaction information to predict the function of un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC), based on interaction network topology, which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features, including distance, common neighbors and the number of paths between them. To ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein is annotated with the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results than previous protein function prediction approaches that use the interaction network.
Protein complex prediction based on k-connected subgraphs in protein interaction network

Directory of Open Access Journals (Sweden)

Habibi Mahnaz

2010-09-01

Full Text Available Background: Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interactions data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results: We propose a more appropriate protein complex prediction method, CFA, that is based on connectivity number on subgraphs. We evaluate CFA using several protein interaction networks on reference protein complexes in two benchmark data sets (MIPS and Aloy), containing 1142 and 61 known complexes respectively. We compare CFA to some existing protein complex prediction methods (CMC, MCL, PCP and RNSC) in terms of recall and precision. We show that CFA predicts more complexes correctly at a competitive level of precision. Conclusions: Many real complexes with different connectivity levels in a protein interaction network can be predicted based on connectivity number. Our CFA program and results are freely available from http://www.bioinf.cs.ipm.ir/softwares/cfa/CFA.rar.
Protein turnover in lactating mink (Mustela vison) is not affected by dietary protein supply

DEFF Research Database (Denmark)

Tauson, Anne-Helene; Fink, Rikke; Chwalibog, André

2006-01-01

The mink is a strict carnivore and may therefore serve as a model for the cat. Current recommendations for protein supply for lactating mink are based on production experiments with preweaning kit growth as a measure of dietary adequacy (1,2). Recently, nitrogen balance and substrate oxidation have...... in humans (7), growing pigs (8), and growing rats (9). In adult cats, both protein synthesis and breakdown were lower when feeding a low- than when feeding a high-protein diet [20 vs. 70% of metabolizable energy (ME) from protein] (10). The objectives of this study were therefore to develop a 15N...
Protein complex prediction based on k-connected subgraphs in protein interaction network

OpenAIRE

Habibi, Mahnaz; Eslahchi, Changiz; Wong, Limsoon

2010-01-01

Background: Protein complexes play an important role in cellular mechanisms. Recently, several methods have been presented to predict protein complexes in a protein interaction network. In these methods, a protein complex is predicted as a dense subgraph of protein interactions. However, interactions data are incomplete and a protein complex does not have to be a complete or dense subgraph. Results: We propose a more appropriate protein complex prediction method, CFA, that is based on ...
Urea Recycling Contributes to Nitrogen Retention in Calves Fed Milk Replacer and Low-Protein Solid Feed

NARCIS (Netherlands)

Berends, H.; Borne, van den J.J.G.C.; Røjen, B.A.; Baal, van J.; Gerrits, W.J.J.

2014-01-01

Urea recycling, with urea originating from catabolism of amino acids and hepatic detoxification of ammonia, is particularly relevant for ruminant animals, in which microbial protein contributes substantially to the metabolizable protein supply. However, the quantitative contribution of urea
Metabolizable energy of different feedstuffs tested in female Japanese quails

Directory of Open Access Journals (Sweden)

A.M.A. Moura

2010-02-01

Full Text Available The apparent metabolizable energy (AME), the apparent metabolizable energy corrected for nitrogen retention (AMEn) and the apparent metabolization coefficient of gross energy (%) were determined for corn, sorghum, soybean meal, corn gluten meal and refined soybean oil. A total of 240 female Japanese quails (Coturnix japonica) with an initial age of 60 days were used in a completely randomized design with six treatments, five replicates and eight quails per experimental unit. The treatments consisted of five experimental diets and one reference diet. On an as-fed basis, each experimental diet comprised 70% of the reference diet and 30% of the test ingredient, except the diet used to determine the AMEn of soybean oil, which comprised 10% oil and 90% reference diet. The experiment was carried out in cages arranged in metal batteries. The AME and AMEn values (kcal/kg, as-fed basis) and the metabolization coefficient (%) of ground corn, sorghum, soybean meal, corn gluten meal and refined soybean oil were, respectively: 3,572 and 3,612 kcal/kg and 92.6%; 3,108 and 3,149 kcal/kg and 80.9%; 2,633 and 2,676 kcal/kg and 65.3%; 4,043 and 4,096 kcal/kg and 75.0%; 9,335 and 9,379 kcal/kg and 98.8%. The AME values reported for other poultry species differ from those obtained in the present study, so their use is not recommended when formulating diets for laying Japanese quails.
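When a test diet is a blend of a reference diet and a test ingredient, as described in this abstract, the ingredient's energy value is commonly back-calculated by the substitution (difference) method. The abstract does not state the formula the authors used, so the sketch below is an assumption: it treats diet energy as additive, so that AME_test = (1 − p) × AME_ref + p × AME_ingredient for inclusion fraction p. The function name and the diet values are hypothetical.

```python
def ingredient_ame(ame_reference: float, ame_test: float, inclusion: float) -> float:
    """Back-calculate ingredient AME from reference and test diet AME.

    Assumes energy is additive across the blend; inclusion is the fraction
    of the test ingredient in the test diet (0.30 for 30%).
    """
    if not 0.0 < inclusion <= 1.0:
        raise ValueError("inclusion must be a fraction in (0, 1]")
    return ame_reference + (ame_test - ame_reference) / inclusion


# Invented diet values in kcal/kg, for illustration only.
print(round(ingredient_ame(2900.0, 3100.0, 0.30), 1))   # 30% inclusion
print(round(ingredient_ame(2900.0, 3500.0, 0.10), 1))   # 10% inclusion (oil)
```

A lower inclusion rate divides the same diet difference by a smaller fraction, so measurement error in the diet values is amplified more for the 10% oil diet than for the 30% diets.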
False positive reduction in protein-protein interaction predictions using gene ontology annotations

Directory of Open Access Journals (Sweden)

Lin Yen-Han

2007-07-01

Full Text Available Background: Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets that, in turn, comes from huge false positive predictions in computational approaches. Reduction of false positive predictions and enhancing true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. Results: Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism are 48.32% and 46.49% (average of four datasets) in yeast and worm, respectively. Based on the eight top-ranking keywords and co-localization of interacting proteins, a set of two knowledge rules was deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules, was defined based on the signal-to-noise ratio and implemented to measure the applicability of the knowledge rules to the predicted PPI datasets. Depending on the PPI-predicting method employed, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion: Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially

Energy and protein levels in diets containing phytase for broilers from 22 to 42 days of age: performance and nutrient excretion

Directory of Open Access Journals (Sweden)

Adriano Kaneo Nagata

2011-08-01

Full Text Available This study was conducted to evaluate the influence of different levels of metabolizable energy and crude protein in diets formulated according to the ideal protein concept with phytase supplementation on performance and nutrient excretion of broilers from 22 to 42 days of age. A total of 1,500 Cobb broilers at 22 days of age, with an initial weight of 833 ± 7 g, were distributed in a completely randomized design in a 3 × 3 + 1 factorial scheme composed of three levels of corrected apparent metabolizable energy (2,950; 3,100 and 3,250 kcal/kg), three levels of crude protein (14, 16 and 18%) and a control treatment, totaling ten treatments with six replicates of 25 birds each. All diets, with the exception of the control, were supplemented with phytase. To determine the excretion of pollutants, 180 broilers from the same line at 35 days of age were placed in metabolic cages, with ten treatments, each with six replicates and three birds per experimental unit. The protein and energy levels in diets containing phytase influenced feed intake, weight gain, feed conversion and excretion of nitrogen, phosphorus, calcium, potassium, copper and zinc by the birds. The corrected apparent metabolizable energy level in the diets for broilers in the studied period must be increased up to 3,250 kcal/kg and the levels of crude protein, calcium and phosphorus must be reduced down to 18, 0.70 and 0.31%, respectively, provided that the diets are supplemented with amino acids and phytase, to improve performance and reduce the excretion of pollutants by the birds.
Protein-Protein Interactions Prediction Based on Iterative Clique Extension with Gene Ontology Filtering

Directory of Open Access Journals (Sweden)

Lei Yang

2014-01-01

Full Text Available Cliques (maximal complete subnets) in a protein-protein interaction (PPI) network are an important resource used to analyze protein complexes and functional modules. Clique-based methods of predicting PPIs compensate for the data deficiencies of biological experiments. However, clique-based prediction methods depend only on the topology of the network, and the false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method that combines clique-based prediction with gene ontology (GO) annotations to overcome this shortcoming and improve the accuracy of predictions. According to different GO correcting rules, we generate two predicted interaction sets which guarantee the quality and quantity of predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP), and most of the predicted interactions are verified by another biological database, BioGRID. The predicted protein interactions are appended to the original protein network, which leads to clique extension and shows the significance of biological meaning.
Metabolizable energy of corn hybrids for broiler chickens at different ages

OpenAIRE

Kato,Reinaldo Kanji; Bertechini,Antonio Gilberto; Fassani,Edison José; Brito,Jerônimo Avito Gonçalves de; Castro,Solange de Faria

2011-01-01

We determined the values of apparent metabolizable (AME), apparent corrected (AMEn), true (TME) and true corrected (TMEn) energy of six corn hybrids for broiler chickens in phases 1-7, 8-14, 15-21, 22-28, 29-35 and 36-42 day-old birds, using the substitution method (40%) of reference diet with the test ingredient. Ross-308 male chicks (1,134) were allotted to metabolism cages and the number of birds per experimental unit was adjusted to suit each bird's density stage in the cage, using six re...
Short communication: Prediction of energy requirements of ...

African Journals Online (AJOL)

Data collected on metabolizable energy (ME) intake and growth performance of preruminant female kids of the Murciano-Granadina breed was used to assess the accuracy of the latest U. S. National Research Council (NRC) recommendations to predict their energy requirements. Female kids were fed a milk replacer ...
Application of Machine Learning Approaches for Protein-protein Interactions Prediction.

Science.gov (United States)

Zhang, Mengying; Su, Qiang; Lu, Yi; Zhao, Manman; Niu, Bing

2017-01-01

Proteomics endeavors to study the structures, functions and interactions of proteins. Information on protein-protein interactions (PPIs) helps to improve our knowledge of the functions and the 3D structures of proteins. Thus determining PPIs is essential for the study of proteomics. In this review, in order to study the application of machine learning in predicting PPIs, some machine learning approaches such as support vector machine (SVM), artificial neural networks (ANNs) and random forest (RF) were selected, and examples of their applications to PPIs were listed. SVM and RF are two commonly used methods. Nowadays, more researchers predict PPIs by combining more than two methods. This review presents the application of machine learning approaches in predicting PPIs. Many examples of success in identification and prediction in the area of PPI prediction have been discussed, and PPI research is still in progress.
Energy Levels and Metabolizable Energy: Protein Ratio for Male Broiler Chicks from 22 to 42 Days of Age

Directory of Open Access Journals (Sweden)

José Humberto Vilar da Silva

2001-12-01

Full Text Available Levels of metabolizable energy (ME) of 2,900, 3,100 and 3,300 kcal and energy to protein ratios (ME:CP) of 128, 148, 168 and 188 kcal/%CP were evaluated with male broiler chicks from 22 to 42 days of age, assigned to a completely randomized 3 x 4 factorial design, where each treatment had four replications of 18 birds. Increasing the ME:CP ratio resulted in a linear decrease in feed intake, live weight at 42 days, weight gain, crude protein intake, metabolizable energy intake, carcass weight, breast meat weight and leg (thigh plus drumstick) weight, and linearly increased the abdominal fat percentage in the carcass within each level of energy. The ME:CP ratio of 148 (20.95% CP) at the 3,100 kcal ME level met the broilers' requirements for optimum growth from 22 to 42 days of age, while the ME:CP ratio of 188 was inadequate at all energy levels studied. Because of the increase in feed cost, reducing the ME:CP ratio in practical diets must be evaluated to optimize the production model, in which carcass quality must also be considered.
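The crude protein level behind each treatment follows from dividing the energy level by the ME:CP ratio. The short Python sketch below works through that arithmetic for the 3 x 4 factorial; the function name `crude_protein_percent` and the table layout are illustrative assumptions, not part of the study. It reproduces the 20.95% CP quoted for the ratio of 148 at 3,100 kcal.

```python
def crude_protein_percent(me_kcal: float, me_cp_ratio: float) -> float:
    """Crude protein (%) implied by an ME level and an ME:CP ratio (kcal per % CP)."""
    return me_kcal / me_cp_ratio


ME_LEVELS = (2900, 3100, 3300)        # kcal ME, as listed in the abstract
ME_CP_RATIOS = (128, 148, 168, 188)   # kcal per % CP, as listed in the abstract

# One crude protein value per (energy level, ratio) cell of the 3 x 4 factorial.
cp_table = {
    (me, ratio): round(crude_protein_percent(me, ratio), 2)
    for me in ME_LEVELS
    for ratio in ME_CP_RATIOS
}

if __name__ == "__main__":
    for (me, ratio), cp in cp_table.items():
        print(f"ME {me} kcal, ME:CP {ratio}: {cp}% CP")
```

Higher ratios at a fixed energy level always mean less protein in the diet, which is consistent with the reported drop in protein intake as the ratio increases.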
HomPPI: a class of sequence homology based protein-protein interface prediction methods

Directory of Open Access Journals (Sweden)

Dobbs Drena

2011-06-01

Full Text Available Abstract Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of
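The abstract reports a correlation coefficient, sensitivity and specificity without defining them. The Python sketch below shows how such per-residue scores can be computed from a confusion matrix, under two assumptions: that CC is the Matthews correlation coefficient, and that specificity is the textbook TN / (TN + FP). Some interface-prediction papers instead use "specificity" for TP / (TP + FP), so the authors' own definition may differ. The function name `interface_metrics` is illustrative.

```python
import math


def interface_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity and Matthews CC from residue-level counts.

    tp/fn: interface residues predicted as interface / non-interface.
    tn/fp: non-interface residues predicted as non-interface / interface.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: textbook specificity; some papers report precision here instead.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "cc": cc}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(interface_metrics(tp=8, fp=2, tn=8, fn=2))
```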
Equations of prediction for abdominal fat in brown egg-laying hens fed different diets.

Science.gov (United States)

Souza, C; Jaimes, J J B; Gewehr, C E

2017-06-01

The objective was to use noninvasive measurements to formulate equations for predicting the abdominal fat weight of laying hens. Hens were fed different diets, and the external body measurements of the birds were used as regressors. We used 288 Hy-Line Brown laying hens, distributed in a completely randomized design in a factorial arrangement and submitted for 16 wk to 2 metabolizable energy levels (2,550 and 2,800 kcal/kg) and 3 levels of crude protein in the diet (150, 160, and 170 g/kg), totaling 6 treatments with 48 hens each. Sixteen hens per treatment, at 92 wk of age, were used to evaluate body weight, bird length, tarsus and sternum, greater and lesser diameter of the tarsus, and abdominal fat weight after slaughter. The equations were obtained by using the evaluated measures as regressors through simple and multiple linear regression with the stepwise method of indirect elimination (backward), with P abdominal fat as predicted by the equations and observed values for each bird were subjected to Pearson's correlation analysis. The equations generated by energy levels showed coefficients of determination of 0.50 and 0.74 for 2,800 and 2,550 kcal/kg of metabolizable energy, respectively, with correlation coefficients of 0.71 and 0.84, with a highly significant correlation between the calculated and observed values of abdominal fat. For protein levels of 150, 160, and 170 g/kg in the diet, it was possible to obtain coefficients of determination of 0.75, 0.57, and 0.61, with correlation coefficients of 0.86, 0.75, and 0.78, respectively. Regarding the general equation for predicting abdominal fat weight, the coefficient of determination was 0.62; the correlation coefficient was 0.79. The equations for predicting abdominal fat weight in laying hens, based on external measurements of the birds, showed positive coefficients of determination and correlation coefficients, thus allowing researchers to determine abdominal fat weight in vivo.
Protein Sorting Prediction

DEFF Research Database (Denmark)

Nielsen, Henrik

2017-01-01

Many computational methods are available for predicting protein sorting in bacteria. When comparing them, it is important to know that they can be grouped into three fundamentally different approaches: signal-based, global-property-based and homology-based prediction. In this chapter, the strengths and drawbacks of each of these approaches are described through many examples of methods that predict secretion, integration into membranes, or subcellular locations in general. The aim of this chapter is to provide a user-level introduction to the field with a minimum of computational theory.
Prediction of digestible and metabolizable energy content and standardized ileal amino Acid digestibility in wheat shorts and red dog for growing pigs.

Science.gov (United States)

Huang, Q; Piao, X S; Ren, P; Li, D F

2012-12-01

Two experiments were conducted to evaluate the effects of chemical composition of wheat shorts and red dog on energy and amino acid digestibility in growing pigs and to establish prediction models to estimate their digestible (DE) and metabolizable (ME) energy content as well as their standardized ileal digestible (SID) amino acid content. For Exp. 1, sixteen diets were fed to thirty-two growing pigs according to a completely randomized design during three successive periods. The basal diet was based on corn and soybean meal while the other fifteen diets contained 28.8% wheat shorts (N = 7) or red dog (N = 8), added at the expense of corn and soybean meal. Over the three periods, each diet was fed to six pigs, with each diet being fed to two pigs during each period. The apparent total tract digestibility (ATTD) of energy in wheat shorts and red dog averaged 75.1 and 87.9%, respectively. The DE values of wheat shorts and red dog averaged 13.8 MJ/kg (range 13.1 to 15.0 MJ/kg) and 15.1 MJ/kg (range 13.3 to 16.6 MJ/kg) of dry matter, respectively. For Exp. 2, twelve growing pigs were allotted to two 6×6 Latin Square Designs with six periods. Ten of the diets were formulated based on 60% wheat shorts or red dog and the remaining two diets were nitrogen-free diets based on cornstarch and sucrose. Chromic oxide (0.3%) was used as an indigestible marker in all diets. There were no differences (p>0.05) in SID values for the amino acids in wheat shorts and red dog except for lysine and methionine. Apparent ileal digestibility (AID) and SID values for lysine in different sources of wheat shorts or red dog, which averaged 78.1 and 87.8%, showed more variation than either methionine or tryptophan. A stepwise regression was performed to establish DE, ME and amino acid digestibility prediction models. Data indicated that fiber content and amino acid concentrations were good indicators to predict energy values and amino acid digestibility, respectively. The present study confirms the large
Toxicological relationships between proteins obtained from protein target predictions of large toxicity databases

International Nuclear Information System (INIS)

Nigsch, Florian; Mitchell, John B.O.

2008-01-01

The combination of models for protein target prediction with large databases containing toxicological information for individual molecules allows the derivation of 'toxicological' profiles, i.e., to what extent molecules of known toxicity are predicted to interact with a set of protein targets. To predict protein targets of drug-like and toxic molecules, we built a computational multiclass model using the Winnow algorithm based on a dataset of protein targets derived from the MDL Drug Data Report. A 15-fold Monte Carlo cross-validation using 50% of each class for training, and the remaining 50% for testing, provided an assessment of the accuracy of that model. We retained the 3 top-ranking predictions and found that in 82% of all cases the correct target was predicted within these three predictions. The first prediction was the correct one in almost 70% of cases. A model built on the whole protein target dataset was then used to predict the protein targets for 150 000 molecules from the MDL Toxicity Database. We analysed the frequency of the predictions across the panel of protein targets for experimentally determined toxicity classes of all molecules. This allowed us to identify clusters of proteins related by their toxicological profiles, as well as toxicities that are related. Literature-based evidence is provided for some specific clusters to show the relevance of the relationships identified.
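The 82% and "almost 70%" figures are top-3 and top-1 accuracies: the share of test molecules whose true target appears among the first k ranked predictions. A minimal Python sketch of that measure follows; the function name `top_k_accuracy` and the toy labels are hypothetical and do not come from the study.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   true_labels: Sequence[str],
                   k: int = 3) -> float:
    """Fraction of cases whose true label is among the first k ranked predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


if __name__ == "__main__":
    # Toy data: each inner list is ordered from most to least likely target.
    ranked = [["a", "b", "c", "d"], ["b", "a", "c"], ["d", "c", "b", "a"]]
    truth = ["a", "c", "a"]
    print(top_k_accuracy(ranked, truth, k=3))
    print(top_k_accuracy(ranked, truth, k=1))
```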
Effects of dietary protein level on growth, health and physiological parameters in growing-furring mink

DEFF Research Database (Denmark)

Damgaard, Birthe Marie; Larsen, Peter F.; Clausen, Tove

2012-01-01

The aim of the study was to investigate the effects of the dietary protein level and the feeding strategy on growth, health and physiological blood and liver parameters in growing-furring male mink. Effects of dietary protein levels ranging from 22% of metabolizable energy (MEp) to experimental p...
HKC: An Algorithm to Predict Protein Complexes in Protein-Protein Interaction Networks

Directory of Open Access Journals (Sweden)

Xiaomin Wang

2011-01-01

Full Text Available With the availability of more and more genome-scale protein-protein interaction (PPI) networks, research interest is gradually shifting to the systematic analysis of these large data sets. A key topic is predicting protein complexes in PPI networks by identifying clusters that are densely connected within themselves but sparsely connected with the rest of the network. In this paper, we present a new topology-based algorithm, HKC, to detect protein complexes in genome-scale PPI networks. HKC mainly uses the concepts of highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. The experiments on two data sets and two benchmarks show that our algorithm has a relatively high F-measure and exhibits better performance compared with some other methods.
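The highest k-core named above is the largest k for which a subgraph exists in which every node has at least k neighbours inside that subgraph. The Python sketch below finds it by repeatedly peeling low-degree nodes. It covers only this one concept and is not the HKC algorithm itself: the cohesion measure and the handling of overlapping clusters are not described in the abstract and are left out. The function name and the toy network are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Set, Tuple


def highest_k_core(edges: Iterable[Tuple[Hashable, Hashable]]) -> Tuple[int, Set[Hashable]]:
    """Return (k, nodes) for the highest non-empty k-core of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    best_k, best_nodes = 0, set(nodes)
    k = 1
    while nodes:
        # Peel nodes with fewer than k neighbours among the remaining nodes.
        changed = True
        while changed:
            changed = False
            for n in list(nodes):
                if len(adj[n] & nodes) < k:
                    nodes.discard(n)
                    changed = True
        if nodes:
            best_k, best_nodes = k, set(nodes)
        k += 1  # the (k+1)-core is a subset of the k-core, so keep peeling
    return best_k, best_nodes


if __name__ == "__main__":
    # Toy network: a fully connected group of four proteins plus one pendant protein.
    toy = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
           ("b", "d"), ("c", "d"), ("d", "e")]
    print(highest_k_core(toy))
```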
Molecular spectroscopic features of protein in newly developed chickpea: Relationship with protein chemical profile and metabolism in the rumen and intestine of dairy cows.

Science.gov (United States)

Sun, Baoli; Khan, Nazir Ahmad; Yu, Peiqiang

2018-05-05

The first aim of this study was to investigate the nutritional value of crude protein (CP) in CDC [Crop Development Centre (CDC), University of Saskatchewan] chickpea varieties (Frontier kabuli and Corinne desi) in comparison with a CDC barley variety in terms of: (1) CP chemical profile and subfractions; (2) in situ rumen degradation kinetics and intestinal digestibility of CP; (3) metabolizable protein (MP) supply to dairy cows; and (4) protein molecular structure characteristics using advanced molecular spectroscopy. The second aim was to quantify the relationship between protein molecular spectral characteristics and CP subfractions, in situ rumen CP degradation characteristics, intestinal digestibility of CP, and MP supply to dairy cows. Samples (n=4) of each variety, from two consecutive years, were analyzed. Chickpeas had higher (Pmolecular spectral data of chickpeas can be distinguished from the barley. The two chickpeas did not differ in CP content, and any of the measured in situ degradation and molecular spectral characteristics of protein. The content of RUP was positively (r=0.94, Pmolecular spectroscopy can be used to rapidly characterize feed protein molecular structures and predict their digestibility and nutritive value.
Chemical composition and metabolizable energy of corn hybrids for broilers

Directory of Open Access Journals (Sweden)

Rodrigo de Oliveira Vieira

2007-08-01

Trial 4 consisted of 7 treatments: 6 test diets (corn varieties) and the reference diet. The corn hybrids replaced 40% of the reference diet in all the trials. A completely randomized design with five replicates of five birds per cage was used. The diets and water were offered ad libitum for a 7-day period, with four days for adaptation and three days for excreta collection. A variation of 32% in crude protein (CP) was found (7.79% vs. 11.45%, dry matter basis). The values of gross energy (GE) varied by 5.2%: the highest value observed was 4,668 kcal/kg and the lowest 4,425 kcal/kg. The average value for corrected apparent metabolizable energy (AMEn) was 3,744 kcal/kg, with a variation of 15.15% (3,405 to 4,013 kcal/kg). However, the two hybrids presenting this variation (608 kcal AMEn/kg DM) had similar GE values (3,914 and 3,931 kcal GE/kg of DM; 0.36% variation). This variation in AMEn can possibly be attributed to the coefficient of gross energy metabolization, which was 75% for the hybrid with the lowest AMEn and 88% for the hybrid with the highest AMEn. Although corn is an energy source, evaluating its CP content is important because of the considerable variation in the protein values of the hybrids currently available. The same is valid for the energy values.
MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

Science.gov (United States)

Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

2018-05-08

Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on
New model for predicting energy requirements of children during catch-up growth developed using doubly labeled water

Energy Technology Data Exchange (ETDEWEB)

Fjeld, C R; Schoeller, D A; Brown, K H

1989-05-01

Energy partitioned to maintenance plus activity, tissue synthesis, and storage was measured in 41 children in early recovery (W/L (wt/length) less than 5th percentile) from severe protein-energy malnutrition and in late recovery (W/L = 25th percentile) to determine energy requirements during catch-up growth. Metabolizable energy intake was measured by bomb calorimetry and metabolic collections. Energy expended (means +/- SD) for maintenance and activity estimated by the doubly labeled water method was 97 +/- 12 kcal/kg FFM (fat-free mass) in early recovery and 98 +/- 12 kcal/kg FFM in late recovery (p greater than 0.5). Energy stored was 5-6 kcal/g of wt gain. Tissue synthesis increased energy expenditure by 1 +/- 0.7 kcal/g gain in both early and late recovery. From these data a mathematical model was developed to predict energy requirements for children during catch-up growth as a function of initial body composition and rate and composition of wt gain. The model for predicting metabolizable energy requirements is ((98 x FFM) + A (11.1 B + 2.2 C)), kcal/kg.d, where FFM is fat-free mass expressed as a percentage of body wt, A is wt gain (g/kg.d), B and C are percentage of wt gain/100 as fat and FFM, respectively. The model was tested retrospectively in separate studies of malnourished children.
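The prediction model quoted in the abstract can be evaluated directly. The Python sketch below does so under one labeled assumption: FFM is passed as a fraction of body weight (percentage divided by 100), in the same way as B and C, so that the first term reduces to the reported 98 kcal/kg FFM. The abstract words this as "a percentage", so treat the scaling as an interpretation. The function and argument names are illustrative.

```python
def catch_up_energy_requirement(ffm_fraction: float,
                                gain_g_per_kg_d: float,
                                fat_fraction_of_gain: float,
                                ffm_fraction_of_gain: float) -> float:
    """Metabolizable energy requirement (kcal/kg.d) = (98 x FFM) + A (11.1 B + 2.2 C).

    ffm_fraction         -- FFM: fat-free mass as a fraction of body weight (assumption)
    gain_g_per_kg_d      -- A: weight gain in g/kg.d
    fat_fraction_of_gain -- B: share of the weight gain deposited as fat (0..1)
    ffm_fraction_of_gain -- C: share of the weight gain deposited as FFM (0..1)
    """
    maintenance_and_activity = 98.0 * ffm_fraction
    growth = gain_g_per_kg_d * (11.1 * fat_fraction_of_gain
                                + 2.2 * ffm_fraction_of_gain)
    return maintenance_and_activity + growth


if __name__ == "__main__":
    # Hypothetical child: 80% FFM, gaining 10 g/kg.d, 30% of the gain as fat.
    print(round(catch_up_energy_requirement(0.80, 10.0, 0.30, 0.70), 1))
```

With zero weight gain the growth term vanishes and the estimate reduces to the maintenance-plus-activity term alone.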
Computational prediction of protein hot spot residues.

Science.gov (United States)

Morrow, John Kenneth; Zhang, Shuxing

2012-01-01

Most biological processes involve multiple proteins interacting with each other. It has been recently discovered that certain residues in these protein-protein interactions, which are called hot spots, contribute more significantly to binding affinity than others. Hot spot residues have unique and diverse energetic properties that make them challenging yet important targets in the modulation of protein-protein complexes. Design of therapeutic agents that interact with hot spot residues has proven to be a valid methodology in disrupting unwanted protein-protein interactions. Using biological methods to determine which residues are hot spots can be costly and time consuming. Recent advances in computational approaches to predict hot spots have incorporated a myriad of features, and have shown increasing predictive successes. Here we review the state of knowledge around protein-protein interactions, hot spots, and give an overview of multiple in silico prediction techniques of hot spot residues.
Prediction of protein-protein interactions between viruses and human by an SVM model

Directory of Open Access Journals (Sweden)

Cui Guangyu

2012-05-01

Full Text Available Abstract Background Several computational methods have been developed to predict protein-protein interactions from amino acid sequences, but most of those methods are intended for the interactions within a species rather than for interactions across different species. Methods for predicting interactions between homogeneous proteins are not appropriate for finding those between heterogeneous proteins since they do not distinguish the interactions between proteins of the same species from those of different species. Results We developed a new method for representing a protein sequence of variable length in a frequency vector of fixed length, which encodes the relative frequency of three consecutive amino acids of a sequence. We built a support vector machine (SVM) model to predict human proteins that interact with virus proteins. In two types of viruses, human papillomaviruses (HPV) and hepatitis C virus (HCV), our SVM model achieved an average accuracy above 80%, which is higher than that of another SVM model with a different representation scheme. Using the SVM model and Gene Ontology (GO) annotations of proteins, we predicted new interactions between virus proteins and human proteins. Conclusions Encoding the relative frequency of amino acid triplets of a protein sequence is a simple yet powerful representation method for predicting protein-protein interactions across different species. The representation method has several advantages: (1) it enables a prediction model to achieve a better performance than other representations, (2) it generates feature vectors of fixed length regardless of the sequence length, and (3) the same representation is applicable to different types of proteins.
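The fixed-length representation described above can be illustrated with a short Python sketch. It assumes the plain 20-letter amino acid alphabet, giving 20^3 = 8,000 dimensions, and normalizes by the number of valid triplets. The authors' exact alphabet, grouping and normalization are not given in the abstract, so this is one plausible reading rather than their implementation. All names are illustrative.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: the 20 standard residues
TRIPLETS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
TRIPLET_INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def triplet_frequency_vector(sequence: str) -> List[float]:
    """Relative frequency of each overlapping amino acid triplet in a sequence.

    The output always has len(TRIPLETS) entries, whatever the sequence length.
    Triplets containing non-standard letters (e.g. X) are skipped.
    """
    counts = [0] * len(TRIPLETS)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - 2):
        idx = TRIPLET_INDEX.get(seq[i:i + 3])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [c / total for c in counts]


if __name__ == "__main__":
    vec = triplet_frequency_vector("ACDACD")  # toy sequence
    print(len(vec), vec[TRIPLET_INDEX["ACD"]])
```

Vectors built this way for a virus protein and a human protein could, for example, be concatenated and passed to any SVM implementation; how the authors combined the two proteins of a pair is not stated in the abstract.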
Prediction and characterization of protein-protein interaction networks in swine

Directory of Open Access Journals (Sweden)

Wang Fen

2012-01-01

Full Text Available Abstract Background Studying the large-scale protein-protein interaction (PPI) network is important in understanding biological processes. The current research presents the first PPI map of swine, which aims to give new insights into understanding their biological processes. Results We used three methods to predict porcine protein interactions among 25,767 porcine proteins: Interolog-based prediction, prediction from domain-motif interactions based on structural topology, and prediction from motif-motif interactions based on structural topology. We predicted 20,213, 331,484, and 218,705 porcine PPIs respectively, merged the three results into 567,441 PPIs, constructed four PPI networks, and analyzed the topological properties of the porcine PPI networks. Our predictions were validated with Pfam domain annotations and GO annotations. Averages of 70, 10,495, and 863 interactions were related to the Pfam domain-interacting pairs in the iPfam database. For comparison, randomized networks were generated, and averages of only 4.24, 66.79, and 44.26 interactions were associated with Pfam domain-interacting pairs in the iPfam database. In GO annotations, we found that 52.68%, 75.54%, and 27.20% of the predicted PPIs shared GO terms, respectively. By contrast, none of the 10,000 randomized networks reached 52.68%, 75.54%, or 27.20% of PPI pairs sharing GO terms. Finally, we determined the accuracy and precision of the methods. The methods yielded accuracies of 0.92, 0.53, and 0.50 at precisions of about 0.93, 0.74, and 0.75, respectively. Conclusion The results reveal that the predicted PPI networks are considerably reliable. The present research is an important pioneering work on protein function research. The porcine PPI data set, the confidence score of each interaction and a list of related data are available at (http://pppid.biositemap.com/).

Protein digestibility and metabolizable energy following administration of a liquid red dragon fruit (Hylocereus polyrhizus) additive to female Japanese quail aged 16-50 days

Directory of Open Access Journals (Sweden)

Meina Yuniarti

2015-12-01

Full Text Available Digestibility of crude protein and energy is used to measure digestibility in poultry; it indicates how much of the feed substances is absorbed by the body, which affects the productivity of quail. This experiment was conducted to study the effect of a liquid red dragon fruit (Hylocereus polyrhizus) additive on protein digestibility and metabolizable energy in female quail aged 16-50 days. The experiment used 200 female Japanese quails, 7 weeks of age, with an average body weight of 13.61±0.49 g. The study was conducted in battery cages. The experiment used a Completely Randomized Design with 4 treatments and 5 replications: T0 (control), T1 (liquid red dragon fruit additive given twice a day), T2 (once a day) and T3 (every two days). The dose of the liquid additive was 5 ml/quail. Crude protein digestibility (KcPK) and energy were observed by the total collection method for 3 days; gross energy was measured with a bomb calorimeter and protein was analyzed by the Kjeldahl method. Data were analyzed by analysis of variance (F test) at the 5% level, followed by Duncan's Multiple Range Test (UJBD) where there was a significant treatment effect. The liquid red dragon fruit additive had no significant effect (P>0.05) on crude protein digestibility or apparent metabolizable energy. In conclusion, giving the liquid red dragon fruit additive did not increase the digestibility of crude protein or the apparent metabolizable energy. Keywords: quail, red dragon fruit, digestibility of crude protein
Refining intra-protein contact prediction by graph analysis

Directory of Open Access Journals (Sweden)

Eyal Eran

2007-05-01

Full Text Available Abstract Background Accurate prediction of intra-protein residue contacts from sequence information will allow the prediction of protein structures. Basic predictions of such specific contacts can be further refined by jointly analyzing predicted contacts, and by adding information on the relative positions of contacts in the protein primary sequence. Results We introduce a method for graph analysis refinement of intra-protein contacts, termed GARP. Our previously presented intra-contact prediction method by means of pair-to-pair substitution matrix (P2PConPred was used to test the GARP method. In our approach, the top contact predictions obtained by a basic prediction method were used as edges to create a weighted graph. The edges were scored by a mutual clustering coefficient that identifies highly connected graph regions, and by the density of edges between the sequence regions of the edge nodes. A test set of 57 proteins with known structures was used to determine contacts. GARP improves the accuracy of the P2PConPred basic prediction method in whole proteins from 12% to 18%. Conclusion Using a simple approach we increased the contact prediction accuracy of a basic method by 1.5 times. Our graph approach is simple to implement, can be used with various basic prediction methods, and can provide input for further downstream analyses.
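The edge-scoring idea described above can be sketched in a few lines. This is only an illustration, not the GARP implementation: the abstract does not say which form of mutual clustering coefficient is used, so the Jaccard form (shared neighbors divided by all neighbors of the two end nodes) is an assumption, and the function names `build_adjacency` and `mutual_clustering_jaccard` are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) contact pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def mutual_clustering_jaccard(adj, u, v):
    """Score edge (u, v) by the overlap of the two nodes' neighborhoods.

    Assumption: Jaccard form, |N(u) & N(v)| / |N(u) | N(v)|, with u and v
    themselves excluded from each other's neighborhood.
    """
    nu = adj.get(u, set()) - {v}
    nv = adj.get(v, set()) - {u}
    union = nu | nv
    if not union:
        return 0.0  # neither node has any other neighbor
    return len(nu & nv) / len(union)


# Toy contact graph: residues 1, 2, 3 form a triangle; residue 4 hangs off 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
adj = build_adjacency(edges)
scores = {(u, v): mutual_clustering_jaccard(adj, u, v) for u, v in edges}
```

Edges inside the densely connected triangle score higher than the pendant edge (3, 4), which is the kind of signal a refinement step could use to re-rank predicted contacts.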
Nutritional value and metabolizable energy of wheat by-products used for feeding growing pigs

Directory of Open Access Journals (Sweden)

William Rui Wesendonck

2013-02-01

Full Text Available The objective of this work was to evaluate the nutritional and energy value of wheat by-products in diets for growing pigs, and to obtain prediction equations for metabolizable energy. Thirty-six castrated male pigs housed in individual metabolism cages were used. Total collection of feces and urine was carried out in two ten-day periods: five days for adaptation and five for collection. A randomized block design was used, with the collection period considered as the block, and with six treatments and six replicates. The reference diet was replaced at 30% by one of the tested by-products: farinheta, fine bran, wheat bran, coarse bran, and ground coarse bran; the last was used to evaluate the influence of particle size on digestibility. Crude fiber was the variable that provided the best estimate of metabolizable energy. Fine bran was superior in digestible and metabolizable energy compared with ground coarse bran. Ground coarse bran showed the lowest digestibility coefficients, and reducing its geometric mean diameter did not increase the digestibility of nutrients or energy. Among the by-products evaluated, farinheta had the highest digestible energy, metabolizable energy, and digestible protein, which indicates high potential for use in diets for growing pigs.
A domain-based approach to predict protein-protein interactions

Directory of Open Access Journals (Sweden)

Resat Haluk

2007-06-01

Full Text Available Abstract Background: Knowing which proteins exist in a certain organism or cell type and how these proteins interact with each other is necessary for the understanding of biological processes at the whole cell level. The determination of the protein-protein interaction (PPI) networks has been the subject of extensive research. Despite the development of reasonably successful methods, serious technical difficulties still exist. In this paper we present DomainGA, a quantitative computational approach that uses the information about the domain-domain interactions to predict the interactions between proteins. Results: DomainGA is a multi-parameter optimization method in which the available PPI information is used to derive a quantitative scoring scheme for the domain-domain pairs. Obtained domain interaction scores are then used to predict whether a pair of proteins interacts. Using the yeast PPI data and a series of tests, we show the robustness and insensitivity of the DomainGA method to the selection of the parameter sets, score ranges, and detection rules. Our DomainGA method achieves very high explanation ratios for the positive and negative PPIs in yeast. Based on our cross-verification tests on human PPIs, comparison of the optimized scores with the structurally observed domain interactions obtained from the iPFAM database, and sensitivity and specificity analysis, we conclude that our DomainGA method shows great promise to be applicable across multiple organisms. Conclusion: We envision the DomainGA as a first step of a multiple tier approach to constructing organism specific PPIs. As it is based on fundamental structural information, the DomainGA approach can be used to create potential PPIs and the accuracy of the constructed interaction template can be further improved using complementary methods. Explanation ratios obtained in the reported test case studies clearly show that the false prediction rates of the template networks constructed
Molecular spectroscopic features of protein in newly developed chickpea: Relationship with protein chemical profile and metabolism in the rumen and intestine of dairy cows

Science.gov (United States)

Sun, Baoli; Khan, Nazir Ahmad; Yu, Peiqiang

2018-05-01

The first aim of this study was to investigate the nutritional value of crude protein (CP) in CDC [Crop Development Centre (CDC), University of Saskatchewan] chickpea varieties (Frontier kabuli and Corinne desi) in comparison with a CDC barley variety in terms of: (1) CP chemical profile and subfractions; (2) in situ rumen degradation kinetics and intestinal digestibility of CP; (3) metabolizable protein (MP) supply to dairy cows; and (4) protein molecular structure characteristics using advanced molecular spectroscopy. The second aim was to quantify the relationship between protein molecular spectral characteristics and CP subfractions, in situ rumen CP degradation characteristics, intestinal digestibility of CP, and MP supply to dairy cows. Samples (n = 4) of each variety, from two consecutive years were analyzed. Chickpeas had higher (P content (21.71-22.11 vs 12.96% DM), with higher (P content, and any of the measured in situ degradation and molecular spectral characteristics of protein. The content of RUP was positively (r = 0.94, P content of CP (R2 = 0.91) D-fraction (R2 = 0.82), RDP (R2 = 0.77), RUP (R2 = 0.77), TDP (R2 = 0.98), MP (R2 = 0.80), and FMV (R2 = 0.80) can be predicted from amide II peak height. Despite extensive ruminal degradation, chickpea is a good source of MP for dairy cows, and molecular spectroscopy can be used to rapidly characterize feed protein molecular structures and predict their digestibility and nutritive value.
Calculating the metabolizable energy of macronutrients: a critical review of Atwater's results.

Science.gov (United States)

Sánchez-Peña, M Judith; Márquez-Sandoval, Fabiola; Ramírez-Anguiano, Ana C; Velasco-Ramírez, Sandra F; Macedo-Ojeda, Gabriela; González-Ortiz, Luis J

2017-01-01

The current values for metabolizable energy of macronutrients were proposed in 1910. Since then, however, efforts to revise these values have been practically absent, creating a crucial need to carry out a critical analysis of the experimental methodology and results that form the basis of these values. Presented here is an exhaustive analysis of Atwater's work on this topic, showing evidence of considerable weaknesses that compromise the validity of his results. These weaknesses include the following: (1) the doubtful representativeness of Atwater's subjects, their activity patterns, and their diets; (2) the extremely short duration of the experiments; (3) the uncertainty about which fecal and urinary excretions contain the residues of each ingested food; (4) the uncertainty about whether or not the required nitrogen balance in individuals was reached during experiments; (5) the numerous experiments carried out without valid preliminary experiments; (6) the imprecision affecting Atwater's experimental measurements; and (7) the numerous assumptions and approximations, along with the lack of information, characterizing Atwater's studies. This review presents specific guidelines for establishing new experimental procedures to estimate more precise and/or more accurate values for the metabolizable energy of macronutrients. The importance of estimating these values in light of their possible dependence on certain nutritional parameters and/or physical activity patterns of individuals is emphasized. The use of more precise values would allow better management of the current overweight and obesity epidemic. © The Author(s) 2016. Published by Oxford University Press on behalf of the International Life Sciences Institute. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
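For orientation, the kind of calculation under review can be sketched as follows. The abstract does not list the factor values; the widely cited Atwater general factors of 4, 9, and 4 kcal/g for protein, fat, and carbohydrate are used here as an assumption, and the function name `metabolizable_energy_kcal` is hypothetical.

```python
# Assumed Atwater general factors (kcal per gram); not quoted in the abstract.
ATWATER_KCAL_PER_G = {"protein": 4.0, "fat": 9.0, "carbohydrate": 4.0}


def metabolizable_energy_kcal(grams_by_macronutrient):
    """Sum factor * grams over the macronutrients supplied."""
    unknown = set(grams_by_macronutrient) - set(ATWATER_KCAL_PER_G)
    if unknown:
        raise ValueError(f"no factor defined for: {sorted(unknown)}")
    return sum(
        ATWATER_KCAL_PER_G[name] * grams
        for name, grams in grams_by_macronutrient.items()
    )


# Example: 10 g protein, 5 g fat, 20 g carbohydrate.
meal_kcal = metabolizable_energy_kcal(
    {"protein": 10.0, "fat": 5.0, "carbohydrate": 20.0}
)
```

Because the result is a plain weighted sum, any imprecision in the three factors carries through proportionally to the estimated energy of a diet, which is why the review's call for more precise values matters.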
Protein-protein interaction site predictions with minimum covariance determinant and Mahalanobis distance.

Science.gov (United States)

Qiu, Zhijun; Zhou, Bo; Yuan, Jiangfeng

2017-11-21

Protein-protein interaction site (PPIS) prediction must deal with the diversity of interaction sites, which limits prediction accuracy. Use of proteins with unknown or unidentified interactions can also lead to missing interfaces. Such data errors are often brought into the training dataset. In response to these two problems, we used the minimum covariance determinant (MCD) method, exploiting its ability to remove outliers, to refine the training data and build a predictor with better performance. In order to predict test data in practice, a method based on Mahalanobis distance was devised to select proper test data as input for the predictor. With leave-one-validation and independent test, after the Mahalanobis distance screening, our method achieved higher performance according to the Matthews correlation coefficient (MCC), although only a part of the test data could be predicted. These results indicate that data refinement is an efficient approach to improve protein-protein interaction site prediction. With further optimization of our method, we hope to develop predictors with better performance and a wider range of application. Copyright © 2017 Elsevier Ltd. All rights reserved.
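The screening step rests on the Mahalanobis distance, d(x) = sqrt((x - m)^T S^-1 (x - m)), where m and S are the mean and covariance of the reference data. The sketch below is a minimal two-feature illustration and not the authors' pipeline: it uses the ordinary sample covariance rather than an MCD estimate, and the function name `mahalanobis_2d` is hypothetical.

```python
import math


def mahalanobis_2d(point, samples):
    """Mahalanobis distance of a 2-feature point from a set of 2-feature samples.

    Assumption: classical sample mean and covariance are used here; a robust
    MCD estimate would replace them in the approach the abstract describes.
    """
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # Closed-form inverse of the 2x2 covariance matrix applied to (dx, dy).
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
d_center = mahalanobis_2d((1.0, 1.0), reference)
d_outside = mahalanobis_2d((3.0, 1.0), reference)
```

A test point would be passed to the predictor only if its distance falls below a chosen cutoff; the cutoff itself is a tuning choice the abstract does not specify.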
Deep learning methods for protein torsion angle prediction.

Science.gov (United States)

Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin

2017-09-18

Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We designed four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN) and a deep restricted Boltzmann machine (DRBM), as well as a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of the protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of phi angle is comparable to the existing methods, but the MAE of psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
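Because torsion angles are periodic, an MAE over phi or psi is usually computed on the shorter arc between predicted and true angles. The abstract does not state how the periodicity was handled, so the wraparound rule below is an assumption, and `angular_mae` is a hypothetical helper name.

```python
def angular_mae(predicted_deg, true_deg):
    """Mean absolute error between angles in degrees, using the shorter arc.

    Assumption: 179 degrees and -179 degrees are treated as 2 degrees apart,
    not 358 degrees apart.
    """
    if len(predicted_deg) != len(true_deg):
        raise ValueError("inputs must have the same length")
    if not predicted_deg:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for p, t in zip(predicted_deg, true_deg):
        diff = abs(p - t) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted_deg)


# Two residues whose predictions straddle the +/-180 degree boundary.
mae_wrap = angular_mae([179.0, -179.0], [-179.0, 179.0])
mae_plain = angular_mae([10.0, 100.0], [350.0, 90.0])
```

Without the wraparound step the first example would report an error of 358° per residue, which would badly overstate the error for psi angles near ±180°.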
An Overview of Practical Applications of Protein Disorder Prediction and Drive for Faster, More Accurate Predictions.

Science.gov (United States)

Deng, Xin; Gumm, Jordan; Karki, Suman; Eickholt, Jesse; Cheng, Jianlin

2015-07-07

Protein disordered regions are segments of a protein chain that do not adopt a stable structure. Thus far, a variety of protein disorder prediction methods have been developed and have been widely used, not only in traditional bioinformatics domains, including protein structure prediction, protein structure determination and function annotation, but also in many other biomedical fields. The relationship between intrinsically-disordered proteins and some human diseases has played a significant role in disorder prediction in disease identification and epidemiological investigations. Disordered proteins can also serve as potential targets for drug discovery with an emphasis on the disordered-to-ordered transition in the disordered binding regions, and this has led to substantial research in drug discovery or design based on protein disordered region prediction. Furthermore, protein disorder prediction has also been applied to healthcare by predicting the disease risk of mutations in patients and studying the mechanistic basis of diseases. As the applications of disorder prediction increase, so too does the need to make quick and accurate predictions. To fill this need, we also present a new approach to predict protein residue disorder using wide sequence windows that is applicable on the genomic scale.
An Overview of Practical Applications of Protein Disorder Prediction and Drive for Faster, More Accurate Predictions

Directory of Open Access Journals (Sweden)

Xin Deng

2015-07-01

Full Text Available Protein disordered regions are segments of a protein chain that do not adopt a stable structure. Thus far, a variety of protein disorder prediction methods have been developed and have been widely used, not only in traditional bioinformatics domains, including protein structure prediction, protein structure determination and function annotation, but also in many other biomedical fields. The relationship between intrinsically-disordered proteins and some human diseases has played a significant role in disorder prediction in disease identification and epidemiological investigations. Disordered proteins can also serve as potential targets for drug discovery with an emphasis on the disordered-to-ordered transition in the disordered binding regions, and this has led to substantial research in drug discovery or design based on protein disordered region prediction. Furthermore, protein disorder prediction has also been applied to healthcare by predicting the disease risk of mutations in patients and studying the mechanistic basis of diseases. As the applications of disorder prediction increase, so too does the need to make quick and accurate predictions. To fill this need, we also present a new approach to predict protein residue disorder using wide sequence windows that is applicable on the genomic scale.
Topology of membrane proteins-predictions, limitations and variations.

Science.gov (United States)

Tsirigos, Konstantinos D; Govindarajan, Sudha; Bassot, Claudio; Västermark, Åke; Lamb, John; Shu, Nanjiang; Elofsson, Arne

2017-10-26

Transmembrane proteins perform a variety of important biological functions necessary for the survival and growth of the cells. Membrane proteins are built up by transmembrane segments that span the lipid bilayer. The segments can either be in the form of hydrophobic alpha-helices or beta-sheets which create a barrel. A fundamental aspect of the structure of transmembrane proteins is the membrane topology, that is, the number of transmembrane segments, their position in the protein sequence and their orientation in the membrane. Along these lines, many predictive algorithms for the prediction of the topology of alpha-helical and beta-barrel transmembrane proteins exist. The newest algorithms obtain an accuracy close to 80% both for alpha-helical and beta-barrel transmembrane proteins. However, lately it has been shown that the simplified picture presented when describing a protein family by its topology is limited. To demonstrate this, we highlight examples where the topology is either not conserved in a protein superfamily or where the structure cannot be described solely by the topology of a protein. The prediction of these non-standard features from sequence alone was not successful until the recent revolutionary progress in 3D-structure prediction of proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.
Text mining improves prediction of protein functional sites.

Directory of Open Access Journals (Sweden)

Karin M Verspoor

Full Text Available We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions.
Text Mining Improves Prediction of Protein Functional Sites

Science.gov (United States)

Cohn, Judith D.; Ravikumar, Komandur E.

2012-01-01

We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
Protein subcellular localization prediction using artificial intelligence technology.

Science.gov (United States)

Nair, Rajesh; Rost, Burkhard

2008-01-01

Proteins perform many important tasks in living organisms, such as catalysis of biochemical reactions, transport of nutrients, and recognition and transmission of signals. The plethora of aspects of the role of any particular protein is referred to as its "function." One aspect of protein function that has been the target of intensive research by computational biologists is its subcellular localization. Proteins must be localized in the same subcellular compartment to cooperate toward a common physiological function. Aberrant subcellular localization of proteins can result in several diseases, including kidney stones, cancer, and Alzheimer's disease. To date, sequence homology remains the most widely used method for inferring the function of a protein. However, the application of advanced artificial intelligence (AI)-based techniques in recent years has resulted in significant improvements in our ability to predict the subcellular localization of a protein. The prediction accuracy has risen steadily over the years, in large part due to the application of AI-based methods such as hidden Markov models (HMMs), neural networks (NNs), and support vector machines (SVMs), although the availability of larger experimental datasets has also played a role. Automatic methods that mine textual information from the biological literature and molecular biology databases have considerably sped up the process of annotation for proteins for which some information regarding function is available in the literature. State-of-the-art methods based on NNs and HMMs can predict the presence of N-terminal sorting signals extremely accurately. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. The prediction accuracy of these methods has increased by over 30% in the past decade. The accuracy of these methods is now on par with
Protein-protein interaction site predictions with three-dimensional probability distributions of interacting atoms on protein surfaces.

Directory of Open Access Journals (Sweden)

Ching-Tai Chen

Full Text Available Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted
Protein-Protein Interaction Site Predictions with Three-Dimensional Probability Distributions of Interacting Atoms on Protein Surfaces

Science.gov (United States)

Chen, Ching-Tai; Peng, Hung-Pin; Jian, Jhih-Wei; Tsai, Keng-Chang; Chang, Jeng-Yih; Yang, Ei-Wen; Chen, Jun-Bo; Ho, Shinn-Ying; Hsu, Wen-Lian; Yang, An-Suei

2012-01-01

Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with
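The residue-based measures reported above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts in the example are invented for illustration and are not taken from the study, and `binary_metrics` is a hypothetical helper name.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        # Matthews correlation coefficient; defined as 0 when a margin is empty.
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }


# Illustrative counts only: 6 interface residues found, 2 missed,
# 2 false alarms, 10 non-interface residues correctly rejected.
metrics = binary_metrics(tp=6, fp=2, tn=10, fn=2)
```

MCC is often preferred for interface prediction because interface residues are a minority class, so accuracy alone can look high even for a predictor that rarely flags any residue.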
Predicting and validating protein interactions using network structure.

Directory of Open Access Journals (Sweden)

Pao-Yang Chen

2008-07-01

Full Text Available Protein interactions play a vital part in the function of a cell. As experimental techniques for detection and validation of protein interactions are time consuming, there is a need for computational methods for this task. Protein interactions appear to form a network with a relatively high degree of local clustering. In this paper we exploit this clustering by suggesting a score based on triplets of observed protein interactions. The score utilises both protein characteristics and network properties. Our score based on triplets is shown to complement existing techniques for predicting protein interactions, outperforming them on data sets which display a high degree of clustering. The predicted interactions score highly against test measures for accuracy. Compared to a similar score derived from pairwise interactions only, the triplet score displays higher sensitivity and specificity. By looking at specific examples, we show how an experimental set of interactions can be enriched and validated. As part of this work we also examine the effect of different prior databases upon the accuracy of prediction and find that the interactions from the same kingdom give better results than from across kingdoms, suggesting that there may be fundamental differences between the networks. These results all emphasize that network structure is important and helps in the accurate prediction of protein interactions. The protein interaction data set and the program used in our analysis, and a list of predictions and validations, are available at http://www.stats.ox.ac.uk/bioinfo/resources/PredictingInteractions.
Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization.

Science.gov (United States)

Wang, Hua; Huang, Heng; Ding, Chris; Nie, Feiping

2013-04-01

Protein interactions are central to all the biological processes and structural scaffolds in living organisms, because they orchestrate a number of cellular processes such as metabolic pathways and immunological recognition. Several high-throughput methods, for example, yeast two-hybrid system and mass spectrometry method, can help determine protein interactions, which, however, suffer from high false-positive rates. Moreover, many protein interactions predicted by one method are not supported by another. Therefore, computational methods are necessary and crucial to complete the interactome expeditiously. In this work, we formulate the problem of predicting protein interactions from a new mathematical perspective--sparse matrix completion, and propose a novel nonnegative matrix factorization (NMF)-based matrix completion approach to predict new protein interactions from existing protein interaction networks. Through using manifold regularization, we further develop our method to integrate different biological data sources, such as protein sequences, gene expressions, protein structure information, etc. Extensive experimental results on four species, Saccharomyces cerevisiae, Drosophila melanogaster, Homo sapiens, and Caenorhabditis elegans, have shown that our new methods outperform related state-of-the-art protein interaction prediction methods.
Sequence-based prediction of protein protein interaction using a deep-learning algorithm.

Science.gov (United States)

Sun, Tanlin; Zhou, Bo; Lai, Luhua; Pei, Jianfeng

2017-05-25

Protein-protein interactions (PPIs) are critical for many biological processes. It is therefore important to develop accurate high-throughput methods for identifying PPI to better understand protein function, disease occurrence, and therapy design. Though various computational methods for predicting PPI have been developed, their robustness for prediction with external datasets is unknown. Deep-learning algorithms have achieved successful results in diverse areas, but their effectiveness for PPI prediction has not been tested. We used a stacked autoencoder, a type of deep-learning algorithm, to study the sequence-based PPI prediction. The best model achieved an average accuracy of 97.19% with 10-fold cross-validation. The prediction accuracies for various external datasets ranged from 87.99% to 99.21%, which are superior to those achieved with previous methods. To our knowledge, this research is the first to apply a deep-learning algorithm to sequence-based PPI prediction, and the results demonstrate its potential in this field.
Prediction of protein–protein interactions: unifying evolution and structure at protein interfaces

International Nuclear Information System (INIS)

Tuncbag, Nurcan; Gursoy, Attila; Keskin, Ozlem

2011-01-01

The vast majority of the chores in the living cell involve protein–protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial toward the elucidation of a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite to proceed on the way to complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein–protein interaction prediction approaches. As a sample approach for modeling protein interactions, we detail PRISM, which combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations.

Construction of ontology augmented networks for protein complex prediction.

Science.gov (United States)

Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

2013-01-01

Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources makes it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
Protein and carbohydrate ingredients for dogs and cats (Fontes de proteína e carboidratos para cães e gatos)

Directory of Open Access Journals (Sweden)

Aulus Cavalieri Carciofi

2008-07-01

The dog and cat food market absorbs a substantial quantity of proteins and carbohydrates, yet few studies exist on the digestibility and metabolizable energy of these ingredients. Rice and corn have been considered the best starch sources, but sorghum is shown to be equally well digested by dogs. In interpreting the studies, those that used purified flours or starches must be distinguished from those that used ground ingredients, as used in the manufacture of pet foods. Beyond their digestibility and energy value, starches affect blood glucose in dogs, which makes it worthwhile, for animals in specific conditions, to use carbohydrate sources that lead to lower glucose and insulin responses. Because of the high protein requirement, protein ingredients are important in formulations. Proteins of animal origin show greater variation in chemical composition, quality, and digestibility than those of plant origin. Animal meals may contain excess mineral matter, limiting their inclusion in the formula, whereas plant protein derivatives contain various anti-nutritional factors that must be inactivated during processing. Plant proteins are shown to have good digestibility and metabolizable energy for dogs and cats, and their inclusion is useful for reducing the mineral matter of the diet, controlling the excess of bases in the food, and maintaining adequate product digestibility; in this respect, micronized soybean and 60% corn gluten meal stand out in digestibility and metabolizable energy content. Among dried proteins of animal origin, chicken viscera meal shows the best digestibility and metabolizable energy.
Roles for text mining in protein function prediction.

Science.gov (United States)

Verspoor, Karin M

2014-01-01

The Human Genome Project has provided science with a hugely valuable resource: the blueprints for life; the specification of all of the genes that make up a human. While the genes have all been identified and deciphered, it is proteins that are the workhorses of the human body: they are essential to virtually all cell functions and are the primary mechanism through which biological function is carried out. Hence in order to fully understand what happens at a molecular level in biological organisms, and eventually to enable development of treatments for diseases where some aspect of a biological system goes awry, we must understand the functions of proteins. However, experimental characterization of protein function cannot scale to the vast amount of DNA sequence data now available. Computational protein function prediction has therefore emerged as a problem at the forefront of modern biology (Radivojac et al., Nat Methods 10(13):221-227, 2013). Within the varied approaches to computational protein function prediction that have been explored, there are several that make use of biomedical literature mining. These methods take advantage of information in the published literature to associate specific proteins with specific protein functions. In this chapter, we introduce two main strategies for doing this: association of function terms, represented as Gene Ontology terms (Ashburner et al., Nat Genet 25(1):25-29, 2000), to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.
Models for estimating the metabolizable energy requirements of laying hens (Modelos para estimar as exigências de energia metabolizável para poedeiras)

OpenAIRE

Sakomura,Nilva Kazue; Basaglia,Roberta; Sá-Fortes,Cristina M. L.; Fernandes,João Batista K.

2005-01-01

The objective of this work was to develop a model to estimate the metabolizable energy (ME) requirements of light laying hens of the Lohmann LSL strain, using the factorial method. To determine the effect of temperature on ME requirements for maintenance, experiments were conducted in climatic chambers at constant temperatures of 12, 22 and 31ºC, using the comparative slaughter technique. The net energy requirement for weight gain was determined by means of regress...
EVA: continuous automatic evaluation of protein structure prediction servers.

Science.gov (United States)

Eyrich, V A; Martí-Renom, M A; Przybylski, D; Madhusudhan, M S; Fiser, A; Pazos, F; Valencia, A; Sali, A; Rost, B

2001-12-01

Evaluation of protein structure prediction methods is difficult and time-consuming. Here, we describe EVA, a web server for assessing protein structure prediction methods, in an automated, continuous and large-scale fashion. Currently, EVA evaluates the performance of a variety of prediction methods available through the internet. Every week, the sequences of the latest experimentally determined protein structures are sent to prediction servers, results are collected, performance is evaluated, and a summary is published on the web. EVA has so far collected data for more than 3000 protein chains. These results may provide valuable insight to both developers and users of prediction methods. http://cubic.bioc.columbia.edu/eva. eva@cubic.bioc.columbia.edu
Response of finishing broiler chickens fed three energy/protein ...

African Journals Online (AJOL)

A feeding experiment was conducted to investigate the response of finishing broiler chicken to diets containing three metabolizable energy (ME)/crude protein (CP) combinations ( 3203.76 ME vs 19.90 %CP, 2884.15 ME vs 18.10%CP and 2566.42 ME vs 18.10 %CP) at fixed ME:CP ratio of 160:1. A total of 126 four weeks ...
Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

Directory of Open Access Journals (Sweden)

Huiying Zhao

As more and more protein sequences are uncovered from increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than low-resolution two-state prediction of DNA-binding as most existing techniques do. The method first predicts protein-DNA complex structure by utilizing the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to the human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in APO forms. The method [SPOT-Seq (DNA)] is available as an on-line server at http://sparks-lab.org.
ProteinSplit: splitting of multi-domain proteins using prediction of ordered and disordered regions in protein sequences for virtual structural genomics

International Nuclear Information System (INIS)

Wyrwicz, Lucjan S; Koczyk, Grzegorz; Rychlewski, Leszek; Plewczynski, Dariusz

2007-01-01

The annotation of protein folds within newly sequenced genomes is the main target for semi-automated protein structure prediction (virtual structural genomics). A large number of automated methods have been developed recently with very good results in the case of single-domain proteins. Unfortunately, most of these automated methods often fail to properly predict the distant homology between a given multi-domain protein query and structural templates. Therefore a multi-domain protein should be split into domains in order to overcome this limitation. ProteinSplit is designed to identify protein domain boundaries using a novel algorithm that predicts disordered regions in protein sequences. The software utilizes various sequence characteristics to assess the local propensity of a protein to be disordered or ordered in terms of local structure stability. These disordered parts of a protein are likely to create interdomain spacers. Because of its speed and portability, the method was successfully applied to several genome-wide fold annotation experiments. The user can run an automated analysis of sets of proteins or perform semi-automated multiple user projects (saving the results on the server). Additionally the sequences of predicted domains can be sent to the Bioinfo.PL Protein Structure Prediction Meta-Server for further protein three-dimensional structure and function prediction. The program is freely accessible as a web service at http://lucjan.bioinfo.pl/proteinsplit together with detailed benchmark results on the critical assessment of a fully automated structure prediction (CAFASP) set of sequences. The source code of the local version of protein domain boundary prediction is available upon request from the authors
Models of protein and amino acid requirements for cattle

Directory of Open Access Journals (Sweden)

Luis Orlindo Tedeschi

2015-03-01

Protein supply and requirements by ruminants have been studied for more than a century. These studies led to the accumulation of a large body of scientific information about the digestion and metabolism of protein by ruminants, as well as the characterization of dietary protein in order to maximize animal performance. During the 1980s and 1990s, when computers became more accessible and powerful, scientists began to conceptualize and develop mathematical nutrition models, and to program them into computers to assist with ration balancing and formulation for domesticated ruminants, specifically dairy and beef cattle. The most commonly known nutrition models developed during this period were those of the National Research Council (NRC) in the United States, the Agricultural Research Council (ARC) in the United Kingdom, the Institut National de la Recherche Agronomique (INRA) in France, and the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia. Others were derivative works from these models with different degrees of modification in the supply or requirement calculations and in the modeling nature (e.g., static or dynamic, mechanistic, or deterministic). Circa the 1990s, most models adopted the metabolizable protein (MP) system over the crude protein (CP) and digestible CP systems to estimate supply of MP, and the factorial system to calculate MP required by the animal. The MP system included two portions of protein (i.e., the rumen-undegraded dietary CP, RUP, and the contributions of microbial CP, MCP) as the main sources of MP for the animal. Some models would explicitly account for the impact of dry matter intake (DMI) on the MP required for maintenance (MPm; e.g., Cornell Net Carbohydrate and Protein System, CNCPS; the Dutch system, DVE/OEB), while others would simply account for scurf, urinary, metabolic fecal, and endogenous contributions independently of DMI. All models included milk yield and its components in estimating MP required for lactation
Beef Species Symposium: an assessment of the 1996 Beef NRC: metabolizable protein supply and demand and effectiveness of model performance prediction of beef females within extensive grazing systems.

Science.gov (United States)

Waterman, R C; Caton, J S; Löest, C A; Petersen, M K; Roberts, A J

2014-07-01

Interannual variation of forage quantity and quality driven by precipitation events influence beef livestock production systems within the Southern and Northern Plains and Pacific West, which combined represent 60% (approximately 17.5 million) of the total beef cows in the United States. The beef cattle requirements published by the NRC are an important tool and excellent resource for both professionals and producers to use when implementing feeding practices and nutritional programs within the various production systems. The objectives of this paper include evaluation of the 1996 Beef NRC model in terms of effectiveness in predicting extensive range beef cow performance within arid and semiarid environments using available data sets, identifying model inefficiencies that could be refined to improve the precision of predicting protein supply and demand for range beef cows, and last, providing recommendations for future areas of research. An important addition to the current Beef NRC model would be to allow users to provide region-specific forage characteristics and the ability to describe supplement composition, amount, and delivery frequency. Beef NRC models would then need to be modified to account for the N recycling that occurs throughout a supplementation interval and the impact that this would have on microbial efficiency and microbial protein supply. The Beef NRC should also consider the role of ruminal and postruminal supply and demand of specific limiting AA. Additional considerations should include the partitioning effects of nitrogenous compounds under different physiological production stages (e.g., lactation, pregnancy, and periods of BW loss). The intent of information provided is to aid revision of the Beef NRC by providing supporting material for changes and identifying gaps in existing scientific literature where future research is needed to enhance the predictive precision and application of the Beef NRC models.
PSPP: a protein structure prediction pipeline for computing clusters.

Directory of Open Access Journals (Sweden)

Michael S Lee

2009-07-01

Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform
Blind Test of Physics-Based Prediction of Protein Structures

Science.gov (United States)

Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

2009-01-01

We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130
CNNcon: improved protein contact maps prediction using cascaded neural networks.

Directory of Open Access Journals (Sweden)

Wang Ding

BACKGROUND: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determination of three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than that of determined structures. Protein modeling methods can possibly bridge this huge sequence-structure gap with the development of computational science. A grand challenge is to predict three-dimensional protein structure from its primary structure (residue sequence) alone. However, predicting residue contact maps is a crucial and promising intermediate step towards final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment to structure alignment, which can finally improve template-based three-dimensional protein structure predictors greatly. METHODS: CNNcon, an improved contact map predictor based on multiple neural networks, using six sub-networks and one final cascade-network, was developed in this paper. Both the sub-networks and the final cascade-network were trained and tested with their corresponding data sets. For testing, the target protein was first coded and then input to its corresponding sub-networks for prediction. After that, the intermediate results were input to the cascade-network to finish the final prediction. RESULTS: CNNcon can accurately predict 58.86% of contacts on average at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison results show that the present method performs better than the compared state-of-the-art predictors. In particular, the prediction accuracy stays steady as protein sequence length increases. This indicates that CNNcon overcomes the thin density problem, with which other current predictors have trouble. This advantage makes the method valuable for the prediction of long proteins. As a result, the effective
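To make the notion of a contact map at an 8 Å cutoff concrete, here is a minimal sketch that derives a reference contact map from residue coordinates and scores a predicted map against it. It is not part of CNNcon: the abstract does not say which atom represents each residue or how accuracy was defined, so the single-point-per-residue input, the `min_separation` parameter, and the "fraction of predicted contacts that are true" definition are all assumptions of this example.

```python
import numpy as np

def contact_map(coords, cutoff=8.0, min_separation=1):
    """Boolean residue-residue contact map from one 3D point per residue.

    Two residues are in contact when their distance is below `cutoff`
    (in angstroms) and their sequence separation is at least
    `min_separation` (hypothetical parameter; 1 only drops the diagonal).
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    idx = np.arange(len(coords))
    separated = np.abs(idx[:, None] - idx[None, :]) >= min_separation
    return (dist < cutoff) & separated

def contact_accuracy(predicted, true):
    """Fraction of predicted contacts that are also true contacts."""
    predicted = np.asarray(predicted, dtype=bool)
    true = np.asarray(true, dtype=bool)
    n_pred = predicted.sum()
    return float((predicted & true).sum() / n_pred) if n_pred else 0.0
```

A predictor's output would typically be thresholded into a boolean matrix before being passed to `contact_accuracy`; the threshold choice is left open here.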
On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking

DEFF Research Database (Denmark)

Feliu, Elisenda; Aloy, Patrick; Oliva, Baldo

2011-01-01

Development of effective methods to screen binary interactions obtained by rigid-body protein-protein docking is key for structure prediction of complexes and for elucidating physicochemical principles of protein-protein binding. We have derived empirical knowledge-based potential functions for selecting rigid-body docking poses. These potentials include the energetic component that provides the residues with a particular secondary structure and surface accessibility. These scoring functions have been tested on a state-of-art benchmark dataset and on a decoy dataset of permanent interactions. Our... and with independence of the partner. This information is encoded at the residue level and could be easily incorporated in the initial grid scoring for Fast Fourier Transform rigid-body docking methods.
Protein function prediction involved on radio-resistant bacteria

International Nuclear Information System (INIS)

Mezhoud, Karim; Mankai, Houda; Sghaier, Haitham; Barkallah, Insaf

2009-01-01

Previously, we identified 58 proteins under positive selection in ionizing-radiation-resistant bacteria (IRRB) but absent in all ionizing-radiation-sensitive bacteria (IRSB). There are good reasons to believe that these 58 proteins, together with their interactions with other proteins (interactomes), are part of the answer to the question of how IRRB resist radiation, because knowledge of the interactomes of positively selected orphan proteins in IRRB might allow us to define cellular pathways important to ionizing-radiation resistance. Using the Database of Interacting Proteins and PSIbase, we predicted interactions of orthologs of the 58 proteins under positive selection in IRRB but absent in all IRSB. We integrated experimental data sets with molecular interaction networks and protein structure predictions from databases. Among these, 18 proteins with their interactomes were identified in Deinococcus radiodurans R1. DNA checkpoint and repair, kinase pathways, and energetic and nucleotide metabolism were the important biological processes found. We predicted the interactomes of 58 proteins under positive selection in IRRB. It is hoped that our data will provide new clues as to the cellular pathways that are important for ionizing-radiation resistance. We have identified new proteins involved in DNA management that were not previously mentioned. This is an important addition to the proteins already studied. Further work is still needed to deepen our study of these new proteins.
Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications.

Science.gov (United States)

Kobayashi, Hiroki; Harada, Hiroko; Nakamura, Masaomi; Futamura, Yushi; Ito, Akihiro; Yoshida, Minoru; Iemura, Shun-Ichiro; Shin-Ya, Kazuo; Doi, Takayuki; Takahashi, Takashi; Natsume, Tohru; Imoto, Masaya; Sakakibara, Yasubumi

2012-04-05

Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.
Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications

Directory of Open Access Journals (Sweden)

Kobayashi Hiroki

2012-04-01

Background: Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. Results: We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. Conclusions: This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.
Efficient prediction of human protein-protein interactions at a global scale.

Science.gov (United States)

Schoenrock, Andrew; Samanfar, Bahram; Pitre, Sylvain; Hooshyar, Mohsen; Jin, Ke; Phillips, Charles A; Wang, Hui; Phanse, Sadhna; Omidi, Katayoun; Gui, Yuan; Alamgir, Md; Wong, Alex; Barrenäs, Fredrik; Babu, Mohan; Benson, Mikael; Langston, Michael A; Green, James R; Dehne, Frank; Golshani, Ashkan

2014-12-10

Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments. The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.
Predicting protein-binding RNA nucleotides with consideration of binding partners.

Science.gov (United States)

Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

2015-06-01

In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and a Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, an NPV of 84.2% and an MCC of 0.48 in independent testing. For comparative purposes, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10-fold cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, an NPV of 92.2% and an MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, an NPV of 85.2% and an MCC of 0.45. In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in
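The five measures reported in the abstract above (sensitivity, specificity, PPV, NPV, and MCC) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` is made up for this example, and the abstract does not report the underlying counts, so none are reproduced here.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives, and
    false negatives. Ratios with an empty denominator are returned as 0.0
    (a convention chosen for this sketch).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),  # recall on the positive class
        "specificity": ratio(tn, tn + fp),  # recall on the negative class
        "ppv": ratio(tp, tp + fp),          # precision
        "npv": ratio(tn, tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }
```

Because PPV and NPV depend on the ratio of binding to non-binding nucleotides in the test set, they can differ between cross-validation and independent testing even when sensitivity and specificity are similar.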
Effect of increasing levels of apparent metabolizable energy on laying hens in barn system.

Science.gov (United States)

Kang, Hwan Ku; Park, Seong Bok; Jeon, Jin Joo; Kim, Hyun Soo; Park, Ki Tae; Kim, Sang Ho; Hong, Eui Chul; Kim, Chan Ho

2018-04-12

This experiment investigated the effect of increasing levels of apparent metabolizable energy (AMEn) on the laying performance, egg quality, blood parameters, blood biochemistry, intestinal morphology, and apparent total tract digestibility (ATTD) of energy and nutrients in diets fed to laying hens. A total of 320 33-week-old Hy-Line Brown laying hens (Gallus domesticus) were evenly assigned to four experimental diets of 2,750, 2,850, 2,950, and 3,050 kcal AMEn/kg in floor pens with deep litter of rice hulls. There were four replicates of each treatment, each consisting of 20 birds in a pen. AMEn intake was increased (linear, p ...) ... Feed intake and feed conversion ratio were improved (linear, p ...) ... hen-day egg production tended to increase as the level of AMEn in diets increased. During the experiment, leukocyte concentration and blood biochemistry (total cholesterol, triglyceride, glucose, total protein, calcium, aspartate aminotransferase (AST), and alanine aminotransferase (ALT)) were not influenced by the increasing level of AMEn in diets. Gross energy and ether extract were increased (linear, p ...) ... hens fed the high-AMEn diet (i.e., 3,050 kcal/kg in the current experiment) tended to overconsume energy, with a positive effect on feed intake, feed conversion ratio, nutrient digestibility, and intestinal morphology but not on egg production and egg mass.

Improving protein function prediction methods with integrated literature data

Directory of Open Access Journals (Sweden)

Gabow Aaron P

2008-04-01

Background: Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the involved pathways, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity. Results: We find that including information on the co-occurrence of proteins within an abstract greatly boosts performance in the Functional Flow graph-theoretic function prediction algorithm in yeast, fly and worm. This increase in performance is not simply due to the presence of additional edges, since supplementing protein-protein interactions with co-occurrence data outperforms supplementing with a comparably sized genetic interaction dataset. Through the combination of protein-protein interactions and co-occurrence data, the neighborhood around unknown proteins is quickly connected to well-characterized nodes which global prediction algorithms can exploit. Our method for quantifying co-occurrence reliability shows superior performance to the other methods, particularly at threshold values around 10%, which yield the best trade-off between coverage and accuracy. In contrast, the traditional way of asserting co-occurrence when at least one abstract mentions both proteins proves to be the worst method for generating co-occurrence data, introducing too many false positives. Annotating the functions with greater specificity is harder
Prediction of protein-protein interaction sites in sequences and 3D structures by random forests.

Directory of Open Access Journals (Sweden)

Mile Sikić

2009-01-01

Identifying interaction sites in proteins provides important clues to the function of a protein and is becoming increasingly relevant in topics such as systems biology and drug discovery. Although there are numerous papers on the prediction of interaction sites using information derived from structure, there are only a few case reports on the prediction of interaction residues based solely on protein sequence. Here, a sliding window approach is combined with the Random Forests method to predict protein interaction sites using (i) a combination of sequence- and structure-derived parameters and (ii) sequence information alone. For sequence-based prediction we achieved a precision of 84% with a 26% recall and an F-measure of 40%. When combined with structural information, the prediction performance increases to a precision of 76% and a recall of 38% with an F-measure of 51%. We also present an attempt to rationalize the sliding window size and demonstrate that a nine-residue window is the most suitable for predictor construction. Finally, we demonstrate the applicability of our prediction methods by modeling the Ras-Raf complex using predicted interaction sites as target binding interfaces. Our results suggest that it is possible to predict protein interaction sites with quite a high accuracy using only sequence information.
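
The F-measures quoted in this abstract can be reproduced from the stated precision and recall, and the nine-residue sliding window can be sketched in a few lines. This is an illustrative sketch only: the helper names, the padding character `X`, and the toy sequence are assumptions, and the abstract does not say how the authors handled sequence ends.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def sliding_windows(sequence, size=9, pad="X"):
    """Return one window of `size` residues centred on each position.

    The ends are padded with `pad` so every residue gets a full window;
    this padding scheme is an assumption made for the example.
    """
    half = size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]


sequence_only = f_measure(0.84, 0.26)    # sequence-based figures from the abstract
with_structure = f_measure(0.76, 0.38)   # sequence plus structure
windows = sliding_windows("ACDEFGHIK")   # hypothetical nine-residue sequence
```

Rounded to two decimals, the two F-measures come out at 0.40 and 0.51, matching the 40% and 51% reported above. Each window would then be turned into a feature vector for the classifier.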
Prediction of protein loop geometries in solution

NARCIS (Netherlands)

Rapp, Chaya S.; Strauss, Temima; Nederveen, Aart; Fuentes, Gloria

2007-01-01

The ability to determine the structure of a protein in solution is a critical tool for structural biology, as proteins in their native state are found in aqueous environments. Using a physical chemistry based prediction protocol, we demonstrate the ability to reproduce protein loop geometries in
Dry matter digestibility and metabolizable energy of crude glycerines originated from palm oil using fed rooster assay

Directory of Open Access Journals (Sweden)

Astiari Tia Legawa

2017-07-01

A study was conducted to determine the dry matter digestibility, gross energy (GE), the nitrogen-corrected apparent metabolizable energy (AMEn), and the nitrogen-corrected true metabolizable energy (TMEn) of two crude glycerines from different sources. The first crude glycerine (CG1) was from a large-scale biodiesel producer, with a high content of glycerol (89.49%) and a low content of crude fat (1.73%), whereas the second crude glycerine (CG2) was from a medium-scale biodiesel producer, with a lower content of glycerol than CG1 (38.36%) and a high content of crude fat (23.63%). A fed rooster assay based on Sibbald (1976) was used in the experiment. The experimental feed consisted of ground corn and three levels of crude glycerine (0, 10, and 20%). Twenty-four Hisex Brown roosters were housed in metabolic cages. Roosters were force-fed 30 g of experimental feed after 24 hours of fasting. Excreta collection was performed for two days while the roosters were fasting again. The GE, AMEn, and TMEn values of CG1 were 4065.18, 2926.59, and 3068.73 kcal/kg, and those of CG2 were 5928.09, 4010.11, and 4054.52 kcal/kg, respectively.
Nutritional value and metabolizable energy of wheat by‑products used for feeding growing pigs

OpenAIRE

Wesendonck, Willian Rui; Kessler, Alexandre de Mello; Ribeiro, Andrea Machado Leal; Somensi, Marcelo Luiz; Bockor, Luciane; Dadalt, Julio Cezar; Monteiro, Alessandra Nardina Trícia Rigo; Marx, Fábio Ritter

2013-01-01

The objective of this work was to evaluate the nutritional and energy value of wheat by-products in diets for growing pigs, and to obtain prediction equations for metabolizable energy. Thirty-six castrated male pigs, housed in individual metabolic cages, were used. Total collection of feces and urine was carried out in two ten-day periods: five days for adaptation and five for collection. A randomized block design was used, with the collection period considered as the block, ...
An overview of the prediction of protein DNA-binding sites.

Science.gov (United States)

Si, Jingna; Zhao, Rui; Wu, Rongling

2015-03-06

Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.
Mapping monomeric threading to protein-protein structure prediction.

Science.gov (United States)

Guerler, Aysam; Govindarajoo, Brandon; Zhang, Yang

2013-03-25

The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy SPRING to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognitions. SPRING is tested on 1838 nonhomologous protein complexes which can recognize correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
C-reactive protein, fibrinogen, and cardiovascular disease prediction

DEFF Research Database (Denmark)

Kaptoge, Stephen; Di Angelantonio, Emanuele; Pennells, Lisa

2012-01-01

There is debate about the value of assessing levels of C-reactive protein (CRP) and other biomarkers of inflammation for the prediction of first cardiovascular events...
From nonspecific DNA-protein encounter complexes to the prediction of DNA-protein interactions.

Directory of Open Access Journals (Sweden)

Mu Gao

2009-03-01

DNA-protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA-protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA-protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA-protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA-protein interaction modes exhibit some similarity to specific DNA-protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that it performs satisfactorily on protein models whose root-mean-square Cα deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA-protein interaction mode and nonspecific interaction modes may reflect an important sampling step in the search for specific DNA targets by a DNA-binding protein.
Casein infusion rate influences feed intake differently depending on metabolizable protein balance in dairy cows: A multilevel meta-analysis.

Science.gov (United States)

Martineau, R; Ouellet, D R; Kebreab, E; Lapierre, H

2016-04-01

The effects of casein infusion have been investigated extensively in ruminant species. Its effect on responses in dry matter intake (DMI) has been reviewed and indicated no significant effect. The literature reviewed in the current meta-analysis is more extensive and limited to dairy cows fed ad libitum. A total of 51 studies were included in the meta-analysis and data were fitted to a multilevel model adjusting for the correlated nature of some studies. The effect size was the mean difference calculated by subtracting the means for the control from the casein-infused group. Overall, casein infusion [average of 333 g of dry matter (DM)/d; range: 91 to 1,092 g of DM/d] tended to increase responses in DMI by 0.18 kg/d (n = 48 studies; 3 outliers). However, an interaction was observed between the casein infusion rate (IR) and the initial metabolizable protein (MP) balance [i.e., supply minus requirements (NRC, 2001)]. When control cows were in negative MP balance (n = 27 studies), responses in DMI averaged 0.28 kg/d at mean MP balance (-264 g/d) and casein IR (336 g/d), and a 100 g/d increment in the casein IR from its mean further increased responses by 0.14 kg/d (MP balance being constant), compared with cows not infused with casein. In contrast, when control cows were in positive MP balance (n = 22 studies; 2 outliers), responses in DMI averaged -0.20 kg/d at mean casein IR (339 g/d), and a 100 g/d increment in the casein IR from its mean further decreased responses by 0.33 kg/d, compared with cows not infused with casein. Responses in milk true protein yield at mean casein IR were greater (109 vs. 65 g/d) for cows in negative vs. positive MP balance, respectively, and the influence of the casein IR on responses was significant only for cows in negative MP balance. A 100 g/d increment in the casein IR from its mean further increased responses in milk true protein yield by 25 g/d, compared with cows not infused with casein. Responses in blood urea concentration increased in
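
The DMI figures in this abstract describe a response at the mean casein infusion rate plus a slope per 100 g/d increment. The sketch below works through that arithmetic under two assumptions that the abstract does not confirm: that the relationship is linear over the range of interest, and that MP balance stays at its mean. The function name `dmi_response` is hypothetical.

```python
def dmi_response(casein_ir, mean_ir, response_at_mean, slope_per_100g):
    """Predicted DMI response (kg/d) to casein infused at `casein_ir` g/d.

    Assumes a straight-line relationship centred on the mean infusion
    rate, with MP balance held constant at its mean.
    """
    return response_at_mean + slope_per_100g * (casein_ir - mean_ir) / 100.0


# Cows in negative MP balance: 0.28 kg/d at 336 g/d, +0.14 kg/d per 100 g/d.
negative_balance = dmi_response(436, 336, 0.28, 0.14)

# Cows in positive MP balance: -0.20 kg/d at 339 g/d, -0.33 kg/d per 100 g/d.
positive_balance = dmi_response(439, 339, -0.20, -0.33)
```

Infusing 100 g/d above the mean therefore gives about +0.42 kg/d of DMI for cows in negative MP balance and about -0.53 kg/d for cows in positive MP balance, which is the interaction the abstract describes.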
Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

Science.gov (United States)

Zhou, Hufeng; Gao, Shangzhi; Nguyen, Nam Ninh; Fan, Mengyuan; Jin, Jingjing; Liu, Bing; Zhao, Liang; Xiong, Geng; Tan, Min; Li, Shijun; Wong, Limsoon

2014-04-08

H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs. We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many of related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Also, both host and pathogen proteins involved in host-pathogen PPIs tend to have longer primary sequence, tend to have more domains, tend to be more hydrophilic, etc. And the protein domains from both
PIPE: a protein-protein interaction prediction engine based on the re-occurring short polypeptide sequences between known interacting protein pairs

Directory of Open Access Journals (Sweden)

Greenblatt Jack

2006-07-01

Background: Identification of protein interaction networks has received considerable attention in the post-genomic era. The currently available biochemical approaches used to detect protein-protein interactions are all time and labour intensive. Consequently there is a growing need for the development of computational tools that are capable of effectively identifying such interactions. Results: Here we explain the development and implementation of a novel Protein-Protein Interaction Prediction Engine termed PIPE. This tool is capable of predicting protein-protein interactions for any target pair of the yeast Saccharomyces cerevisiae proteins from their primary structure and without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction with 89% specificity and an overall accuracy of 75%. This rate of success is comparable to those associated with the most commonly used biochemical techniques. Using PIPE, we identified a novel interaction between the YGL227W (vid30) and YMR135C (gid8) yeast proteins. This led us to the identification of a novel yeast complex that here we term the vid30 complex (vid30c). The observed interaction was confirmed by tandem affinity purification (TAP tag), verifying the ability of PIPE to predict novel protein-protein interactions. We then used PIPE analysis to investigate the internal architecture of vid30c. It appeared from PIPE analysis that vid30c may consist of a core and a secondary component. Generation of yeast gene deletion strains combined with TAP tagging analysis indicated that the deletion of a member of the core component interfered with the formation of vid30c; however, deletion of a member of the secondary component had little effect (if any) on the formation of vid30c. Also, PIPE can be used to analyse yeast proteins for which TAP tagging fails, thereby allowing us to predict protein interactions that are not
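
The three PIPE figures (61% sensitivity, 89% specificity, 75% accuracy) are mutually consistent if the evaluation set held equal numbers of interacting and non-interacting pairs. That balance is an assumption made here for illustration; the abstract does not state the composition of the test set. A minimal sketch of the relationship, with a hypothetical function name:

```python
def accuracy_from_rates(sensitivity, specificity, positive_fraction):
    """Overall accuracy implied by sensitivity, specificity and class balance.

    `positive_fraction` is the share of truly interacting pairs in the
    evaluation set; 0.5 means a balanced set (an assumption, see above).
    """
    return (sensitivity * positive_fraction
            + specificity * (1.0 - positive_fraction))


balanced = accuracy_from_rates(0.61, 0.89, positive_fraction=0.5)
# With few true interactions, accuracy is dominated by specificity.
skewed = accuracy_from_rates(0.61, 0.89, positive_fraction=0.1)
```

With a balanced set the implied accuracy is 0.75, matching the reported value; with only 10% positives it would be about 0.86, which shows why accuracy alone is hard to compare across datasets.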
Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

Directory of Open Access Journals (Sweden)

Zheng Sun

2014-01-01

WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1) and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP), encoded by an early virus gene “wsv001” in WSSV, interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA), two integrin beta (ITGB), and one syndecan (SDC). Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.
Predicting co-complexed protein pairs using genomic and proteomic data integration

Directory of Open Access Journals (Sweden)

King Oliver D

2004-04-01

Background: Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H) and affinity purification coupled with mass spectrometry (APMS) have been used to detect interacting proteins on a genomic scale. However, both Y2H and APMS methods have substantial false-positive rates. Aside from high-throughput interaction screens, other gene- or protein-pair characteristics may also be informative of physical interaction. Therefore it is desirable to integrate multiple datasets and utilize their different predictive value for more accurate prediction of co-complexed relationship. Results: Using a supervised machine learning approach (a probabilistic decision tree), we integrated high-throughput protein interaction datasets and other gene- and protein-pair characteristics to predict co-complexed pairs (CCP) of proteins. Our predictions proved more sensitive and specific than predictions based on Y2H or APMS methods alone or in combination. Among the top predictions not annotated as CCPs in our reference set (obtained from the MIPS complex catalogue), a significant fraction was found to physically interact according to a separate database (YPD, Yeast Proteome Database), and the remaining predictions may potentially represent unknown CCPs. Conclusions: We demonstrated that the probabilistic decision tree approach can be successfully used to predict co-complexed protein (CCP) pairs from other characteristics. Our top-scoring CCP predictions provide testable hypotheses for experimental validation.
Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

DEFF Research Database (Denmark)

Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.

2012-01-01

Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly......: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting...... phylogenetic profiles often require very different RT sets to support high prediction accuracy....
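
To illustrate why the choice of reference taxa matters, the sketch below compares two presence/absence profiles under different taxon subsets. It uses Jaccard similarity as a stand-in scoring function; the abstract does not say which measure the authors used, and the profiles, function names, and taxon indices are all hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two binary presence/absence profiles."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def restrict(profile, taxa_indices):
    """Keep only the entries for the selected reference taxa."""
    return [profile[i] for i in taxa_indices]


# Hypothetical profiles over six taxa: 1 = a homolog is present, 0 = absent.
protein_a = [1, 1, 0, 1, 0, 0]
protein_b = [1, 1, 0, 0, 0, 1]

all_taxa = jaccard(protein_a, protein_b)

# A different reference set (taxa 0, 1, 2 and 4) changes the score.
subset = [0, 1, 2, 4]
selected_taxa = jaccard(restrict(protein_a, subset), restrict(protein_b, subset))
```

Here the same protein pair scores 0.5 over all six taxa and 1.0 over the selected four, so a predictor's accuracy depends directly on which taxa are included in the profiles.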
A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein.

Science.gov (United States)

Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L

2014-01-21

Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the yet unknown hot-spot residues for many proteins for which the structure of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
An Overview of the Prediction of Protein DNA-Binding Sites

Directory of Open Access Journals (Sweden)

Jingna Si

2015-03-01

Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.
Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

Directory of Open Access Journals (Sweden)

Vandepoele Klaas

2009-06-01

Background: Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results: In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand, to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion: We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.
Protein (multi-)location prediction: utilizing interdependencies via a generative model.

Science.gov (United States)

Simha, Ramanuja; Briesemeister, Sebastian; Kohlbacher, Oliver; Shatkay, Hagit

2015-06-15

Proteins are responsible for a multitude of vital tasks in all living organisms. Given that a protein's function and role are strongly related to its subcellular location, protein location prediction is an important research area. While proteins move from one location to another and can localize to multiple locations, most existing location prediction systems assign only a single location per protein. A few recent systems attempt to predict multiple locations for proteins; however, their performance leaves much room for improvement. Moreover, such systems do not capture dependencies among locations and usually consider locations as independent. We hypothesize that a multi-location predictor that captures location inter-dependencies can improve location predictions for proteins. We introduce a probabilistic generative model for protein localization, and develop a system based on it, which we call MDLoc, that utilizes inter-dependencies among locations to predict multiple locations for proteins. The model captures location inter-dependencies using Bayesian networks and represents dependency between features and locations using a mixture model. We use iterative processes for learning model parameters and for estimating protein locations. We evaluate our classifier, MDLoc, on a dataset of single- and multi-localized proteins derived from the DBMLoc dataset, which is the most comprehensive protein multi-localization dataset currently available. Our results, obtained by using MDLoc, significantly improve upon results obtained by an initial simpler classifier, as well as on results reported by other top systems. MDLoc is available at: http://www.eecis.udel.edu/~compbio/mdloc. © The Author 2015. Published by Oxford University Press.
Boosting compound-protein interaction prediction by deep learning.

Science.gov (United States)

Tian, Kai; Shao, Mingyu; Wang, Yang; Guan, Jihong; Zhou, Shuigeng

2016-11-01

The identification of interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. However, experimentally identifying compound-protein interactions (CPIs) is generally expensive and time-consuming, so computational approaches have been introduced. Among these, machine-learning based methods have achieved considerable success. However, due to the nonlinear and imbalanced nature of biological data, many machine learning approaches have their own limitations. Recently, deep learning techniques have shown advantages over many state-of-the-art machine learning methods in some applications. In this study, we aim at improving the performance of CPI prediction based on deep learning, and propose a method called DL-CPI (the abbreviation of Deep Learning for Compound-Protein Interactions prediction), which employs a deep neural network (DNN) to effectively learn the representations of compound-protein pairs. Extensive experiments show that DL-CPI can learn useful features of compound-protein pairs by a layerwise abstraction, and thus achieves better prediction performance than existing methods on both balanced and imbalanced datasets. Copyright © 2016 Elsevier Inc. All rights reserved.

Comparing human-Salmonella with plant-Salmonella protein-protein interaction predictions

Directory of Open Access Journals (Sweden)

Sylvia Schleker

2015-01-01

Full Text Available Salmonellosis is the most frequent food-borne disease world-wide and can be transmitted to humans by a variety of routes, especially via animal and plant products. Salmonella bacteria are believed to use not only animal and human but also plant hosts despite their evolutionary distance. This raises the question of whether Salmonella employs similar mechanisms in infection of these diverse hosts. Given that most of our understanding comes from its interaction with human hosts, we investigate here to what degree knowledge of Salmonella-human interactions can be transferred to the Salmonella-plant system. Reviewed are recent publications on analysis and prediction of Salmonella-host interactomes. Putative protein-protein interactions (PPIs) between Salmonella and its human and Arabidopsis hosts were retrieved utilizing purely interolog-based approaches in which predictions were inferred based on available sequence and domain information of known PPIs, and machine learning approaches that integrate a larger set of useful information from different sources. Transfer learning is an especially suitable machine learning technique to predict plant host targets from the knowledge of human host targets. A comparison of the prediction results with transcriptomic data shows a clear overlap between the host proteins predicted to be targeted by PPIs and their gene ontology enrichment in both host species and regulation of gene expression. In particular, the cellular processes Salmonella interferes with in plants and humans are catabolic processes. The details of how these processes are targeted, however, are quite different between the two organisms, as expected based on their evolutionary and habitat differences. Possible implications of this observation on evolution of host-pathogen communication are discussed.
Prediction of protein hydration sites from sequence by modular neural networks

DEFF Research Database (Denmark)

Ehrlich, L.; Reczko, M.; Bohr, Henrik

1998-01-01

The hydration properties of a protein are important determinants of its structure and function. Here, modular neural networks are employed to predict ordered hydration sites using protein sequence information. First, secondary structure and solvent accessibility are predicted from sequence with two...... separate neural networks. These predictions are used as input together with protein sequences for networks predicting hydration of residues, backbone atoms and sidechains. These networks are trained with protein crystal structures. The prediction of hydration is improved by adding information on secondary...... structure and solvent accessibility and, using actual values of these properties, residue hydration can be predicted to 77% accuracy with a Matthews coefficient of 0.43. However, predicted property data with an accuracy of 60-70% result in less than half the improvement in predictive performance observed...
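The Matthews coefficient quoted above condenses a binary confusion matrix into a single value between -1 and 1. As a minimal sketch of how such a value is obtained (the function name and the example counts are hypothetical and are not taken from the paper):

```python
import math


def matthews_coefficient(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for hydrated (positive) vs. non-hydrated residues.
example = matthews_coefficient(tp=40, tn=45, fp=5, fn=10)
print(round(example, 3))
```

Unlike plain accuracy, the coefficient stays near zero for a predictor that always outputs the majority class, which may be why it is reported alongside the accuracy figure.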
Prediction of heterodimeric protein complexes from weighted protein-protein interaction networks using novel features and kernel functions.

Directory of Open Access Journals (Sweden)

Peiying Ruan

Full Text Available Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many prediction methods for protein complexes from protein-protein interactions have been developed such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt with only complexes with size of more than three because the methods often are based on some density of subgraphs. However, heterodimeric protein complexes that consist of two distinct proteins occupy a large part according to several comprehensive databases of known complexes. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.
Exploration of the dynamic properties of protein complexes predicted from spatially constrained protein-protein interaction networks.

Directory of Open Access Journals (Sweden)

Eric A Yen

2014-05-01

Full Text Available Protein complexes are not static, but rather highly dynamic with subunits that undergo 1-dimensional diffusion with respect to each other. Interactions within protein complexes are modulated through regulatory inputs that alter interactions and introduce new components and deplete existing components through exchange. While it is clear that the structure and function of any given protein complex is coupled to its dynamical properties, it remains a challenge to predict the possible conformations that complexes can adopt. Protein-fragment Complementation Assays detect physical interactions between protein pairs constrained to ≤8 nm from each other in living cells. This method has been used to build networks composed of 1000s of pair-wise interactions. Significantly, these networks contain a wealth of dynamic information, as the assay is fully reversible and the proteins are expressed in their natural context. In this study, we describe a method that extracts this valuable information in the form of predicted conformations, allowing the user to explore the conformational landscape, to search for structures that correlate with an activity state, and to estimate the abundance of conformations in the living cell. The generator is based on a Markov Chain Monte Carlo simulation that uses the interaction dataset as input and is constrained by the physical resolution of the assay. We applied this method to an 18-member protein complex composed of the seven core proteins of the budding yeast Arp2/3 complex and 11 associated regulators and effector proteins. We generated 20,480 output structures and identified conformational states using principal component analysis. We interrogated the conformation landscape and found evidence of symmetry breaking, a mixture of likely active and inactive conformational states and dynamic exchange of the core protein Arc15 between core and regulatory components. Our method provides a novel tool for prediction and
Improving N-terminal protein annotation of Plasmodium species based on signal peptide prediction of orthologous proteins

Directory of Open Access Journals (Sweden)

Neto Armando

2012-11-01

Full Text Available Abstract Background The signal peptide is one of the most important motifs involved in protein trafficking and it ultimately influences protein function. Considering the expected functional conservation among orthologs it was hypothesized that divergence in signal peptides within orthologous groups is mainly due to N-terminal protein sequence misannotation. Thus, discrepancies in signal peptide prediction of orthologous proteins were used to identify misannotated proteins in five Plasmodium species. Methods Signal peptide (SignalP) and orthology (OrthoMCL) were combined in an innovative strategy to identify orthologous groups showing discrepancies in signal peptide prediction among their protein members (Mixed groups). In a comparative analysis, multiple alignments for each of these groups and gene models were visually inspected in search of misannotated proteins and, whenever possible, alternative gene models were proposed. Thresholds for signal peptide prediction parameters were also modified to reduce their impact as a possible source of discrepancy among orthologs. Validation of new gene models was based on RT-PCR (few examples) or on experimental evidence already published (ApiLoc). Results The rate of misannotated proteins was significantly higher in Mixed groups than in Positive or Negative groups, corroborating the proposed hypothesis. A total of 478 proteins were reannotated and change of signal peptide prediction from negative to positive was the most common. Reannotations triggered the conversion of almost 50% of all Mixed groups, which were further reduced by optimization of signal peptide prediction parameters. Conclusions The methodological novelty proposed here combining orthology and signal peptide prediction proved to be an effective strategy for the identification of proteins with wrongly annotated N-terminal sequences, and it might have an important impact in the available data for genome-wide searching of potential vaccine and drug
Protein (multi-)location prediction: utilizing interdependencies via a generative model

Science.gov (United States)

Shatkay, Hagit

2015-01-01

Motivation: Proteins are responsible for a multitude of vital tasks in all living organisms. Given that a protein’s function and role are strongly related to its subcellular location, protein location prediction is an important research area. While proteins move from one location to another and can localize to multiple locations, most existing location prediction systems assign only a single location per protein. A few recent systems attempt to predict multiple locations for proteins, however, their performance leaves much room for improvement. Moreover, such systems do not capture dependencies among locations and usually consider locations as independent. We hypothesize that a multi-location predictor that captures location inter-dependencies can improve location predictions for proteins. Results: We introduce a probabilistic generative model for protein localization, and develop a system based on it—which we call MDLoc—that utilizes inter-dependencies among locations to predict multiple locations for proteins. The model captures location inter-dependencies using Bayesian networks and represents dependency between features and locations using a mixture model. We use iterative processes for learning model parameters and for estimating protein locations. We evaluate our classifier MDLoc, on a dataset of single- and multi-localized proteins derived from the DBMLoc dataset, which is the most comprehensive protein multi-localization dataset currently available. Our results, obtained by using MDLoc, significantly improve upon results obtained by an initial simpler classifier, as well as on results reported by other top systems. Availability and implementation: MDLoc is available at: http://www.eecis.udel.edu/∼compbio/mdloc. Contact: shatkay@udel.edu. PMID:26072505
Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.

Science.gov (United States)

Várnai, Csilla; Burkoff, Nikolas S; Wild, David L

2017-01-01

Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate, that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely improve interface prediction also for complexes with large and good-quality MSAs.
SitesIdentify: a protein functional site prediction tool

Directory of Open Access Journals (Sweden)

Doig Andrew J

2009-11-01

Full Text Available Abstract Background The rate of protein structures being deposited in the Protein Data Bank surpasses the capacity to experimentally characterise them and therefore computational methods to analyse these structures have become increasingly important. Identifying the region of the protein most likely to be involved in function is useful in order to gain information about its potential role. There are many available approaches to predict functional sites, but many are not made available via a publicly-accessible application. Results Here we present a functional site prediction tool (SitesIdentify), based on combining sequence conservation information with geometry-based cleft identification, that is freely available via a web-server. We have shown that SitesIdentify compares favourably to other functional site prediction tools in a comparison of seven methods on a non-redundant set of 237 enzymes with annotated active sites. Conclusion SitesIdentify is able to produce comparable accuracy in predicting functional sites to its closest available counterpart, but in addition achieves improved accuracy for proteins with few characterised homologues. SitesIdentify is available via a webserver at http://www.manchester.ac.uk/bioinformatics/sitesidentify/
Metabolizable energy intake of client-owned adult dogs.

Science.gov (United States)

Thes, M; Koeber, N; Fritz, J; Wendel, F; Dillitzer, N; Dobenecker, B; Kienzle, E

2016-10-01

A post hoc analysis of the metabolizable energy (ME) intake of privately owned pet dogs from the authors' nutrition consultation practice (Years 2007-2011) was carried out to identify if current ME recommendations are suitable for pet dogs. Data on 586 adult dogs were available (median age 5.5, median deviation from ideal weight 0.0), 55 of them were healthy; the others had various diseases. For ration calculation, a standardized questionnaire and the software diet-check Munich™ was used. ME was predicted according to NRC (2006). Data were evaluated for the factors disease, breed, size, age, gender and type of feeding. The mean ME intake of all adult dogs amounted to 0.410 ± 0.121 MJ/kg metabolic body weight (BW^0.75) (n = 586). There was no effect of size and disease. Overweight dogs ate 0.360 ± 0.121 MJ/kg BW^0.75, and underweight dogs ate 0.494 ± 0.159 MJ/kg BW^0.75. Older dogs (>7 years, n = 149, 0.389 ± 0.105 MJ/kg BW^0.75) had a lower ME intake than younger ones (n = 313, 0.419 ± 0.121 MJ/kg BW^0.75), and intact males had a higher ME intake than the others (p Hounds, German Boxers, English foxhounds, Rhodesian Ridgebacks and Flat-Coated Retrievers with a mean ME intake of 0.473 ± 0.121 MJ/kg BW^0.75. The following breeds were below average: Dachshunds, Bichons, West highland White Terrier, Collies except Bearded Collies, Airedale Terriers, American Staffordshire terriers and Golden Retrievers with a mean ME intake of 0.343 ± 0.096 MJ/kg BW^0.75. The mean maintenance energy requirements of pet dogs are similar to that of kennel dogs which do not exercise very much. These results suggest that opportunity and stimulus to exercise provided for pet dogs are lower than for kennel dogs. Lower activity in pet dogs may reduce part of potential effects of breed, medical history and age groups. Journal of Animal Physiology and Animal Nutrition © 2016 Blackwell Verlag GmbH.
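Because the intakes above are expressed per kilogram of metabolic body weight (body weight in kg raised to the power 0.75), turning them into a daily figure for one animal takes a single extra step. The following is a minimal sketch; the function names and the 20 kg example dog are hypothetical, and only the 0.410 MJ/kg mean comes from the abstract:

```python
def metabolic_body_weight(body_weight_kg, exponent=0.75):
    """Metabolic body weight: body weight (kg) raised to the given exponent."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg ** exponent


def daily_me_intake_mj(body_weight_kg, me_per_kg_mbw=0.410):
    """Daily ME intake in MJ, given an intake per kg of metabolic body weight.

    The default of 0.410 MJ/kg is the mean reported for all adult dogs.
    """
    return me_per_kg_mbw * metabolic_body_weight(body_weight_kg)


# Hypothetical 20 kg dog at the reported mean intake.
print(round(daily_me_intake_mj(20.0), 2))
```

Scaling by metabolic rather than actual body weight is what lets dogs of very different sizes be compared on one scale, which fits the abstract's finding that size had no effect.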
Prediction of methyl-side Chain Dynamics in Proteins

International Nuclear Information System (INIS)

Ming Dengming; Brueschweiler, Rafael

2004-01-01

A simple analytical model is presented for the prediction of methyl-side chain dynamics in comparison with S² order parameters obtained by NMR relaxation spectroscopy. The model, which is an extension of the local contact model for backbone order parameter prediction, uses a static 3D protein structure as input. It expresses the methyl-group S² order parameters as a function of local contacts of the methyl carbon with respect to the neighboring atoms in combination with the number of consecutive mobile dihedral angles between the methyl group and the protein backbone. For six out of seven proteins the prediction results are good when compared with experimentally determined methyl-group S² values with an average correlation coefficient r̄ = 0.65 ± 0.14. For the unusually rigid cytochrome c2 no significant correlation between prediction and experiment is found. The presented model provides independent support for the reliability of current side-chain relaxation methods along with their interpretation by the model-free formalism.
Utilizing knowledge base of amino acids structural neighborhoods to predict protein-protein interaction sites.

Science.gov (United States)

Jelínek, Jan; Škoda, Petr; Hoksza, David

2017-12-06

Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.
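The labelling rule described above, where a residue is called interface or non-interface according to how often its encoded neighborhood was seen with each label, can be illustrated with a toy frequency table. This is only a sketch under stated assumptions: the class name, the short bit strings and the majority-vote threshold are hypothetical and do not reproduce the INSPiRE implementation or its encoding.

```python
from collections import defaultdict


class NeighborhoodKnowledgeBase:
    """Toy knowledge base: counts of labels seen for each encoded neighborhood."""

    def __init__(self):
        # fingerprint -> [interface count, non-interface count]
        self._counts = defaultdict(lambda: [0, 0])

    def add(self, fingerprint, is_interface):
        """Record one observation of a neighborhood with a known label."""
        self._counts[fingerprint][0 if is_interface else 1] += 1

    def predict(self, fingerprint, default=False):
        """Label as interface if that label was seen more often (assumed rule)."""
        if fingerprint not in self._counts:
            return default  # neighborhood never observed
        interface, non_interface = self._counts[fingerprint]
        return interface > non_interface


kb = NeighborhoodKnowledgeBase()
# Hypothetical bit-string fingerprints with known labels.
for fingerprint, label in [("1011", True), ("1011", True),
                           ("1011", False), ("0100", False)]:
    kb.add(fingerprint, label)

print(kb.predict("1011"), kb.predict("0100"), kb.predict("1111"))
```

A lookup of this kind makes every prediction traceable to counted examples, at the cost of needing a fallback for neighborhoods that were never seen.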
Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation

Directory of Open Access Journals (Sweden)

Wang Yong

2011-10-01

Full Text Available Abstract Background With the development of genome-sequencing technologies, protein sequences are readily obtained by translating the measured mRNAs. Therefore predicting protein-protein interactions from the sequences is of great demand. The reason lies in the fact that identifying protein-protein interactions is becoming a bottleneck for eventually understanding the functions of proteins, especially for those organisms barely characterized. Although a few methods have been proposed, the converse problem, if the features used extract sufficient and unbiased information from protein sequences, is almost untouched. Results In this study, we interrogate this problem theoretically by an optimization scheme. Motivated by the theoretical investigation, we find novel encoding methods for both protein sequences and protein pairs. Our new methods exploit sufficiently the information of protein sequences and reduce artificial bias and computational cost. Thus, it significantly outperforms the available methods regarding sensitivity, specificity, precision, and recall with cross-validation evaluation and reaches ~80% and ~90% accuracy in Escherichia coli and Saccharomyces cerevisiae respectively. Our findings here hold important implication for other sequence-based prediction tasks because representation of biological sequence is always the first step in computational biology. Conclusions By considering the converse problem, we propose new representation methods for both protein sequences and protein pairs. The results show that our method significantly improves the accuracy of protein-protein interaction predictions.
Protein structure prediction using bee colony optimization metaheuristic

DEFF Research Database (Denmark)

Fonseca, Rasmus; Paluszewski, Martin; Winter, Pawel

2010-01-01

Predicting the native structure of proteins is one of the most challenging problems in molecular biology. The goal is to determine the three-dimensional structure from the one-dimensional amino acid sequence. De novo prediction algorithms seek to do this by developing a representation of the protein's structure, an energy potential and some optimization algorithm that finds the structure with minimal energy. Bee Colony Optimization (BCO) is a relatively new approach to solving optimization problems based on the foraging behaviour of bees. Several variants of BCO have been suggested... our BCO method to generate good solutions to the protein structure prediction problem. The results show that BCO generally finds better solutions than simulated annealing which so far has been the metaheuristic of choice for this problem....
DomPep--a general method for predicting modular domain-mediated protein-protein interactions.

Directory of Open Access Journals (Sweden)

Lei Li

Full Text Available Protein-protein interactions (PPIs) are frequently mediated by the binding of a modular domain in one protein to a short, linear peptide motif in its partner. The advent of proteomic methods such as peptide and protein arrays has led to the accumulation of a wealth of interaction data for modular interaction domains. Although several computational programs have been developed to predict modular domain-mediated PPI events, they are often restricted to a given domain type. We describe DomPep, a method that can potentially be used to predict PPIs mediated by any modular domains. DomPep combines proteomic data with sequence information to achieve high accuracy and high coverage in PPI prediction. Proteomic binding data were employed to determine a simple yet novel parameter Ligand-Binding Similarity which, in turn, is used to calibrate Domain Sequence Identity and Position-Weighted-Matrix distance, two parameters that are used in constructing prediction models. Moreover, DomPep can be used to predict PPIs for both domains with experimental binding data and those without. Using the PDZ and SH2 domain families as test cases, we show that DomPep can predict PPIs with accuracies superior to existing methods. To evaluate DomPep as a discovery tool, we deployed DomPep to identify interactions mediated by three human PDZ domains. Subsequent in-solution binding assays validated the high accuracy of DomPep in predicting authentic PPIs at the proteome scale. Because DomPep makes use of only interaction data and the primary sequence of a domain, it can be readily expanded to include other types of modular domains.
Preclinical models used for immunogenicity prediction of therapeutic proteins.

Science.gov (United States)

Brinks, Vera; Weinbuch, Daniel; Baker, Matthew; Dean, Yann; Stas, Philippe; Kostense, Stefan; Rup, Bonita; Jiskoot, Wim

2013-07-01

All therapeutic proteins are potentially immunogenic. Antibodies formed against these drugs can decrease efficacy, leading to drastically increased therapeutic costs and in rare cases to serious and sometimes life threatening side-effects. Many efforts are therefore undertaken to develop therapeutic proteins with minimal immunogenicity. For this, immunogenicity prediction of candidate drugs during early drug development is essential. Several in silico, in vitro and in vivo models are used to predict immunogenicity of drug leads, to modify potentially immunogenic properties and to continue development of drug candidates with expected low immunogenicity. Despite the extensive use of these predictive models, their actual predictive value varies. Important reasons for this uncertainty are the limited/insufficient knowledge on the immune mechanisms underlying immunogenicity of therapeutic proteins, the fact that different predictive models explore different components of the immune system and the lack of an integrated clinical validation. In this review, we discuss the predictive models in use, summarize aspects of immunogenicity that these models predict and explore the merits and the limitations of each of the models.
Predicting turns in proteins with a unified model.

Directory of Open Access Journals (Sweden)

Qi Song

Full Text Available MOTIVATION: Turns are a critical element of the structure of a protein; turns play a crucial role in loops, folds, and interactions. Current prediction methods are well developed for the prediction of individual turn types, including α-turn, β-turn, and γ-turn, etc. However, for further protein structure and function prediction it is necessary to develop a uniform model that can accurately predict all types of turns simultaneously. RESULTS: In this study, we present a novel approach, TurnP, which offers the ability to investigate all the turns in a protein based on a unified model. The main characteristics of TurnP are: (i) using newly exploited features of structural evolution information (secondary structure and shape string of protein) based on structure homologies, (ii) considering all types of turns in a unified model, and (iii) practical capability of accurate prediction of all turns simultaneously for a query. TurnP utilizes predicted secondary structures and predicted shape strings, both of which have greater accuracy, based on innovative technologies which were both developed by our group. Then, sequence and structural evolution features, which are profile of sequence, profile of secondary structures and profile of shape strings, are generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, we achieved an accuracy of 88.8% and a sensitivity of 71.8%, which exceeded the most state-of-the-art predictors for particular turn types. Newly determined sequences, the EVA and CASP9 datasets were used as independent tests and the results we achieved were outstanding for turn predictions and confirmed the good performance of TurnP for practical applications.
Prediction and Dissection of Protein-RNA Interactions by Molecular Descriptors.

Science.gov (United States)

Liu, Zhi-Ping; Chen, Luonan

2016-01-01

Protein-RNA interactions play crucial roles in numerous biological processes. However, detecting the interactions and binding sites between protein and RNA by traditional experiments is still time-consuming and labor-intensive. Thus, it is important to develop bioinformatics methods for predicting protein-RNA interactions and binding sites. Accurate prediction of protein-RNA interactions and recognition will greatly help to decipher the interaction mechanisms between protein and RNA, as well as to improve RNA-related protein engineering and drug design. In this work, we summarize the current bioinformatics strategies for predicting protein-RNA interactions and dissecting protein-RNA interaction mechanisms from local structure binding motifs. In particular, we focus on the feature-based machine learning methods, in which the molecular descriptors of protein and RNA are extracted and integrated as feature vectors representing the interaction events and recognition residues. In addition, the available methods are classified and compared comprehensively. The molecular descriptors are expected to elucidate the binding mechanisms of protein-RNA interaction and reveal the functional implications from a structural complementarity perspective.
ENERGY AND PROTEIN REQUIREMENTS OF GROWING PELIBUEY SHEEP UNDER TROPICAL CONDITIONS ESTIMATED FROM A LITERATURE DATABASE ANALYSES

Directory of Open Access Journals (Sweden)

Fernando Duarte

2012-01-01

Full Text Available Data from previous studies were used to estimate the metabolizable energy and protein requirements for maintenance and growth and the basal metabolism energy requirement of male Pelibuey sheep under tropical conditions. In addition, empty body weight and mature weight of male and female Pelibuey sheep were also estimated. Basal metabolism energy requirements were estimated with the Cornell Net Carbohydrate and Protein System - Sheep (CNCPS-S) model using the a1 factor of the maintenance equation. Mature weight was estimated to be 69 kg for males and 45 kg for females. Empty body weight was estimated to be 81% of live weight. Metabolizable energy and protein requirements for growth were 0.106 Mcal MEm/kg LW^0.75 and 2.4 g MP/kg LW^0.75 for males. The collected information did not allow appropriate estimation of female requirements. The basal metabolism energy requirement was estimated to be 0.039 Mcal MEm/kg LW^0.75. Energy requirements for basal metabolism were lower in Pelibuey sheep than those reported for wool breeds even though their total requirements were similar.
Scoring protein relationships in functional interaction networks predicted from sequence data.

Directory of Open Access Journals (Sweden)

Gaston K Mazandu

Full Text Available The abundance of diverse biological data from various sources constitutes a rich source of knowledge, which has the power to advance our understanding of organisms. This requires computational methods in order to integrate and exploit these data effectively and elucidate local and genome wide functional connections between protein pairs, thus enabling functional inferences for uncharacterized proteins. These biological data are primarily in the form of sequences, which determine functions, although functional properties of a protein can often be predicted from just the domains it contains. Thus, protein sequences and domains can be used to predict protein pair-wise functional relationships, and thus contribute to the function prediction process of uncharacterized proteins in order to ensure that knowledge is gained from sequencing efforts. In this work, we introduce information-theoretic based approaches to score protein-protein functional interaction pairs predicted from protein sequence similarity and conserved protein signature matches. The proposed schemes are effective for data-driven scoring of connections between protein pairs. We applied these schemes to the Mycobacterium tuberculosis proteome to produce a homology-based functional network of the organism with a high confidence and coverage. We use the network for predicting functions of uncharacterised proteins. AVAILABILITY: Protein pair-wise functional relationship scores for Mycobacterium tuberculosis strain CDC1551 sequence data and Python scripts to compute these scores are available at http://web.cbio.uct.ac.za/~gmazandu/scoringschemes.
MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

Science.gov (United States)

Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

2018-03-10

Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.

DeepLoc: prediction of protein subcellular localization using deep learning

DEFF Research Database (Denmark)

Almagro Armenteros, Jose Juan; Sønderby, Casper Kaae; Sønderby, Søren Kaae

2017-01-01

The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from...... knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only. Here, we present a prediction algorithm using deep neural networks to predict...... current state-of-the-art algorithms, including those relying on homology information. The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc . Example code is available at https://github.com/JJAlmagro/subcellular_localization . The dataset is available at http...
Stringent DDI-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions.

Science.gov (United States)

Zhou, Hufeng; Rezaei, Javad; Hugo, Willy; Gao, Shangzhi; Jin, Jingjing; Fan, Mengyuan; Yong, Chern-Han; Wozniak, Michal; Wong, Limsoon

2013-01-01

H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are very important information to illuminate the infection mechanism of M. tuberculosis H37Rv. But current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill in the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited. We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some
Quantification of the main digestive processes in ruminants: the equations involved in the renewed energy and protein feed evaluation systems.

Science.gov (United States)

Sauvant, D; Nozière, P

2016-05-01

The evolution of feeding systems for ruminants towards evaluation of diets in terms of multiple responses requires the updating of the calculation of nutrient supply to the animals to make it more accurate on aggregated units (feed unit, or UF, for energy and protein digestible in the intestine, or PDI, for metabolizable protein) and to allow prediction of absorbed nutrients. The present update of the French system is based on the building and interpretation through meta-analysis of large databases on digestion and nutrition of ruminants. Equations involved in the calculation of UF and PDI have been updated, allowing: (1) prediction of the outflow rate of particles and liquid depending on the level of intake and the proportion of concentrate, and the use of this in the calculation of ruminal digestion of protein and starch from in situ data; (2) the system to take into account the effects of the main factors of digestive interactions (level of intake, proportion of concentrate, rumen protein balance) on organic matter digestibility and on energy losses in methane and in urine; (3) more accurate calculation of the energy available in the rumen and the efficiency of its use for microbial protein synthesis. In this renewed model, UF and PDI values of feedstuffs vary depending on diet composition and intake level. Consequently, standard feed table values can be considered as being only indicative. It is thus possible to predict the nutrient supply on a wider range of diets more accurately and in particular to better integrate energy×protein interactions occurring in the gut.
Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition.

Science.gov (United States)

Hayat, Maqsood; Khan, Asifullah

2011-02-21

Membrane proteins are a vital type of protein that serve as channels, receptors, and energy transducers in a cell. Prediction of membrane protein types is an important research area in bioinformatics. Knowledge of membrane protein types provides valuable information for predicting novel examples of the membrane protein types. However, classification of membrane protein types can be both time consuming and susceptible to errors due to the inherent similarity of membrane protein types. In this paper, a neural network-based membrane protein type prediction system is proposed. Composite protein sequence representation (CPSR) is used to extract the features of a protein sequence, which include seven feature sets: amino acid composition, sequence length, 2-gram exchange group frequency, hydrophobic group, electronic group, sum of hydrophobicity, and R-group. Principal component analysis is then employed to reduce the dimensionality of the feature vector. The probabilistic neural network (PNN), generalized regression neural network, and support vector machine (SVM) are used as classifiers. A high success rate of 86.01% is obtained using SVM for the jackknife test. In the independent dataset test, PNN yields the highest accuracy of 95.73%. These classifiers exhibit improved performance using other performance measures such as sensitivity, specificity, Matthews correlation coefficient, and F-measure. The experimental results show that the prediction performance of the proposed scheme for classifying membrane protein types is the best reported so far. This performance improvement may largely be credited to the learning capabilities of neural networks and the composite feature extraction strategy, which exploits seven different properties of protein sequences. The proposed Mem-Predictor can be accessed at http://111.68.99.218/Mem-Predictor. Copyright © 2010 Elsevier Ltd. All rights reserved.
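Amino acid composition, the first of the seven feature sets named in this abstract, is simply the relative frequency of each of the 20 standard residues in a sequence. The sketch below illustrates that single feature set under stated assumptions: the function name, the residue ordering, and the choice to skip non-standard letters are illustrative and are not taken from the Mem-Predictor implementation.

```python
from collections import Counter

# The 20 standard amino acids in a fixed (alphabetical) order; the ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return a 20-dimensional vector of residue frequencies that sums to 1.

    Letters outside the 20 standard residues (e.g. X, B, Z) are ignored here,
    which is one possible convention among several.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    vector = amino_acid_composition("MKTLLILAVVAAALA")  # made-up example sequence
    for aa, freq in zip(AMINO_ACIDS, vector):
        if freq > 0:
            print(f"{aa}: {freq:.3f}")
```

A vector like this would then be concatenated with the other feature sets before dimensionality reduction, as the abstract describes.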
Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces.

Science.gov (United States)

Xia, Zheng; Wu, Ling-Yun; Zhou, Xiaobo; Wong, Stephen T C

2010-09-13

Predicting drug-protein interactions from heterogeneous biological data sources is a key step for in silico drug discovery. The difficulty of this prediction task lies in the rarity of known drug-protein interactions and myriad unknown interactions to be predicted. To meet this challenge, a manifold regularization semi-supervised learning method is presented to tackle this issue by using labeled and unlabeled information which often generates better results than using the labeled data alone. Furthermore, our semi-supervised learning method integrates known drug-protein interaction network information as well as chemical structure and genomic sequence data. Using the proposed method, we predicted certain drug-protein interactions on the enzyme, ion channel, GPCRs, and nuclear receptor data sets. Some of them are confirmed by the latest publicly available drug targets databases such as KEGG. We report encouraging results of using our method for drug-protein interaction network reconstruction which may shed light on the molecular interaction inference and new uses of marketed drugs.
Structural features that predict real-value fluctuations of globular proteins.

Science.gov (United States)

Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

2012-05-01

It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics (MD) trajectories of nonhomologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined the correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real value of residue fluctuations using support vector regression (SVR). It was found that some structural features have higher correlation than crystallographic B-factors with fluctuations observed in MD trajectories. Moreover, SVR that uses combinations of static structural features showed accurate prediction of fluctuations with an average Pearson's correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed in predictions by the Gaussian network model (GNM). An advantage of the developed method over the GNMs is that the former predicts the real value of fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. Copyright © 2012 Wiley Periodicals, Inc.
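The two figures of merit quoted in this abstract, Pearson's correlation coefficient and root mean square error, can be computed from paired predicted and observed values as in the sketch below. The function names and the example numbers are illustrative only and do not come from the study.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson's correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted: list[float], observed: list[float]) -> float:
    """Root mean square error, in the same units as the inputs."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty series of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    observed = [0.8, 1.1, 2.3, 1.7, 0.9]   # made-up per-residue fluctuations
    predicted = [0.9, 1.0, 2.0, 1.9, 1.2]  # made-up predictions
    print(f"Pearson r = {pearson(predicted, observed):.3f}")
    print(f"RMSE      = {rmse(predicted, observed):.3f}")
```

The two measures are complementary: correlation is insensitive to scale, whereas RMSE penalizes errors in the absolute magnitude of the predicted fluctuation.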
Protein Sub-Nuclear Localization Prediction Using SVM and Pfam Domain Information

Science.gov (United States)

Kumar, Ravindra; Jain, Sohni; Kumari, Bandana; Kumar, Manish

2014-01-01

The nucleus is the largest and most highly organized organelle of eukaryotic cells. Within the nucleus exist a number of pseudo-compartments, which are not separated by any membrane, yet each of them contains only a specific set of proteins. Understanding protein sub-nuclear localization can hence be an important step towards understanding the biological functions of the nucleus. Here we describe SubNucPred, a method we developed for predicting the sub-nuclear localization of proteins. This method predicts protein localization for 10 different sub-nuclear locations sequentially by combining the presence or absence of unique Pfam domains with an amino acid composition-based SVM model. The prediction accuracy during leave-one-out cross-validation was 85.05% for centromeric proteins, 76.85% for chromosomal proteins, 81.27% for nuclear speckle proteins, 81.79% for nucleolar proteins, 79.37% for nuclear envelope proteins, 77.78% for nuclear matrix proteins, 76.98% for nucleoplasm proteins, 88.89% for nuclear pore complex proteins, 75.40% for PML body proteins, and 83.33% for telomeric proteins. Comparison with other reported methods showed that SubNucPred performs better than existing methods. A web server for predicting protein sub-nuclear localization, named SubNucPred, has been established at http://14.139.227.92/mkumar/subnucpred/. A standalone version of SubNucPred can also be downloaded from the web server. PMID:24897370
Protein secondary structure: category assignment and predictability

DEFF Research Database (Denmark)

Andersen, Claus A.; Bohr, Henrik; Brunak, Søren

2001-01-01

In the last decade, the prediction of protein secondary structure has been optimized using essentially one and the same assignment scheme known as DSSP. We present here a different scheme, which is more predictable. This scheme predicts directly the hydrogen bonds, which stabilize the secondary......-forward neural network with one hidden layer on a data set identical to the one used in earlier work....
Maternal protein-energy malnutrition during early pregnancy in sheep impacts the fetal ornithine cycle to reduce fetal kidney microvascular development.

OpenAIRE

Dunford, L. J.; Sinclair, K. D.; Kwong, W. Y.; Sturrock, C.; Clifford, B. L.; Giles, T. C.; Gardner, D. S.

2014-01-01

This paper identifies a common nutritional pathway relating maternal through to fetal protein-energy malnutrition (PEM) and compromised fetal kidney development. Thirty-one twin-bearing sheep were fed either a control (n=15) or low-protein diet (n=16, 17 vs. 8.7 g crude protein/MJ metabolizable energy) from d 0 to 65 gestation (term, ~145 d). Effects on the maternal and fetal nutritional environment were characterized by sampling blood and amniotic fluid. Kidney development was characterized ...
HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

Science.gov (United States)

López, Yosvany; Nakai, Kenta; Patil, Ashwini

2015-01-01

HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.
Protein-Based Urine Test Predicts Kidney Transplant Outcomes

Science.gov (United States)

... News Releases News Release Thursday, August 22, 2013 Protein-based urine test predicts kidney transplant outcomes NIH- ... supporting development of noninvasive tests. Levels of a protein in the urine of kidney transplant recipients can ...
Prediction of Protein Thermostability by an Efficient Neural Network Approach

Directory of Open Access Journals (Sweden)

Jalal Rezaeenour

2016-10-01

Introduction: Manipulation of protein stability is important for understanding the principles that govern protein thermostability, both in basic research and industrial applications. Various data mining techniques exist for prediction of thermostable proteins. Furthermore, ANN methods have attracted significant attention for prediction of thermostability, because they constitute an appropriate approach to mapping the non-linear input-output relationships and massive parallel computing. Method: An Extreme Learning Machine (ELM) was applied to estimate the thermal behavior of 1289 proteins. In the proposed algorithm, the parameters of the ELM were optimized using a Genetic Algorithm (GA), which tuned a set of input variables, hidden layer biases, and input weights to enhance the prediction performance. The method was executed on a set of amino acids, yielding a total of 613 protein features. A number of feature selection algorithms were used to build subsets of the features. A total of 1289 protein samples and 613 protein features were calculated from the UniProt database to understand features contributing to the enzymes’ thermostability and find out the main features that influence this valuable characteristic. Results: At the primary structure level, Gln, Glu and polar were the features that mostly contributed to protein thermostability. At the secondary structure level, Helix_S, Coil, and charged_Coil were the most important features affecting protein thermostability. These results suggest that the thermostability of proteins is mainly associated with primary structural features of the protein. According to the results, the influence of primary structure on the thermostability of a protein was more important than that of the secondary structure. It is shown that the prediction accuracy of ELM (mean square error) can improve dramatically using GA, with error rates RMSE=0.004 and MAPE=0.1003. Conclusion: The proposed approach for forecasting problem
Protein complex prediction via dense subgraphs and false positive analysis.

Directory of Open Access Journals (Sweden)

Cecilia Hernandez

Many proteins work together with others in groups called complexes in order to achieve a specific function. Discovering protein complexes is important for understanding biological processes and predicting protein functions in living organisms. Large-scale and high-throughput techniques have made it possible to compile protein-protein interaction networks (PPI networks), which have been used in several computational approaches for detecting protein complexes. Those predictions might guide future biological experimental research. Some approaches are topology-based, where highly connected proteins are predicted to be complexes; some propose different clustering algorithms using partitioning or overlaps among clusters for networks modeled with unweighted or weighted graphs; and others use density of clusters and information based on protein functionality. However, some schemes still require much processing time, or the quality of their results can be improved. Furthermore, most of the results obtained with computational tools are not accompanied by an analysis of false positives. We propose an effective and efficient mining algorithm for discovering highly connected subgraphs, which is our base for defining protein complexes. Our representation is based on transforming the PPI network into a directed acyclic graph that reduces the number of represented edges and the search space for discovering subgraphs. Our approach considers weighted and unweighted PPI networks. We compare our best alternative using PPI networks from Saccharomyces cerevisiae (yeast) and Homo sapiens (human) with state-of-the-art approaches in terms of clustering, biological metrics and execution times, as well as three gold standards for yeast and two for human. Furthermore, we analyze false positive predicted complexes by searching the PDBe (Protein Data Bank in Europe) database in order to identify matching protein complexes that have been purified and structurally characterized. Our analysis shows
Hidden markov model for the prediction of transmembrane proteins using MATLAB.

Science.gov (United States)

Chaturvedi, Navaneet; Shanker, Sudhanshu; Singh, Vinay Kumar; Sinha, Dhiraj; Pandey, Paras Nath

2011-01-01

Because membrane proteins play a key role in drug targeting, transmembrane protein prediction is an active and challenging area of the biological sciences. Location-based prediction of transmembrane proteins is significant for the functional annotation of protein sequences. Hidden Markov model-based methods have been widely applied to transmembrane topology prediction. Here we present a revised and more understandable model than an existing one for transmembrane protein prediction. MATLAB scripts were built and compiled for parameter estimation of the model, and the model was applied to amino acid sequences to identify transmembrane regions and their adjacent locations. The estimated model of transmembrane topology was based on the TMHMM model architecture. Only 7 super states are defined in the given dataset, which were converted to 96 states on the basis of their length in sequence. The prediction accuracy of the model was about 74%, which is good enough for transmembrane topology prediction. We therefore conclude that the hidden Markov model plays a crucial role in transmembrane helix prediction on the MATLAB platform and could also be useful for drug discovery strategies. The database is available for free at bioinfonavneet@gmail.com, vinaysingh@bhu.ac.in.
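To make the idea of hidden Markov model decoding concrete, the sketch below runs the Viterbi algorithm on a toy two-state model (membrane versus non-membrane). It is not the 96-state TMHMM-based model from this record, and it is written in Python rather than MATLAB; every probability, the hydrophobic residue grouping, and all names are invented for illustration.

```python
import math

# Hypothetical toy parameters, chosen only for illustration.
STATES = ("M", "O")  # M = membrane, O = non-membrane
START_P = {"M": 0.5, "O": 0.5}
TRANS_P = {"M": {"M": 0.9, "O": 0.1}, "O": {"M": 0.1, "O": 0.9}}
EMIT_P = {"M": {"h": 0.8, "p": 0.2}, "O": {"h": 0.3, "p": 0.7}}
HYDROPHOBIC = set("AILMFVWC")  # crude residue grouping, an assumption


def viterbi(observations: list[str]) -> list[str]:
    """Return the most probable state path for a sequence of 'h'/'p' symbols."""
    if not observations:
        raise ValueError("empty observation sequence")
    # Work in log space to avoid numerical underflow on long sequences.
    scores = {
        s: math.log(START_P[s]) + math.log(EMIT_P[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for symbol in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this position.
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS_P[p][s]))
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS_P[best_prev][s])
                + math.log(EMIT_P[s][symbol])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def decode(sequence: str) -> list[str]:
    """Map residues to hydrophobic ('h') or polar ('p') symbols and decode."""
    symbols = ["h" if aa in HYDROPHOBIC else "p" for aa in sequence.upper()]
    return viterbi(symbols)


if __name__ == "__main__":
    seq = "KKDSEELLIVAVLLAFIGKRDE"  # made-up sequence
    print(seq)
    print("".join(decode(seq)))
```

A realistic topology model would use many more states (for example, separate inside and outside loops and length-dependent helix states) and parameters estimated from annotated sequences.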
Exploring the potential of 3D Zernike descriptors and SVM for protein-protein interface prediction.

Science.gov (United States)

Daberdaku, Sebastian; Ferrari, Carlo

2018-02-06

The correct determination of protein-protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein-Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction
Constraint Logic Programming approach to protein structure prediction

Directory of Open Access Journals (Sweden)

Fogolari Federico

2004-11-01

Background: The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Results: Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. Conclusions: The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
Constraint Logic Programming approach to protein structure prediction.

Science.gov (United States)

Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico

2004-11-30

The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have been also exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all atom models with plausible structure. Results have been compared with a similar approach using a well-established technique as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying protein simplified models, which can be converted into realistic all atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies, resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
Which clustering algorithm is better for predicting protein complexes?

Directory of Open Access Journals (Sweden)

Moschopoulos Charalampos N

2011-12-01

Background: Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull down assays and tandem affinity purification are used in order to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results: In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions: While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of the geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm
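The abstract does not define "geometrical accuracy". One definition commonly used when benchmarking predicted clusters against known complexes is the geometric mean of a complex-wise sensitivity and a cluster-wise positive predictive value, both derived from a table of shared proteins. The sketch below implements that definition as an assumption; it may differ from the exact metric used in the paper, and all names and example sets are hypothetical.

```python
import math


def geometric_accuracy(complexes: list[set[str]], clusters: list[set[str]]) -> float:
    """Geometric mean of sensitivity and positive predictive value.

    table[i][j] is the number of proteins shared by reference complex i
    and predicted cluster j. This is one common definition, assumed here.
    """
    table = [[len(c & k) for k in clusters] for c in complexes]
    total_complex_size = sum(len(c) for c in complexes)
    total_shared = sum(sum(row) for row in table)
    if total_complex_size == 0 or total_shared == 0:
        return 0.0
    # Sensitivity: for each complex, the best coverage by a single cluster.
    sensitivity = sum(max(row) for row in table) / total_complex_size
    # PPV: for each cluster, the share of its matched proteins in its best complex.
    ppv = sum(max(col) for col in zip(*table)) / total_shared
    return math.sqrt(sensitivity * ppv)


if __name__ == "__main__":
    known = [{"A", "B", "C", "D"}, {"E", "F"}]       # made-up reference complexes
    predicted = [{"A", "B"}, {"C", "D", "E"}, {"F"}]  # made-up predicted clusters
    print(f"Accuracy = {geometric_accuracy(known, predicted):.3f}")
```

Taking the geometric mean prevents a trivial clustering from scoring well: one giant cluster maximizes sensitivity but lowers PPV, while singleton clusters do the opposite.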
The 82-plex plasma protein signature that predicts increasing inflammation

DEFF Research Database (Denmark)

Tepel, Martin; Beck, Hans C; Tan, Qihua

2015-01-01

The objective of the study was to define the specific plasma protein signature that predicts the increase of the inflammation marker C-reactive protein from index day to next-day using proteome analysis and novel bioinformatics tools. We performed a prospective study of 91 incident kidney....... The prediction model selected and validated 82 plasma proteins which determined increased next-day C-reactive protein (area under receiver-operator-characteristics curve, 0.772; 95% confidence interval, 0.669 to 0.876; P signature (P ....001) was associated with observed increased next-day C-reactive protein. The 82-plex protein signature outperformed routine clinical procedures. The category-free net reclassification index improved with 82-plex plasma protein signature (total net reclassification index, 88.3%). Using the 82-plex plasma protein...
Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

Science.gov (United States)

Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

2016-01-11

Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
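
The Q3 figure quoted above is a per-residue accuracy over three secondary-structure states. As a minimal sketch (not taken from the DeepCNF paper), it could be computed as below; the function name `q3_accuracy`, the H/E/C label alphabet, and the toy strings are assumptions for illustration only.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose 3-state label (assumed H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical toy labels, not data from the paper.
observed = "CHHHHEEECC"
predicted = "CHHHCEEECC"  # one helix residue mislabelled as coil
print(f"Q3 = {q3_accuracy(predicted, observed):.2f}")
```

Q8 accuracy follows the same pattern with an eight-letter label alphabet in place of the three-letter one.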

Combining modularity, conservation, and interactions of proteins significantly increases precision and coverage of protein function prediction

Directory of Open Access Journals (Sweden)

Sers Christine T

2010-12-01

Background: While the number of newly sequenced genomes and genes is constantly increasing, elucidation of their function is still a laborious and time-consuming task. This has led to the development of a wide range of methods for predicting protein functions in silico. We report on a new method that predicts function based on a combination of information about protein interactions, orthology, and the conservation of protein networks in different species. Results: We show that aggregation of these independent sources of evidence leads to a drastic increase in the number and quality of predictions when compared to baselines and other methods reported in the literature. For instance, our method generates more than 12,000 novel protein functions for human with an estimated precision of ~76%, among which are 7,500 new functional annotations for 1,973 human proteins that previously had zero or only one function annotated. We also verified our predictions on a set of genes that play an important role in colorectal cancer (MLH1, PMS2, EPHB4) and could confirm more than 73% of them based on evidence in the literature. Conclusions: The combination of different methods into a single, comprehensive prediction method infers thousands of protein functions for every species included in the analysis at varying, yet always high, levels of precision and very good coverage.
Prediction of beta-turns in proteins using the first-order Markov models.

Science.gov (United States)

Lin, Thy-Hou; Wang, Ging-Ming; Wang, Yen-Tseng

2002-01-01

We present a method based on the first-order Markov models for predicting simple beta-turns and loops containing multiple turns in proteins. Sequences of 338 proteins in a database are divided using the published turn criteria into the following three regions, namely, the turn, the boundary, and the nonturn ones. A transition probability matrix is constructed for either the turn or the nonturn region using the weighted transition probabilities computed for dipeptides identified from each region. There are two such matrices constructed for the boundary region since the transition probabilities for dipeptides immediately preceding or following a turn are different. The window used for scanning a protein sequence from amino (N-) to carboxyl (C-) terminal is a hexapeptide since the transition probability computed for a turn tetrapeptide is capped at both the N- and C- termini with a boundary transition probability indexed respectively from the two boundary transition matrices. A sum of the averaged product of the transition probabilities of all the hexapeptides involving each residue is computed. This is then weighted with a probability computed from assuming that all the hexapeptides are from the nonturn region to give the final prediction quantity. Both simple beta-turns and loops containing multiple turns in a protein are then identified by the rising of the prediction quantity computed. The performance of the prediction scheme or the percentage (%) of correct prediction is evaluated through computation of Matthews correlation coefficients for each protein predicted. It is found that the prediction method is capable of giving prediction results with better correlation between the percent of correct prediction and the Matthews correlation coefficients for a group of test proteins as compared with those predicted using some secondary structural prediction methods. The prediction accuracy for about 40% of proteins in the database or 50% of proteins in the test set is
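
The core ingredient described above is a transition probability matrix estimated from dipeptides in a given region. The following is a minimal, unweighted sketch of that idea; the paper's weighting scheme, boundary matrices, and final prediction quantity are not reproduced. The function names, the probability floor for unseen dipeptides, and the toy segments are all assumptions.

```python
from collections import defaultdict


def transition_matrix(segments):
    """Estimate P(next residue | current residue) from dipeptide counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for seg in segments:
        for a, b in zip(seg, seg[1:]):
            counts[a][b] += 1
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: n / total for b, n in row.items()}
    return probs


def window_probability(window, probs, floor=1e-6):
    """Product of first-order transition probabilities along a window.

    Unseen dipeptides get a small floor probability (an assumption of this
    sketch) so that a single missing count does not zero the product.
    """
    p = 1.0
    for a, b in zip(window, window[1:]):
        p *= probs.get(a, {}).get(b, floor)
    return p


# Hypothetical turn tetrapeptides, not taken from the paper's database.
turn_segments = ["NPDG", "DPSG", "NGKD"]
turn_probs = transition_matrix(turn_segments)
print(turn_probs["P"])
print(window_probability("NPDG", turn_probs))
```

Scanning a sequence would then amount to sliding a fixed-length window (a hexapeptide in the abstract) and comparing the window probability under the turn and nonturn matrices.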
Nutritional value and metabolizable energy of wheat by-products used for feeding growing pigs

OpenAIRE

William Rui Wesendonck; Alexandre de Mello Kessler; Andréa Machado Leal Ribeiro; Marcelo Luiz Somensi; Luciane Bockor; Julio Cezar Dadalt; Alessandra Nardina Trícia Rigo Monteiro; Fábio Ritter Marx

2013-01-01

The objective of this work was to evaluate the nutritional and energy value of wheat by-products in diets for growing pigs, and to obtain prediction equations for metabolizable energy. Thirty-six castrated male pigs, housed in individual metabolic cages, were used. Total collection of feces and urine was carried out in two ten-day periods: five days for adaptation and five for collection. A randomized block design was used, with the collection period considered as the block ...
Precalving and early lactation factors that predict milk casein and fertility in the transition dairy cow.

Science.gov (United States)

Rodney, Rachael M; Hall, Jenianne K; Westwood, Charlotte T; Celi, Pietro; Lean, Ian J

2016-09-01

Multiparous Holstein cows (n=82) of either high or low genetic merit (GM) (for milk fat + protein yield) were allocated to 1 of 2 diets in a 2×2 factorial design. Diets differed in the ratio of rumen-undegradable protein (RUP) to rumen-degradable protein (37% RUP vs. 15% RUP) and were fed from 21 d precalving to 150 days in milk. This study evaluated the effects of these diets and GM on concentrations of milk casein (CN) variants and aimed to identify precalving and early lactation variables that predict milk CN and protein yield and composition and fertility of dairy cows. It explored the hypothesis that low milk protein content is associated with lower fertility and extended this hypothesis to also evaluate the association of CN contents with fertility. Yields (kg/d) for CN variants were 0.49 and 0.45 of α-CN, 0.38 and 0.34 of β-CN, 0.07 and 0.06 for κ-CN, and 0.10 and 0.09 of γ-CN for high- and low-RUP diets, respectively. Increased RUP increased milk, CN, and milk protein yields. Increased GM increased milk protein and γ-CN yields and tended to increase milk CN yield. The effects of indicator variables on CN variant yields and concentrations were largely consistent, with higher body weight and α-amino nitrogen resulting in higher yields, but lower concentrations. An increase in cholesterol was associated with decreased CN variant concentrations, and disease lowered CN variant yield. A diet high in RUP increased proportion of first services that resulted in pregnancy from 41 to 58%. Increased precalving metabolizable protein (MP) balance decreased the proportion of first services that resulted in pregnancy when evaluated in a model containing CN percentage, milk protein yield, diet, and GM. This finding suggests that the positive effects of increasing dietary RUP on fertility may be curvilinear because cows with a very positive MP balance before calving were less fertile than those with a lower, but positive, MP balance. Prepartum MP balance was important
Nanoparticles-cell association predicted by protein corona fingerprints

Science.gov (United States)

Palchetti, S.; Digiacomo, L.; Pozzi, D.; Peruzzi, G.; Micarelli, E.; Mahmoudi, M.; Caracciolo, G.

2016-06-01

In a physiological environment (e.g., blood and interstitial fluids) nanoparticles (NPs) bind proteins, which form a "protein corona" layer. The long-lived protein layer tightly bound to the NP surface is referred to as the hard corona (HC) and encodes information that controls NP bioactivity (e.g., cellular association, cellular signaling pathways, biodistribution, and toxicity). Decrypting this complex code has become a priority for predicting the biological outcomes of NPs. Here, we use a library of 16 lipid NPs of varying size (Ø ~ 100-250 nm) and surface chemistry (unmodified and PEGylated) to investigate the relationships between NP physicochemical properties (nanoparticle size, aggregation state and surface charge), protein corona fingerprints (PCFs), and NP-cell association. We found that none of the NPs' physicochemical properties alone could account for association with a human cervical cancer cell line (HeLa). For the entire library of NPs, a total of 436 distinct serum proteins were detected. We developed a predictive-validation modeling approach that provides a means of assessing the relative significance of the identified corona proteins. Interestingly, only 8 PCFs, a minor fraction of the HC, were identified as the main promoters of NP association with HeLa cells. Remarkably, the identified PCFs have several receptors with a high level of expression on the plasma membrane of HeLa cells.
PCI-SS: MISO dynamic nonlinear protein secondary structure prediction

Directory of Open Access Journals (Sweden)

Aboul-Magd Mohammed O

2009-07-01

Background: Since the function of a protein is largely dictated by its three-dimensional configuration, determining a protein's structure is of fundamental importance to biology. Here we report on a novel approach to determining the one-dimensional secondary structure of proteins (distinguishing α-helices, β-strands, and non-regular structures) from primary sequence data, which makes use of Parallel Cascade Identification (PCI), a powerful technique from the field of nonlinear system identification. Results: Using PSI-BLAST divergent evolutionary profiles as input data, dynamic nonlinear systems are built through a black-box approach to model the process of protein folding. Genetic algorithms (GAs) are applied in order to optimize the architectural parameters of the PCI models. The three-state prediction problem is broken down into a combination of three binary sub-problems, and protein structure classifiers are built using 2 layers of PCI classifiers. Careful construction of the optimization, training, and test datasets ensures that no homology exists between any training and testing data. A detailed comparison between PCI and 9 contemporary methods is provided over a set of 125 new protein chains guaranteed to be dissimilar to all training data. Unlike other secondary structure prediction methods, here a web service is developed to provide both human- and machine-readable interfaces to PCI-based protein secondary structure prediction. This server, called PCI-SS, is available at http://bioinf.sce.carleton.ca/PCISS. In addition to a dynamic PHP-generated web interface for humans, a Simple Object Access Protocol (SOAP) interface is added to permit invocation of the PCI-SS service remotely. This machine-readable interface facilitates incorporation of PCI-SS into multi-faceted systems biology analysis pipelines requiring protein secondary structure information, and greatly simplifies high-throughput analyses. XML is used to represent the input
Protein thermostability prediction within homologous families using temperature-dependent statistical potentials.

Directory of Open Access Journals (Sweden)

Fabrizio Pucci

The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability. To bridge this gap, we developed an algorithm for predicting the best descriptor of thermostability, namely the melting temperature Tm, from the protein's sequence and structure. Our method is applicable when the Tm values of proteins homologous to the target protein are known. It is based on the design of several temperature-dependent statistical potentials, derived from datasets consisting of either mesostable or thermostable proteins. Linear combinations of these potentials have been shown to yield an estimation of the protein folding free energies at low and high temperatures, and the difference of these energies, a prediction of the melting temperature. This particular construction, which distinguishes between the interactions that contribute more than others to the stability at high temperatures and those that are more stabilizing at low T, gives better performance than the standard approach based on T-independent potentials which predict the thermal resistance from the thermodynamic stability. Our method has been tested on 45 proteins of known Tm that belong to 11 homologous families. The standard deviation between experimental and predicted Tm values is equal to 13.6°C in cross validation, and decreases to 8.3°C if the 6 worst predicted proteins are excluded. Possible extensions of our approach are discussed.
Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.

Science.gov (United States)

Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe

2018-02-19

Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is, however, an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. In this paper, we use three promising kernel functions: the Min kernel and two pairwise kernels, the Metric Learning Pairwise Kernel (MLPK) and the Tensor Product Pairwise Kernel (TPPK). We also consider the normalized forms of the Min kernel. Then, we combine the Min kernel or its normalized form with one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improved the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.
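
The abstract names the two pairwise kernels without spelling them out. The sketch below uses their usual definitions from the kernel literature, with a base kernel passed in as a parameter, which is one plausible reading of "plugging". The linear base kernel is only a stand-in (the paper's Min kernel is not reproduced here), and the feature vectors are hypothetical.

```python
def linear_kernel(x, y):
    """Stand-in base kernel on single proteins (assumption of this sketch)."""
    return sum(a * b for a, b in zip(x, y))


def tppk(pair1, pair2, k=linear_kernel):
    """Tensor Product Pairwise Kernel: k(a,c)k(b,d) + k(a,d)k(b,c)."""
    (a, b), (c, d) = pair1, pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)


def mlpk(pair1, pair2, k=linear_kernel):
    """Metric Learning Pairwise Kernel: (k(a,c) - k(a,d) - k(b,c) + k(b,d))^2."""
    (a, b), (c, d) = pair1, pair2
    return (k(a, c) - k(a, d) - k(b, c) + k(b, d)) ** 2


# Hypothetical 2-dimensional feature vectors for four proteins.
a, b, c, d = (1, 0), (0, 1), (1, 1), (0, 2)
print(tppk((a, b), (c, d)), mlpk((a, b), (c, d)))
# Both kernels are symmetric in the order of proteins within a pair.
print(tppk((b, a), (c, d)), mlpk((b, a), (c, d)))
```

Because both functions accept any base kernel `k`, a different per-protein kernel can be substituted without changing the pairwise construction.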
Protein Function Prediction Based on Sequence and Structure Information

KAUST Repository

Smaili, Fatima Z.

2016-05-25

The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.
Exploiting protein flexibility to predict the location of allosteric sites

Directory of Open Access Journals (Sweden)

Panjkovich Alejandro

2012-10-01

Background: Allostery is one of the most powerful and common ways of regulating protein activity. However, for most allosteric proteins identified to date, the mechanistic details of allosteric modulation are not yet well understood. Uncovering common mechanistic patterns underlying allostery would allow not only a better academic understanding of the phenomena, but would also streamline the design of novel therapeutic solutions. This relatively unexplored therapeutic potential and the putative advantages of allosteric drugs over classical active-site inhibitors fuel the attention allosteric-drug research is receiving at present. A first step to harness the regulatory potential and versatility of allosteric sites, in the context of drug discovery and design, would be to detect or predict their presence and location. In this article, we describe a simple computational approach, based on the effect allosteric ligands exert on protein flexibility upon binding, to predict the existence and position of allosteric sites on a given protein structure. Results: By querying the literature and a recently available database of allosteric sites, we gathered 213 allosteric proteins with structural information that we further filtered into a non-redundant set of 91 proteins. We performed normal-mode analysis and observed significant changes in protein flexibility upon allosteric-ligand binding in 70% of the cases. These results agree with the current view that allosteric mechanisms are in many cases governed by changes in protein dynamics caused by ligand binding. Furthermore, we implemented an approach that achieves 65% positive predictive value in identifying allosteric sites within the set of predicted cavities of a protein (stricter parameters set, 0.22 sensitivity), by combining the current analysis on dynamics with previous results on structural conservation of allosteric sites. We also analyzed four biological examples in detail, revealing
Predicting nucleic acid binding interfaces from structural models of proteins.

Science.gov (United States)

Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael

2012-02-01

The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.
Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier

Science.gov (United States)

Wang, Leilei; Cheng, Jinyong

2018-03-01

Protein secondary structure prediction is an important research area in bioinformatics. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover the construction of the model, the classification of parameters, and related steps. The data set is the widely used CB513 protein data set. Accuracy is assessed by 3-fold cross-validation, from which we obtain the Q3 accuracy. The results illustrate that the autoencoder network improved the prediction accuracy of protein secondary structure.
PRODIGY : a web server for predicting the binding affinity of protein-protein complexes

NARCIS (Netherlands)

Xue, Li; Garcia Lopes Maia Rodrigues, João; Kastritis, Panagiotis L; Bonvin, Alexandre Mjj; Vangone, Anna

2016-01-01

Gaining insights into the structural determinants of protein-protein interactions holds the key for a deeper understanding of biological functions, diseases and development of therapeutics. An important aspect of this is the ability to accurately predict the binding strength for a given
Prediction of Protein-Protein Interaction By Metasample-Based Sparse Representation

Directory of Open Access Journals (Sweden)

Xiuquan Du

2015-01-01

Protein-protein interactions (PPIs) play key roles in many cellular processes such as transcription regulation, cell metabolism, and endocrine function. Understanding these interactions greatly advances the study of the pathogenesis and treatment of various diseases. A large amount of data has been generated by experimental techniques; however, most of these data are incomplete or noisy, and current biological experimental techniques are very time-consuming and expensive. In this paper, we propose a novel method (metasample-based sparse representation classification, MSRC) for PPI prediction. A group of metasamples is extracted from the original training samples, and the l1-regularized least squares method is then used to express a new testing sample as a linear combination of these metasamples. PPI prediction is achieved by using a discrimination function defined on the representation coefficients. Applied to a PPI dataset, MSRC achieves 84.9% sensitivity and 94.55% specificity, which is slightly lower than support vector machine (SVM) and much higher than naive Bayes (NB), neural networks (NN), and k-nearest neighbor (KNN). The result shows that MSRC is efficient for PPI prediction.
RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method

KAUST Repository

Ganesan, Pugalenthi; Kandaswamy, Krishna Kumar Umar; Chou, Kuochen; Vivekanandan, Saravanan; Kolatkar, Prasanna R.

2012-01-01

Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. Training and testing were performed using 120 proteins containing 22006 residues. For each residue, the buried or exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57% and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
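
The two-state labels mentioned above depend on where the accessibility cutoff is placed. As a minimal sketch, assuming the common convention that a residue counts as exposed when its relative solvent accessibility exceeds the threshold (the abstract does not state the exact rule), the labelling step could look like this. The function name and the toy accessibility values are hypothetical.

```python
# Thresholds (in percent) listed in the abstract.
THRESHOLDS = (0.0, 5.0, 10.0, 25.0, 50.0)


def burial_states(rsa_percent, threshold):
    """Label each residue 'E' (exposed) if RSA > threshold, else 'B' (buried).

    The strict '>' comparison is an assumption of this sketch.
    """
    return "".join("E" if r > threshold else "B" for r in rsa_percent)


# Hypothetical relative solvent accessibility values for five residues.
rsa = [0.0, 3.2, 12.5, 40.0, 75.0]
for t in THRESHOLDS:
    print(f"{t:>4.0f}%  {burial_states(rsa, t)}")
```

Labels produced this way would serve as the two-class targets for a classifier at each threshold.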
BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.

Science.gov (United States)

Kaur, Harpreet; Raghava, G P S

2002-03-01

Beta-turns play an important role from a structural and functional point of view. They are the most common type of non-repetitive structure in proteins and comprise, on average, 25% of the residues. In the past, numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-turns in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows prediction of different types of beta-turns, e.g. type I, I', II, II', VI, VIII and non-specific. This server assists users in predicting the consensus beta-turns in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/
Critical Features of Fragment Libraries for Protein Structure Prediction.

Science.gov (United States)

Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

2017-01-01

The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, the effect of different fragment sizes, and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.
Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: Evolutionary enhanced Markov clustering.

Science.gov (United States)

Theofilatos, Konstantinos; Pavlopoulou, Niki; Papasavvas, Christoforos; Likothanassis, Spiros; Dimitrakopoulos, Christos; Georgopoulos, Efstratios; Moschopoulos, Charalampos; Mavroudi, Seferina

2015-03-01

Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they face certain disadvantages: they require parameter tuning, some of them cannot handle weighted PPI data, and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from the human and the yeast Saccharomyces cerevisiae organisms. Using publicly available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. Specifically, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein to exactly one cluster and the difficulties they face concerning parameter tuning. This fact was experimentally validated and moreover, new
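
For orientation, the sketch below shows plain Markov clustering (alternating expansion and inflation on a column-stochastic matrix) applied to a small weighted graph. It is not EE-MC and not the enhanced Markov clustering variant named in the abstract. The inflation value, iteration count, self-loop weight, and the toy graph are assumptions, and clusters are read off as connected groups of the converged matrix, which merges any overlapping clusters.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(weights, inflation=2.0, iterations=50, eps=1e-6):
    """Plain MCL on a symmetric weighted adjacency matrix (list of lists)."""
    n = len(weights)
    # Add self-loops, a common MCL preprocessing step.
    m = [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)  # expansion: flow spreads along paths
        m = normalize_columns(  # inflation: strong flow is reinforced
            [[x ** inflation for x in row] for row in m])
    # Group nodes joined by any surviving flow (simple union-find).
    label = list(range(n))

    def find(x):
        while label[x] != x:
            label[x] = label[label[x]]
            x = label[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > eps:
                label[find(i)] = find(j)
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical weighted PPI graph: two triangles joined by one weak edge.
W = [[0.0] * 6 for _ in range(6)]
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i][j] = W[j][i] = float(w)
print(mcl(W))
```

The inflation exponent controls granularity: larger values tend to give more, smaller clusters, which is the kind of parameter the abstract says an evolutionary search can tune.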
Integration of relational and hierarchical network information for protein function prediction

Directory of Open Access Journals (Sweden)

Jiang Xiaoyu

2008-08-01

Background: In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top-to-bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is to apply transitive closure to predictions. Results: We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. Conclusion: A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased
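The transitive-closure post-processing mentioned in the background (the common baseline, not the probabilistic classifier proposed by the authors) can be sketched as follows. The toy hierarchy and the helper names are assumptions for illustration only.

```python
def ancestors(term, parents):
    """Collect every ancestor of a term in a hierarchy given as child -> parents."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def enforce_true_path(predicted_terms, parents):
    """Add all ancestors of each predicted term so the 'true-path' rule holds."""
    closed = set(predicted_terms)
    for term in predicted_terms:
        closed |= ancestors(term, parents)
    return closed


# Hypothetical three-level hierarchy: leaf -> mid -> root, other -> root.
toy_hierarchy = {"leaf": ["mid"], "mid": ["root"], "other": ["root"]}
consistent = enforce_true_path({"leaf"}, toy_hierarchy)
```

A prediction of a specific term therefore implies all of its more general ancestors, which is the consistency the abstract says the proposed classifier obtains without this extra step.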
Prediction of RNA-Binding Proteins by Voting Systems

Directory of Open Access Journals (Sweden)

C. R. Peng

2011-01-01

It is important to identify which proteins can interact with RNA for the purpose of protein annotation, since interactions between RNA and proteins influence the structure of the ribosome and play important roles in gene expression. This paper tries to identify proteins that can interact with RNA using voting systems. First, 34 learning algorithms are chosen through Weka for investigation. Then a simple majority voting system (SMVS) is used for the prediction of RNA-binding proteins, achieving an average ACC (overall prediction accuracy) value of 79.72% and an MCC (Matthews correlation coefficient) value of 59.77% for the independent testing dataset. Next, the mRMR (minimum redundancy maximum relevance) strategy is applied to algorithm selection. In addition, the MCC value of each classifier is assigned as the weight of that classifier's vote. As a result, the best average MCC value, 64.70% for the independent testing dataset, is attained when 22 algorithms are selected and integrated through weighted votes; the corresponding ACC value is 82.04%.
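The MCC-weighted vote described above can be illustrated with a short sketch. The function names and the example weights are assumptions; the mRMR-based selection of classifiers is not shown.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def weighted_vote(predictions, weights):
    """Combine binary predictions (1 = RNA-binding, 0 = not) by weighted vote."""
    score = sum(w if p == 1 else -w for p, w in zip(predictions, weights))
    return 1 if score > 0 else 0


# Simple majority voting is the special case where every weight equals 1.
majority = weighted_vote([1, 0, 0], [1, 1, 1])
# With hypothetical MCC weights, one strong classifier can outvote two weak ones.
weighted = weighted_vote([1, 0, 0], [0.9, 0.3, 0.2])
```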

Fast dynamics perturbation analysis for prediction of protein functional sites

Directory of Open Access Journals (Sweden)

Cohn Judith D

2008-01-01

Background: We present a fast version of the dynamics perturbation analysis (DPA) algorithm to predict functional sites in protein structures. The original DPA algorithm finds regions in proteins where interactions cause a large change in the protein conformational distribution, as measured using the relative entropy Dx. Such regions are associated with functional sites. Results: The Fast DPA algorithm, which accelerates DPA calculations, is motivated by an empirical observation that Dx in a normal-modes model is highly correlated with an entropic term that only depends on the eigenvalues of the normal modes. The eigenvalues are accurately estimated using first-order perturbation theory, resulting in an N-fold reduction in the overall computational requirements of the algorithm, where N is the number of residues in the protein. The performance of the original and Fast DPA algorithms was compared using protein structures from a standard small-molecule docking test set. For nominal implementations of each algorithm, top-ranked Fast DPA predictions overlapped the true binding site 94% of the time, compared to 87% of the time for original DPA. In addition, per-protein recall statistics (fraction of binding-site residues that are among predicted residues) were slightly better for Fast DPA. On the other hand, per-protein precision statistics (fraction of predicted residues that are among binding-site residues) were slightly better using original DPA. Overall, the performance of Fast DPA in predicting ligand-binding-site residues was comparable to that of the original DPA algorithm. Conclusion: Compared to the original DPA algorithm, the decreased run time with comparable performance makes Fast DPA well-suited for implementation on a web server and for high-throughput analysis.
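The per-protein recall and precision statistics defined in the abstract reduce to simple set arithmetic. A minimal sketch, with a hypothetical function name and made-up residue numbers:

```python
def recall_precision(predicted, binding_site):
    """Return (recall, precision) for one protein, given sets of residue numbers."""
    overlap = len(predicted & binding_site)
    # Recall: fraction of binding-site residues that are among predicted residues.
    recall = overlap / len(binding_site) if binding_site else 0.0
    # Precision: fraction of predicted residues that are among binding-site residues.
    precision = overlap / len(predicted) if predicted else 0.0
    return recall, precision


# Illustrative residue numbers only, not taken from the paper.
recall, precision = recall_precision({10, 11, 12, 13}, {12, 13, 14})
```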
BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference.

Science.gov (United States)

Garcia-Garcia, Javier; Schleker, Sylvia; Klein-Seetharaman, Judith; Oliva, Baldo

2012-07-01

Protein-protein interactions (PPIs) play a crucial role in biology, and high-throughput experiments have greatly increased the coverage of known interactions. Still, the identification of inter- and intraspecies interactomes is far from complete. Experimental data can be complemented by the prediction of PPIs within an organism or between two organisms based on the known interactions of the orthologous genes of other organisms (interologs). Here, we present the BIANA (Biologic Interactions and Network Analysis) Interolog Prediction Server (BIPS), which offers a web-based interface to facilitate PPI predictions based on interolog information. BIPS benefits from the capabilities of the BIANA framework to integrate several PPI-related databases. Additional metadata can be used to improve the reliability of the predicted interactions. Sensitivity and specificity of the server have been calculated using known PPIs from different interactomes using a leave-one-out approach. The specificity is between 72 and 98%, whereas sensitivity varies between 1 and 59%, depending on the sequence identity cut-off used to calculate similarities between sequences. BIPS is freely accessible at http://sbi.imim.es/BIPS.php.
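The interolog idea can be sketched in a few lines: an interaction known in a source organism is transferred to every pair of orthologs in the target organism. This is an illustration of the general principle only, not the BIPS implementation; it assumes a precomputed ortholog map, whereas the server derives similarities from sequence identity cut-offs. All identifiers are hypothetical.

```python
def predict_interologs(known_ppis, orthologs):
    """Transfer known interactions to a target organism through an ortholog map.

    known_ppis: iterable of (protein_a, protein_b) pairs in the source organism.
    orthologs:  dict mapping a source protein to its target-organism orthologs.
    """
    predicted = set()
    for a, b in known_ppis:
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                # frozenset makes the predicted pair unordered.
                predicted.add(frozenset((target_a, target_b)))
    return predicted


# Hypothetical yeast interactions and yeast-to-human ortholog assignments.
yeast_ppis = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": ["hA"], "yB": ["hB1", "hB2"]}
human_predictions = predict_interologs(yeast_ppis, yeast_to_human)
```

The second yeast interaction yields no prediction because `yC` has no ortholog in the map, which is one reason sensitivity depends so strongly on how orthology is defined.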
Integrative approaches to the prediction of protein functions based on the feature selection

Directory of Open Access Journals (Sweden)

Lee Hyunju

2009-12-01

Background: Protein function prediction has been one of the most important issues in functional genomics. With the current availability of various genomic data sets, many researchers have attempted to develop integration models that combine all available genomic data for protein function prediction. These efforts have resulted in the improvement of prediction quality and the extension of prediction coverage. However, it has also been observed that integrating more data sources does not always increase the prediction quality. Therefore, selecting data sources that highly contribute to the protein function prediction has become an important issue. Results: We present systematic feature selection methods that assess the contribution of genome-wide data sets to predict protein functions and then investigate the relationship between genomic data sources and protein functions. In this study, we use ten different genomic data sources in Mus musculus, including protein domains, protein-protein interactions, gene expressions, phenotype ontology, phylogenetic profiles and disease data sources, to predict protein functions that are labelled with Gene Ontology (GO) terms. We then apply two approaches to feature selection: exhaustive search feature selection using a kernel based logistic regression (KLR), and a kernel based L1-norm regularized logistic regression (KL1LR). In the first approach, we exhaustively measure the contribution of each data set for each function based on its prediction quality. In the second approach, we use the estimated coefficients of features as measures of contribution of data sources. Our results show that the proposed methods improve the prediction quality compared to the full integration of all data sources and other filter-based feature selection methods. We also show that contributing data sources can differ depending on the protein function. Furthermore, we observe that highly contributing data sets can be similar among
A probabilistic fragment-based protein structure prediction algorithm.

Directory of Open Access Journals (Sweden)

David Simoncini

Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods intensely sample the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with the Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All-atom decoys produced from EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software.html [corrected].
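The abstract does not give the update rule that EdaFold uses, so the following is only a generic Estimation of Distribution Algorithm step under stated assumptions: at one sequence position, the probability mass function over fragments starts uniform and is shifted toward the fragments seen in the lowest-energy decoys. The function name, `top_fraction` and `learning_rate` are hypothetical.

```python
def update_fragment_probabilities(probs, decoys, top_fraction=0.5, learning_rate=0.5):
    """Shift a fragment distribution toward fragments used by low-energy decoys.

    probs:  dict fragment_id -> probability for one sequence position.
    decoys: list of (energy, fragment_id) pairs from previously generated decoys.
    """
    ranked = sorted(decoys, key=lambda d: d[0])           # lowest energy first
    keep = ranked[:max(1, int(len(ranked) * top_fraction))]
    counts = {frag: 0 for frag in probs}
    for _, frag in keep:
        counts[frag] += 1
    total = len(keep)
    # Blend the old distribution with the empirical frequencies of the elite set.
    return {frag: (1 - learning_rate) * p + learning_rate * counts[frag] / total
            for frag, p in probs.items()}


uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
decoys = [(-10.0, "a"), (-9.0, "a"), (-2.0, "b"), (5.0, "c")]
learned = update_fragment_probabilities(uniform, decoys)
```

In this toy run both elite decoys use fragment `a`, so its probability rises while the distribution still sums to one.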
G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures.

Science.gov (United States)

Lee, Hui Sun; Im, Wonpil

2017-01-01

Recent advances in high-throughput structure determination and computational protein structure prediction have significantly enriched the universe of protein structure. However, there is still a large gap between the number of available protein structures and the number of proteins whose function is annotated with high accuracy. Computational structure-based protein function prediction has emerged to reduce this knowledge gap. The identification of a ligand binding site and its structure is critical to the determination of a protein's molecular function. We present a computational methodology for predicting small-molecule ligand binding sites and ligand structures using G-LoSA, our protein local structure alignment and similarity measurement tool. All the computational procedures described here can be easily implemented using G-LoSA Toolkit, a package of standalone software programs and preprocessed PDB structure libraries. G-LoSA and G-LoSA Toolkit are freely available to academic users at http://compbio.lehigh.edu/GLoSA. We also present a case study to show the potential of our template-based approach harnessing G-LoSA for protein function prediction.
Age at puberty, ovulation rate, and uterine length of developing gilts fed two lysine and three metabolizable energy concentrations from 100 to 260 d of age

Science.gov (United States)

The objectives of this study were to determine the effect of feeding different lysine and metabolizable energy (ME) levels to developing gilts on age at puberty and reproductive tract measurements, and to determine relationships between these traits and growth trajectories. Crossbred Large White × L...
Predicting Protein Function via Semantic Integration of Multiple Networks.

Science.gov (United States)

Yu, Guoxian; Fu, Guangyuan; Wang, Jun; Zhu, Hailong

2016-01-01

Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapidly accumulating volumes of proteomic and genomic data drive the development of computational models for automatically predicting protein function on a large scale. Recent approaches focus on integrating multiple heterogeneous data sources and they often get better results than methods that use a single data source alone. In this paper, we investigate how to integrate multiple biological data sources with biological knowledge, i.e., Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogeneous data sources. SimNet first utilizes GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on the similarity. Next, SimNet constructs a composite network, obtained as a weighted summation of individual networks, and aligns the network with the kernel to get the weights assigned to individual networks. Then, it applies a network-based classifier on the composite network to predict protein function. Experimental results on heterogeneous proteomic data sources of Yeast, Human, Mouse, and Fly show that SimNet not only achieves better (or comparable) results than other related competitive approaches, but also takes much less time. The Matlab code of SimNet is available at https://sites.google.com/site/guoxian85/simnet.
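The composite network is a weighted summation of the individual networks. A minimal sketch of that one step follows; it takes the weights as given (in SimNet they come from aligning the network with the semantic kernel, which is not reproduced here), and the function name and toy matrices are assumptions.

```python
def composite_network(networks, weights):
    """Weighted sum of equally sized adjacency matrices (lists of lists)."""
    n = len(networks[0])
    return [[sum(w * net[i][j] for w, net in zip(weights, networks))
             for j in range(n)]
            for i in range(n)]


# Two hypothetical 2-protein association networks from different data sources.
interaction_net = [[0.0, 1.0], [1.0, 0.0]]
coexpression_net = [[0.0, 0.5], [0.5, 0.0]]
combined = composite_network([interaction_net, coexpression_net], [0.75, 0.25])
```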
Validation of Molecular Dynamics Simulations for Prediction of Three-Dimensional Structures of Small Proteins.

Science.gov (United States)

Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi

2017-10-12

Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we simulated seven short test proteins (10-46 residues) in the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure after a 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and that 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins (approximately 20 residues). Structure prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.
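Predicted and experimental structures are compared here through the root-mean-square deviation. A minimal sketch of that quantity is shown below, assuming the two coordinate sets are already superimposed and list the same atoms in the same order; the optimal superposition that is normally performed first is not included, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Two toy structures that differ by a 1-unit shift along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(predicted, experimental)
```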
Feature-Based and String-Based Models for Predicting RNA-Protein Interaction

Directory of Open Access Journals (Sweden)

Donald Adjeroh

2018-03-01

In this work, we study two approaches for the problem of RNA-Protein Interaction (RPI). In the first approach, we use a feature-based technique by combining extracted features from both sequences and secondary structures. The feature-based approach enhanced the prediction accuracy as it included much more available information about the RNA-protein pairs. In the second approach, we apply search algorithms and data structures to extract effective string patterns for prediction of RPI, using both sequence information (protein and RNA sequences) and structure information (protein and RNA secondary structures). This led to different string-based models for predicting interacting RNA-protein pairs. We show results that demonstrate the effectiveness of the proposed approaches, including comparative results against leading state-of-the-art methods.
Lipid oxidation decreases metabolizable energy value of dietary poultry fat for growing broilers

Directory of Open Access Journals (Sweden)

Aline Mondini Calil Racanicci

2004-08-01

To determine the apparent metabolizable energy (AME) and nitrogen-corrected apparent metabolizable energy (AMEn) of fresh and oxidized poultry fat, a metabolism assay was conducted using 48 AgRoss male broilers at 31 days of age. The birds were housed in metabolic cages and the total excreta collection method was used. A reference diet was fed with or without 10% replacement by fresh or oxidized poultry fat, and each diet was offered to four replicates of four birds. The collection period lasted four days, after three days of adaptation to the diets and cages. The poultry fat was purchased from a local producer and stored frozen at -18ºC (fresh fat). The oxidized fat was obtained by heating in a water bath at 80 to 90ºC. During heating, the quality of the fat was monitored by periodic measurements of specific absorbance, which indicates the accumulation of rancidity compounds. Specific absorbance values, measured at 232 and 270 nm, were 4.64 and 0.47, respectively, for the fresh fat and 18.54 and 3.76 for the oxidized fat. The AME and AMEn values, expressed on an as-fed basis, were 9,240 and 9,150 kcal/kg for the fresh fat and 7,700 and 7,595 kcal/kg for the oxidized fat, statistically confirming a large reduction in the metabolizable energy content of the fat as a result of oxidation.
Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords.

Science.gov (United States)

Koyabu, Shun; Phan, Thi Thanh Thuy; Ohkawa, Takenao

2015-01-01

For the automatic extraction of protein-protein interaction information from scientific articles, a machine learning approach is useful. The classifier is generated from training data represented using several features to decide whether a protein pair in each sentence has an interaction. A specific keyword that is directly related to interaction, such as "bind" or "interact", plays an important role in training classifiers. We call such a keyword, which affects the capability of the classifier, a dominant keyword. Although it is important to identify the dominant keywords, whether a keyword is dominant depends on the context in which it occurs. Therefore, we propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is tentatively assumed to be a dominant keyword. Then classifiers are trained separately on the instances with and without the assumed dominant keywords. The validity of the assumed dominant keyword is evaluated based on the classification results of the generated classifiers, and the assumption is updated according to the evaluation result. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimental results using five corpora show the effectiveness of our proposed method with dominant keyword prediction.
Evaluation of multiple protein docking structures using correctly predicted pairwise subunits

Directory of Open Access Journals (Sweden)

Esquivel-Rodríguez Juan

2012-03-01

Background: Many functionally important proteins in a cell form complexes with multiple chains. Therefore, computational prediction of multiple protein complexes is an important task in bioinformatics. In the development of multiple protein docking methods, it is important to establish a metric for evaluating prediction results in a reasonable and practical fashion. However, since only a few studies have developed methods for multiple protein docking, no study has investigated how accurate structural models of multiple protein complexes should be to allow scientists to gain biological insights. Methods: We generated a series of predicted models (decoys) of various accuracies by our multiple protein docking pipeline, Multi-LZerD, for three multi-chain complexes with 3, 4, and 6 chains. We analyzed the decoys in terms of the number of correctly predicted pair conformations in the decoys. Results and conclusion: We found that pairs of chains with the correct mutual orientation exist even in the decoys with a large overall root mean square deviation (RMSD) to the native. Therefore, in addition to a global structure similarity measure, such as the global RMSD, the quality of models for multiple chain complexes can be better evaluated by using a local measurement, the number of chain pairs with correct mutual orientation. We termed the fraction of correctly predicted pairs (RMSD at the interface of less than 4.0 Å) fpair and propose to use it for evaluation of the accuracy of multiple protein docking.
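Given interface RMSD values for the chain pairs of one decoy, fpair follows directly from the definition above. In this sketch the 4.0 Å cutoff comes from the abstract, while the choice of denominator (all chain pairs supplied by the caller) and the function name are assumptions.

```python
def fpair(interface_rmsds, cutoff=4.0):
    """Fraction of chain pairs whose interface RMSD (in angstroms) is below cutoff.

    interface_rmsds: dict mapping a (chain, chain) pair to its interface RMSD.
    """
    if not interface_rmsds:
        return 0.0
    correct = sum(1 for value in interface_rmsds.values() if value < cutoff)
    return correct / len(interface_rmsds)


# Hypothetical decoy of a three-chain complex: two of three pairs are correct.
decoy_pairs = {("A", "B"): 1.5, ("A", "C"): 6.2, ("B", "C"): 3.9}
score = fpair(decoy_pairs)
```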
Fast computational methods for predicting protein structure from primary amino acid sequence

Science.gov (United States)

Agarwal, Pratul Kumar [Knoxville, TN]

2011-07-19

The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.
Predicting Secretory Proteins with SignalP

DEFF Research Database (Denmark)

Nielsen, Henrik

2017-01-01

SignalP is the currently most widely used program for prediction of signal peptides from amino acid sequences. Proteins with signal peptides are targeted to the secretory pathway, but are not necessarily secreted. After a brief introduction to the biology of signal peptides and the history...
Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework

Science.gov (United States)

2014-01-01

Motivation: Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. Results: We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set. PMID:24646119
Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework.

Science.gov (United States)

Simha, Ramanuja; Shatkay, Hagit

2014-03-19

Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set.
Adaptive compressive learning for prediction of protein-protein interactions from primary sequence.

Science.gov (United States)

Zhang, Ya-Nan; Pan, Xiao-Yong; Huang, Yan; Shen, Hong-Bin

2011-08-21

Protein-protein interactions (PPIs) play an important role in biological processes. Although much effort has been devoted to the identification of novel PPIs by integrating experimental biological knowledge, many difficulties remain because sufficient protein structural and functional information is lacking. It is highly desirable to develop methods for predicting PPIs based only on amino acid sequences. However, sequence-based predictors often struggle with high dimensionality, which causes over-fitting and high computational complexity, as well as with the redundancy of sequential feature vectors. In this paper, a novel computational approach based on compressed sensing theory is proposed to predict yeast Saccharomyces cerevisiae PPIs from primary sequence and has achieved promising results. The key advantage of the proposed compressed sensing algorithm is that it can compress the original high-dimensional protein sequential feature vector into a much lower-dimensional but more condensed space, taking the sparsity property of the original signal into account. What makes compressed sensing much more attractive in protein sequence analysis is that the compressed signal can be reconstructed from far fewer measurements than is usually considered necessary in traditional Nyquist sampling theory. Experimental results demonstrate that the proposed compressed sensing method is powerful for analyzing noisy biological data and reducing redundancy in feature vectors. The proposed method represents a new strategy for dealing with high-dimensional discrete protein models and has great potential to be extended to many other complicated biological systems.
CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

KAUST Repository

Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

2016-01-01

Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment
Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction.

Science.gov (United States)

Chira, Camelia; Horvath, Dragos; Dumitrescu, D

2011-07-30

Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm (the hill-climbing mechanism and the diversification strategy) are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
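The objective that such a search minimizes in the two-dimensional HP lattice model can be sketched briefly: each pair of hydrophobic (H) residues that are lattice neighbours but not consecutive in the chain contributes -1 to the energy. The move encoding (U, D, L, R) and the function name are assumptions for illustration; the evolutionary operators themselves are not shown.

```python
def hp_energy(sequence, moves):
    """Energy of a 2D lattice conformation in the HP model.

    sequence: string over {'H', 'P'}.
    moves:    string over {'U', 'D', 'L', 'R'} of length len(sequence) - 1.
    """
    steps = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
    positions = [(0, 0)]
    for move in moves:
        dx, dy = steps[move]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    if len(positions) != len(sequence) or len(set(positions)) != len(positions):
        raise ValueError("moves must describe a self-avoiding walk for the sequence")
    index_at = {pos: i for i, pos in enumerate(positions)}
    energy = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


folded = hp_energy("HPPH", "RUL")    # chain bent into a square: one H-H contact
extended = hp_energy("HPPH", "RRR")  # straight chain: no contacts
```

A hill-climbing operator would propose local changes to `moves` and keep the change that lowers this energy the most.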
Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction

Directory of Open Access Journals (Sweden)

Chira Camelia

2011-07-01

Proteins are complex structures made of amino acids and have a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a minimum-energy protein structure starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm (the hill-climbing mechanism and the diversification strategy) are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult two-dimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.

An update of the DEF database of protein fold class predictions

DEFF Research Database (Denmark)

Reczko, Martin; Karras, Dimitris; Bohr, Henrik

1997-01-01

An update is given on the Database of Expected Fold classes (DEF), which contains a collection of fold-class predictions made from protein sequences and a mail server that provides new predictions for new sequences. For any given sequence, one of 49 fold classes is chosen to classify the structure related to the sequence with high accuracy. The updated prediction system was developed using data from the new version of the 3D-ALI database of aligned protein structures and thus gives more reliable and more detailed predictions than the previous DEF system.
Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

KAUST Repository

Cannistraci, Carlo

2013-06-21

Motivation: Most functions within the cell emerge thanks to protein-protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness. Predicting interactions using only PPI-network topology (topological prediction) is difficult but essential when prior biological knowledge is absent or unreliable. Methods: Network embedding emphasizes the relations between network proteins embedded in a low-dimensional space, in which protein pairs that are closer to each other represent good candidate interactions. To achieve network denoising, which boosts prediction performance, we first applied minimum curvilinear embedding (MCE), and then adopted shortest path (SP) in the reduced space to assign likelihood scores to candidate interactions. Furthermore, we introduce (i) a new valid variation of MCE, named non-centred MCE (ncMCE); (ii) two automatic strategies for selecting the appropriate embedding dimension; and (iii) two new randomized procedures for evaluating predictions. Results: We compared our method against several unsupervised and supervisedly tuned embedding approaches and node neighbourhood techniques. Despite its computational simplicity, ncMCE-SP was the overall leader, outperforming the current methods in topological link prediction. Conclusion: Minimum curvilinearity is a valuable non-linear framework that we successfully applied to the embedding of protein networks for the unsupervised prediction of novel PPIs. The rationale for our approach is that biological and evolutionary information is imprinted in the non-linear patterns hidden behind the protein network topology, and can be exploited for predicting new protein links. The predicted PPIs represent good candidates for testing in high-throughput experiments or for exploitation in systems biology tools such as those used for network-based inference and prediction of disease-related functional modules. The
Predicting the subcellular localization of viral proteins within a mammalian host cell

Directory of Open Access Journals (Sweden)

Thomas DY

2006-04-01

Background: The bioinformatic prediction of protein subcellular localization has been extensively studied for prokaryotic and eukaryotic organisms. However, this is not the case for viruses whose proteins are often involved in extensive interactions at various subcellular localizations with host proteins. Results: Here, we investigate the extent of utilization of human cellular localization mechanisms by viral proteins and we demonstrate that appropriate eukaryotic subcellular localization predictors can be used to predict viral protein localization within the host cell. Conclusion: Such predictions provide a method to rapidly annotate viral proteomes with subcellular localization information. They are likely to have widespread applications both in the study of the functions of viral proteins in the host cell and in the design of antiviral drugs.
A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

Science.gov (United States)

Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

2017-10-01

The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.
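The energy function that HP-model search methods optimize can be illustrated with a minimal sketch. It assumes the common 2D square-lattice convention in which every pair of H residues that are lattice neighbours but not chain neighbours contributes -1; the function name `hp_energy` and the toy conformation are illustrative and are not taken from the abstract above.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D square-lattice HP conformation.

    Assumed convention: -1 for each pair of H residues that sit on
    adjacent lattice sites without being consecutive in the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each lattice edge is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A 4-residue chain folded into a unit square: residues 0 and 3 touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))  # -1
```

A search method such as hill climbing or PSO would call a function like this to score each candidate conformation; lower energy corresponds to a more compact hydrophobic core.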
Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.

Science.gov (United States)

Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal

2015-01-01

Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can further be improved with the use of a custom-designed fuzzy membership function, for the partner-specific PPI interface prediction problem. We evaluated the performances of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes of Homo sapiens, Escherichia coli and Saccharomyces cerevisiae and calculated the statistical significance of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids together with their physico-chemical characteristics is used, where the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under ROC curve) on the test samples in a 10-fold cross-validation experiment is measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms, respectively. Performances on independent test sets are 72.09, 73.24 and 82.74 percent, respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.
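The area under the ROC curve quoted above can be computed directly from classifier scores. The sketch below uses the rank-based (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 1 (interface residue) or 0 (non-interface).
    scores: classifier outputs, higher meaning "more likely positive".
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch; a sort-based implementation would be preferable for large test sets.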
Building a better fragment library for de novo protein structure prediction.

Directory of Open Access Journals (Sweden)

Saulo H P de Oliveira

Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, the exclusion of homologs to the target from the libraries is needed to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. "Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources".
Building a Better Fragment Library for De Novo Protein Structure Prediction

Science.gov (United States)

de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.

2015-01-01

Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated. In particular, the exclusion of homologs to the target from the libraries is needed to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. “Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources”. PMID:25901595
InterMap3D: predicting and visualizing co-evolving protein residues

DEFF Research Database (Denmark)

Oliveira, Rodrigo Gouveia; Roque, Francisco Jose Sousa Simôes Almeida; Wernersson, Rasmus

2009-01-01

InterMap3D predicts co-evolving protein residues and plots them on the 3D protein structure. Starting with a single protein sequence, InterMap3D automatically finds a set of homologous sequences, generates an alignment and fetches the most similar 3D structure from the Protein Data Bank (PDB). It can also accept a user-generated alignment. Based on the alignment, co-evolving residues are then predicted using three different methods: Row and Column Weighing of Mutual Information, Mutual Information/Entropy and Dependency. Finally, InterMap3D generates high-quality images of the protein...
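The mutual-information measures named above all start from the same quantity: the mutual information between two columns of a multiple sequence alignment. The sketch below computes only that plain, unweighted quantity in bits; it does not reproduce the row-and-column weighting or the entropy normalization, and the function name and toy alignment are illustrative assumptions. Gap handling is also left out.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Plain mutual information (bits) between alignment columns i and j.

    alignment: list of equal-length aligned sequences (strings).
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    n = len(pairs)
    count_i = Counter(a for a, _ in pairs)
    count_j = Counter(b for _, b in pairs)
    count_ij = Counter(pairs)

    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


toy_alignment = ["AKL", "AKL", "GEL", "GEL"]
print(column_mutual_information(toy_alignment, 0, 1))  # 1.0 (columns covary)
print(column_mutual_information(toy_alignment, 0, 2))  # 0.0 (column 2 is invariant)
```

Columns 0 and 1 change together in the toy alignment, so they share one full bit of information, while the fully conserved column 2 carries none.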
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches

Directory of Open Access Journals (Sweden)

Lee Sael

2010-12-01

Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over the global pocket shape-based method which was previously developed by our group.
Binding ligand prediction for proteins using partial matching of local surface patches.

Science.gov (United States)

Sael, Lee; Kihara, Daisuke

2010-01-01

Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over the global pocket shape-based method which was previously developed by our group.
CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

KAUST Repository

Cui, Xuefeng

2016-06-15

Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.
Protein 8-class secondary structure prediction using conditional neural fields.

Science.gov (United States)

Wang, Zhiyong; Zhao, Feng; Peng, Jian; Xu, Jinbo

2011-10-01

Compared with the protein 3-class secondary structure (SS) prediction, the 8-class prediction gains less attention and is also much more challenging, especially for proteins with few sequence homologs. This paper presents a new probabilistic method for 8-class SS prediction using conditional neural fields (CNFs), a recently invented probabilistic graphical model. This CNF method not only models the complex relationship between sequence features and SS, but also exploits the interdependency among SS types of adjacent residues. In addition to sequence profiles, our method also makes use of non-evolutionary information for SS prediction. Tested on the CB513 and RS126 data sets, our method achieves Q8 accuracy of 64.9 and 64.7%, respectively, which are much better than the SSpro8 web server (51.0 and 48.0%, respectively). Our method can also be used to predict other structure properties (e.g. solvent accessibility) of a protein or the SS of RNA. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
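The Q8 figures reported above are per-residue accuracies: the percentage of residues whose predicted 8-state label matches the observed one. A minimal sketch follows; the same function gives Q3 when the strings use a 3-letter alphabet. The function name and the example strings are illustrative, not data from the paper.

```python
def q_accuracy(predicted, observed):
    """Per-residue accuracy (percent) between two secondary-structure strings.

    With an 8-state alphabet this is Q8; with a 3-state alphabet it is Q3.
    """
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not observed:
        raise ValueError("strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


observed = "HHHHEEEETT"
predicted = "HHHGEEEETS"  # two residues differ from the observed states
print(q_accuracy(predicted, observed))  # 80.0
```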
Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures

Directory of Open Access Journals (Sweden)

Srinivasan Narayanaswamy

2010-06-01

Background: Dengue virus, along with the other members of the Flaviviridae family, has re-emerged as a deadly human pathogen. Understanding the mechanistic details of these infections can be highly rewarding in developing effective antivirals. During maturation of the virus inside the host cell, the coat proteins E and M undergo conformational changes, altering the morphology of the viral coat. However, due to the low-resolution nature of the available 3-D structures of viral assemblies, the atomic details of these changes are still elusive. Results: In the present analysis, starting from Cα positions of low-resolution cryo-electron microscopic structures, the residue-level details of protein-protein interaction interfaces of dengue virus coat proteins have been predicted. By comparing the preexisting structures of the virus in different phases of its life cycle, the changes taking place in these predicted protein-protein interaction interfaces were followed as a function of the maturation process of the virus. Besides changing the current notion about the presence of only homodimers in the mature viral coat, the present analysis indicated the presence of a proline-rich motif at the protein-protein interaction interface of the coat protein. Investigating the conservation status of these seemingly functionally crucial residues across other members of the Flaviviridae family enabled dissecting common mechanisms used for infections by these viruses. Conclusions: Thus, using a computational approach, the present analysis has provided better insights into the preexisting low-resolution structures of virus assemblies, the findings of which can be made use of in designing effective antivirals against these deadly human pathogens.
Prediction of Protein–Protein Interactions by Evidence Combining Methods

Directory of Open Access Journals (Sweden)

Ji-Wei Chang

2016-11-01

Most cellular functions involve proteins’ features based on their physical interactions with other partner proteins. Sketching a map of protein–protein interactions (PPIs) is therefore an important initial step towards understanding the basics of cell functions. Several experimental techniques operating in vivo or in vitro have made significant contributions to screening a large number of protein interaction partners, especially high-throughput experimental methods. However, computational approaches for PPI prediction supported by rapid accumulation of data generated from experimental techniques, 3D structure definitions, and genome sequencing have boosted the map sketching of PPIs. In this review, we shed light on in silico PPI prediction methods that integrate evidence from multiple sources, including evolutionary relationship, function annotation, sequence/structure features, network topology and text mining. These methods are developed for integration of multi-dimensional evidence, for designing the strategies to predict novel interactions, and for making the results consistent with the increase of prediction coverage and accuracy.
Update on protein structure prediction: results of the 1995 IRBM workshop

DEFF Research Database (Denmark)

Hubbard, Tim; Tramontano, Anna; Hansen, Jan

1996-01-01

Computational tools for protein structure prediction are of great interest to molecular, structural and theoretical biologists due to a rapidly increasing number of protein sequences with no known structure. In October 1995, a workshop was held at IRBM to predict as much as possible about a number...
Predicting pKa for proteins using COSMO-RS

DEFF Research Database (Denmark)

Andersson, Martin Peter; Jensen, Jan Halborg; Stipp, Susan Louise Svane

2013-01-01

We have used the COSMO-RS implicit solvation method to calculate the equilibrium constants, pKa, for deprotonation of the acidic residues of the ovomucoid inhibitor protein, OMTKY3. The root mean square error for comparison with experimental data is only 0.5 pH units and the maximum error 0.8 pH units. The results show that the accuracy of pKa prediction using COSMO-RS is as good for large biomolecules as it is for smaller inorganic and organic acids and that the method compares very well to previous pKa predictions of the OMTKY3 protein using Quantum Mechanics/Molecular Mechanics. Our approach...
Knowledge base and neural network approach for protein secondary structure prediction.

Science.gov (United States)

Patel, Maulika S; Mazumdar, Himanshu S

2014-11-21

Protein structure prediction is of great relevance given the abundant genomic and proteomic data generated by the genome sequencing projects. Protein secondary structure prediction is addressed as a subtask in determining the protein tertiary structure and function. In this paper, a novel algorithm, KB-PROSSP-NN, which is a combination of a knowledge base and modeling of the exceptions in the knowledge base using neural networks for protein secondary structure prediction (PSSP), is proposed. The knowledge base is derived from a proteomic sequence-structure database and consists of the statistics of association between the 5-residue words and the corresponding secondary structure. The predicted results obtained using the knowledge base are refined with a backpropagation neural network algorithm. The neural net models the exceptions of the knowledge base. A Q3 accuracy of 90% and 82% is achieved on the RS126 and CB396 test sets, respectively, which suggests an improvement over existing state-of-the-art methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
ComplexContact: a web server for inter-protein contact prediction using deep learning

KAUST Repository

Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

2018-01-01

ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA); it then applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from the paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.
ComplexContact: a web server for inter-protein contact prediction using deep learning.

Science.gov (United States)

Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

2018-05-22

ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA); it then applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from the paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.
Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions.

Science.gov (United States)

Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio

2013-01-27

A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
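The "similarity of phylogenetic trees" used by this family of methods can be sketched as a correlation between pairwise distances. The example below assumes the usual mirrortree-style setup: two proteins, each with a symmetric matrix of evolutionary distances between its orthologs, listed in the same species order, with the score being the Pearson correlation of the matrices' upper triangles. The function name and the toy matrices are illustrative assumptions; restricting the distances to predicted-accessible positions, as the study does, is not shown.

```python
import math


def tree_similarity(dist_a, dist_b):
    """Pearson correlation between the upper triangles of two distance matrices.

    dist_a, dist_b: square, symmetric matrices (lists of lists) of
    inter-ortholog distances, with rows/columns in the same species order.
    """
    n = len(dist_a)
    if n != len(dist_b) or n < 3:
        raise ValueError("need two matrices over the same >= 3 species")

    xs = [dist_a[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [dist_b[i][j] for i in range(n) for j in range(i + 1, n)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("distances must not be constant")
    return cov / math.sqrt(var_x * var_y)


protein_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
protein_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]  # same tree shape, twice the rate
print(tree_similarity(protein_a, protein_b))  # 1.0
```

A score near 1 indicates that the two protein families have evolved in parallel across the shared species, which these methods take as evidence of a possible interaction.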
Protein feeding and balancing for amino acids in lactating dairy cattle.

Science.gov (United States)

Patton, Robert A; Hristov, Alexander N; Lapierre, Hélène

2014-11-01

This article summarizes the current literature as regards metabolizable protein (MP) and essential amino acid (EAA) nutrition of dairy cattle. Emphasis has been placed on research since the publication of the National Research Council Nutrient Requirements of Dairy Cattle, Seventh Revised Edition (2001). Postruminal metabolism of EAA is discussed in terms of the effect on requirements. This article suggests methods for practical application of MP and EAA balance in milking dairy cows. Copyright © 2014 Elsevier Inc. All rights reserved.
A Kernel for Protein Secondary Structure Prediction

OpenAIRE

Guermeur, Yann; Lifchitz, Alain; Vert, Régis

2004-01-01

http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10338&mode=toc; Multi-class support vector machines have already proved efficient in protein secondary structure prediction as ensemble methods, to combine the outputs of sets of classifiers based on different principles. In this chapter, their implementation as basic prediction methods, processing the primary structure or the profile of multiple alignments, is investigated. A kernel devoted to the task is in...
MU-LOC: A Machine-Learning Method for Predicting Mitochondrially Localized Proteins in Plants

Directory of Open Access Journals (Sweden)

Ning Zhang

2018-05-01

Targeting and translocation of proteins to the appropriate subcellular compartments are crucial for cell organization and function. Newly synthesized proteins are transported to mitochondria with the assistance of complex targeting sequences containing either an N-terminal pre-sequence or a multitude of internal signals. Compared with experimental approaches, computational predictions provide an efficient way to infer subcellular localization of a protein. However, it is still challenging to predict plant mitochondrially localized proteins accurately due to various limitations. Consequently, the performance of current tools can be improved with new data and new machine-learning methods. We present MU-LOC, a novel computational approach for large-scale prediction of plant mitochondrial proteins. We collected a comprehensive dataset of plant subcellular localization, extracted features including amino acid composition, protein position weight matrix, and gene co-expression information, and trained predictors using deep neural networks and support vector machines. Benchmarked on two independent datasets, MU-LOC achieved substantial improvements over six state-of-the-art tools for plant mitochondrial targeting prediction. In addition, MU-LOC has the advantage of predicting plant mitochondrial proteins either possessing or lacking N-terminal pre-sequences. We applied MU-LOC to predict candidate mitochondrial proteins for the whole proteome of Arabidopsis and potato. MU-LOC is publicly available at http://mu-loc.org.
I-TASSER server for protein 3D structure prediction

Directory of Open Access Journals (Sweden)

Zhang Yang

2008-01-01

Background: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state of the art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions. Results: An on-line version of I-TASSER has been developed at the KU Center for Bioinformatics, which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score), based on the relative clustering structural density and the consensus significance score of multiple threading templates, is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models, with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD. Conclusion: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations. The I-TASSER server is freely available
Improving the accuracy of protein secondary structure prediction using structural alignment

Directory of Open Access Journals (Sweden)

Gallin Warren J

2006-06-01

Background: The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high. Results: We have developed a method that performs structure-based sequence alignments as part of the secondary structure prediction process. By mapping the structure of a known homologue (sequence ID >25%) onto the query protein's sequence, it is possible to predict at least a portion of that query protein's secondary structure. By integrating this structural alignment approach with conventional (sequence-based) secondary structure methods and then combining it with a "jury-of-experts" system to generate a consensus result, it is possible to attain very high prediction accuracy. Using a sequence-unique test set of 1644 proteins from EVA, this new method achieves an average Q3 score of 81.3%. Extensive testing indicates this is approximately 4–5% better than any other method currently available. Assessments using non sequence-unique test sets (typical of those used in proteome annotation or structural genomics) indicate that this new method can achieve a Q3 score approaching 88%. Conclusion: By using both sequence and structure databases and by exploiting the latest techniques in machine learning it is possible to routinely predict protein secondary structure with an accuracy well above 80%. A program and web server, called PROTEUS, that performs these secondary structure predictions is accessible at http://wishart.biology.ualberta.ca/proteus. For high throughput or batch sequence analyses, the PROTEUS programs
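The Q3 figures quoted in this record can be read as the percentage of residues assigned to the correct one of three states. The sketch below is only an illustration of that metric, assuming the common helix/strand/coil labels H, E and C; the function name and the two toy strings are hypothetical and are not taken from PROTEUS or EVA.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose 3-state label matches.

    Labels are assumed to be 'H' (helix), 'E' (strand) and 'C' (coil);
    any consistent three-letter alphabet would work the same way.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical toy example: 17 residues, 2 of them mispredicted.
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")
```

Averaging this per-protein value over a test set gives a figure comparable in kind to the averages reported above, although published benchmarks may differ in how they reduce eight-state assignments to three states.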
Incorporating functional inter-relationships into protein function prediction algorithms

Directory of Open Access Journals (Sweden)

Kumar Vipin

2009-05-01

Background: Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder-to-learn distinction between similar GO terms, for standard classification-based approaches. Results: We propose a method to enhance the performance of classification-based protein function prediction algorithms by addressing the issue of using these inter-relationships between functional classes constituting functional classification schemes. Using a standard measure for evaluating the semantic similarity between nodes in an ontology, we quantify and incorporate these inter-relationships into the k-nearest neighbor classifier. We present experiments on several large genomic data sets, each of which is used for the modeling and prediction of over a hundred classes from the GO Biological Process ontology. The results show that this incorporation produces more accurate predictions for a large number of the functional classes considered, and also that the classes that benefit most from this approach are those containing the fewest members. In addition, we show how our proposed framework can be used for integrating information from the entire GO hierarchy for improving the accuracy of
Composição química, digestibilidade e predição dos valores energéticos da farinha de carne e ossos para suínos = Chemical composition, digestibility and prediction of the energy values of meat and bone meal for swine

Directory of Open Access Journals (Sweden)

Paulo Cesar Pozza

2008-01-01

The objective of this study was to determine the chemical and energetic composition of six different meat and bone meals, and to develop prediction equations for digestible and metabolizable energy based on the chemical composition of the feeds. To determine the digestible and metabolizable energy values, 28 crossbred swine were used (castrated males, averaging 25.90 ± 1.95 kg initial weight), allotted in a randomized block design with seven treatments, four replicates and one animal per experimental unit. The treatments consisted of a basal diet and six meat and bone meals, each of which replaced 20% of the basal diet. The digestible and metabolizable energy values varied from 1717 to 2908 kcal/kg and from 1519 to 2608 kcal/kg, respectively. The prediction equations for digestible and metabolizable energy that presented the highest R² for meat and bone meal were: DE = 1196.11 + 44.18 CP – 121.55 P and ME = 2103.35 + 22.56 CP – 164.02 P.
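The two regression equations reported in this abstract can be evaluated directly. The sketch below simply encodes them; it assumes that CP (crude protein) and P (presumably phosphorus) are expressed as percentages of the feed, which the abstract does not state, and the sample composition is hypothetical rather than one of the six meals studied.

```python
def digestible_energy(cp: float, p: float) -> float:
    """DE in kcal/kg from the reported equation DE = 1196.11 + 44.18 CP - 121.55 P."""
    return 1196.11 + 44.18 * cp - 121.55 * p


def metabolizable_energy(cp: float, p: float) -> float:
    """ME in kcal/kg from the reported equation ME = 2103.35 + 22.56 CP - 164.02 P."""
    return 2103.35 + 22.56 * cp - 164.02 * p


# Hypothetical meat and bone meal: 45% CP and 6% P (units are an assumption).
cp_percent, p_percent = 45.0, 6.0
print(f"DE = {digestible_energy(cp_percent, p_percent):.2f} kcal/kg")
print(f"ME = {metabolizable_energy(cp_percent, p_percent):.2f} kcal/kg")
```

For this made-up composition both values fall inside the ranges reported in the abstract (1717 to 2908 kcal/kg for DE and 1519 to 2608 kcal/kg for ME), which is a useful sanity check on the assumed units but not a confirmation of them.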
Prediction of essential proteins based on subcellular localization and gene expression correlation.

Science.gov (United States)

Fan, Yetian; Tang, Xiwei; Hu, Xiaohua; Wu, Wei; Ping, Qing

2017-12-01

Essential proteins are indispensable to the survival and development process of living organisms. To understand the functional mechanisms of essential proteins, which can be applied to the analysis of disease and design of drugs, it is important to identify essential proteins from a set of proteins first. As traditional experimental methods designed to test out essential proteins are usually expensive and laborious, computational methods, which utilize biological and topological features of proteins, have attracted more attention in recent years. Protein-protein interaction networks, together with other biological data, have been explored to improve the performance of essential protein prediction. The proposed method SCP is evaluated on Saccharomyces cerevisiae datasets and compared with five other methods. The results show that our method SCP outperforms the other five methods in terms of accuracy of essential protein prediction. In this paper, we propose a novel algorithm named SCP, which combines the ranking by a modified PageRank algorithm based on subcellular compartments information, with the ranking by Pearson correlation coefficient (PCC) calculated from gene expression data. Experiments show that subcellular localization information is promising in boosting essential protein prediction.
Parallel protein secondary structure prediction based on neural networks.

Science.gov (United States)

Zhong, Wei; Altun, Gulsah; Tian, Xinmin; Harrison, Robert; Tai, Phang C; Pan, Yi

2004-01-01

Protein secondary structure prediction has a fundamental influence on today's bioinformatics research. In this work, binary and tertiary classifiers for protein secondary structure prediction are implemented on the Denoeux belief neural network (DBNN) architecture. A hydrophobicity matrix, an orthogonal matrix, BLOSUM62 and PSSM (position-specific scoring matrix) are tested separately as encoding schemes for DBNN. The experimental results contribute to the design of new encoding schemes. A new binary classifier for Helix versus not Helix (~H) for DBNN produces a prediction accuracy of 87% when PSSM is used for the input profile. The performance of the DBNN binary classifier is comparable to that of the best other prediction methods. The good test results for binary classifiers open a new approach for protein structure prediction with neural networks. Because training the neural networks is time-consuming, Pthread and OpenMP are employed to parallelize DBNN on the hyperthreading-enabled Intel architecture. The speedup for 16 Pthreads is 4.9 and the speedup for 16 OpenMP threads is 4 on the 4-processor shared-memory architecture. The speedups of both OpenMP and Pthread are superior to those reported in other research. With the new parallel training algorithm, thousands of amino acids can be processed in a reasonable amount of time. Our research also shows that hyperthreading technology for Intel architecture is efficient for parallel biological algorithms.
Using support vector machine to predict beta- and gamma-turns in proteins.

Science.gov (United States)

Hu, Xiuzhen; Li, Qianzhong

2008-09-01

By using the composite vector with increment of diversity, position conservation scoring function, and predictive secondary structures to express the information of sequence, a support vector machine (SVM) algorithm for predicting beta- and gamma-turns in proteins is proposed. The 426 and 320 nonhomologous protein chains described by Guruprasad and Rajkumar (Guruprasad and Rajkumar J. Biosci 2000, 25,143) are used for training and testing the predictive model of the beta- and gamma-turns, respectively. The overall prediction accuracy and the Matthews correlation coefficient in 7-fold cross-validation are 79.8% and 0.47, respectively, for the beta-turns. The overall prediction accuracy in 5-fold cross-validation is 61.0% for the gamma-turns. These results are significantly higher than those of other algorithms for the prediction of beta- and gamma-turns on the same datasets. In addition, the 547 and 823 nonhomologous protein chains described by Fuchs and Alix (Fuchs and Alix Proteins: Struct Funct Bioinform 2005, 59, 828) are used for training and testing the predictive model of the beta- and gamma-turns, and better results are obtained. This algorithm may help to improve the performance of protein turn prediction. To ensure the ability of the SVM method to correctly classify beta-turn and non-beta-turn (gamma-turn and non-gamma-turn), the receiver operating characteristic threshold independent measure curves are provided. (c) 2008 Wiley Periodicals, Inc.
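Overall accuracy and the Matthews correlation coefficient (MCC), the two measures quoted in this record, are both functions of a binary confusion matrix. The following is a minimal sketch of the standard formulas; the counts are hypothetical and are not the study's data.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a turn / non-turn classifier.
tp, tn, fp, fn = 60, 140, 30, 20
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Because turns are a minority class, accuracy alone can look high even for a weak predictor, which is one reason turn-prediction studies usually report MCC alongside it.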
Plasma proteins predict conversion to dementia from prodromal disease.

Science.gov (United States)

Hye, Abdul; Riddoch-Contreras, Joanna; Baird, Alison L; Ashton, Nicholas J; Bazenet, Chantal; Leung, Rufina; Westman, Eric; Simmons, Andrew; Dobson, Richard; Sattlecker, Martina; Lupton, Michelle; Lunnon, Katie; Keohane, Aoife; Ward, Malcolm; Pike, Ian; Zucht, Hans Dieter; Pepin, Danielle; Zheng, Wei; Tunnicliffe, Alan; Richardson, Jill; Gauthier, Serge; Soininen, Hilkka; Kłoszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon

2014-11-01

The study aimed to validate previously discovered plasma biomarkers associated with AD, using a design based on imaging measures as surrogate for disease severity and assess their prognostic value in predicting conversion to dementia. Three multicenter cohorts of cognitively healthy elderly, mild cognitive impairment (MCI), and AD participants with standardized clinical assessments and structural neuroimaging measures were used. Twenty-six candidate proteins were quantified in 1148 subjects using multiplex (xMAP) assays. Sixteen proteins correlated with disease severity and cognitive decline. Strongest associations were in the MCI group with a panel of 10 proteins predicting progression to AD (accuracy 87%, sensitivity 85%, and specificity 88%). We have identified 10 plasma proteins strongly associated with disease severity and disease progression. Such markers may be useful for patient selection for clinical trials and assessment of patients with predisease subjective memory complaints. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Enhancing the prediction of protein pairings between interacting families using orthology information

Directory of Open Access Journals (Sweden)

Pazos Florencio

2008-01-01

Background: It has repeatedly been shown that interacting protein families tend to have similar phylogenetic trees. These similarities can be used to predict the mapping between two families of interacting proteins (i.e. which proteins from one family interact with which members of the other). The correct mapping will be that which maximizes the similarity between the trees. The two families may eventually comprise orthologs and paralogs, if members of the two families are present in more than one organism. This fact can be exploited to restrict the possible mappings, simply by forbidding links between proteins of different organisms. We present here an algorithm to predict the mapping between families of interacting proteins which is able to incorporate information regarding orthologues, or any other assignment of proteins to "classes" that may restrict possible mappings. Results: For the first time in methods for predicting mappings, we have tested this new approach on a large number of interacting protein domains in order to statistically assess its performance. The method accurately predicts around 80% of the pairings in the most favourable cases. We also analysed in detail the results of the method for a well-defined case of interacting families, the sensor and kinase components of the Ntr-type two-component system, for which up to 98% of the pairings predicted by the method were correct. Conclusion: Based on the well-established relationship between tree similarity and interactions, we developed a method for predicting the mapping between two interacting families using genomic information alone. The program is available through a web interface.
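The idea of choosing the mapping that maximizes tree similarity, while only allowing links within the same organism, can be illustrated with a brute-force search. This is not the authors' algorithm: it assumes tree similarity is measured as the Pearson correlation between the two families' pairwise distance matrices, it enumerates every permutation (feasible only for very small families), and all matrices, species labels and function names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def upper_triangle(dist, order):
    """Flatten the upper triangle of dist after reordering its rows/columns."""
    n = len(order)
    return [dist[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def best_mapping(dist_a, dist_b, species_a, species_b):
    """Return (perm, score) where protein i of family A maps to perm[i] of family B."""
    n = len(dist_a)
    reference = upper_triangle(dist_a, list(range(n)))
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        # Restrict the search: only pair proteins from the same organism.
        if any(species_a[i] != species_b[perm[i]] for i in range(n)):
            continue
        score = pearson(reference, upper_triangle(dist_b, list(perm)))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


# Hypothetical families with two paralogs in each of two organisms, X and Y.
species_a = ["X", "X", "Y", "Y"]
species_b = ["X", "X", "Y", "Y"]
dist_a = [[0, 2, 5, 6], [2, 0, 7, 9], [5, 7, 0, 3], [6, 9, 3, 0]]
dist_b = [[0, 2, 7, 9], [2, 0, 5, 6], [7, 5, 0, 3], [9, 6, 3, 0]]
perm, score = best_mapping(dist_a, dist_b, species_a, species_b)
print(perm, round(score, 3))
```

The species restriction cuts the 24 possible permutations of this toy case down to 4, which shows in miniature why class information makes the search both faster and less ambiguous.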
Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction

Directory of Open Access Journals (Sweden)

Fofanov Viacheslav Y

2010-05-01

Background: Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels. Results: This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs. Conclusions: FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated
Computational Prediction of Human Salivary Proteins from Blood Circulation and Application to Diagnostic Biomarker Identification

Science.gov (United States)

Wang, Jiaxin; Liang, Yanchun; Wang, Yan; Cui, Juan; Liu, Ming; Du, Wei; Xu, Ying

2013-01-01

Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, some of which are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physiochemical and sequence features we found to be discerning between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained based on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data may not be highly reliable (i.e., proteins predicted to be not in saliva), we have also trained a ranking method, aiming to rank the known salivary proteins from circulation as the highest among the proteins in the general background, based on the same features. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with the information of differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer. PMID:24324552
A large-scale evaluation of computational protein function prediction

NARCIS (Netherlands)

Radivojac, P.; Clark, W.T.; Oron, T.R.; Schnoes, A.M.; Wittkop, T.; Kourmpetis, Y.A.I.; Dijk, van A.D.J.; Friedberg, I.

2013-01-01

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be
HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features.

Science.gov (United States)

Zaman, Rianon; Chowdhury, Shahana Yasmin; Rashid, Mahmood A; Sharma, Alok; Dehzangi, Abdollah; Shatabda, Swakkhar

2017-01-01

DNA-binding proteins often play an important role in various processes within the cell. Over the last decade, a wide range of classification algorithms and feature extraction techniques have been applied to the problem of identifying them. In this paper, we propose a novel DNA-binding protein prediction method called HMMBinder. HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM profile based features for the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a classification technique in HMMBinder. Our method was tested on standard benchmark datasets. We experimentally show that our method outperforms the state-of-the-art methods found in the literature.
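Monogram and bigram features turn a variable-length profile into a fixed-length vector. The sketch below shows one common way such features are defined for a profile matrix with one row per residue and one column per amino acid; the abstract does not give HMMBinder's exact definition, so the normalization by sequence length and the function names are assumptions, and the three-column toy profile stands in for a real 20-column HMM profile.

```python
def monogram(profile):
    """Column-wise mean of the profile: one feature per amino-acid column."""
    length, width = len(profile), len(profile[0])
    return [sum(row[j] for row in profile) / length for j in range(width)]


def bigram(profile):
    """For each ordered column pair (a, b), sum profile[i][a] * profile[i+1][b]
    over consecutive positions, then divide by the sequence length (an assumption)."""
    length, width = len(profile), len(profile[0])
    features = []
    for a in range(width):
        for b in range(width):
            total = sum(profile[i][a] * profile[i + 1][b] for i in range(length - 1))
            features.append(total / length)
    return features


# Hypothetical toy profile: 3 residues x 3 columns.
toy_profile = [
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.2, 0.3, 0.5],
]
feature_vector = monogram(toy_profile) + bigram(toy_profile)
print(len(feature_vector))
```

With a real 20-column profile the same construction gives 20 monogram plus 400 bigram values per protein, a fixed-size input of the kind an SVM requires regardless of sequence length.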
Automatic generation of bioinformatics tools for predicting protein-ligand binding sites.

Science.gov (United States)

Komiyama, Yusuke; Banno, Masaki; Ueki, Kokoro; Saad, Gul; Shimizu, Kentaro

2016-03-15

Predictive tools that model protein-ligand binding on demand are needed to promote ligand research in an innovative drug-design environment. However, it takes considerable time and effort to develop predictive tools that can be applied to individual ligands. An automated production pipeline that can rapidly and efficiently develop user-friendly protein-ligand binding predictive tools would be useful. We developed a system for automatically generating protein-ligand binding predictions. Implementation of this system in a pipeline of Semantic Web technique-based web tools will allow users to specify a ligand and receive the tool within 0.5-1 day. We demonstrated high prediction accuracy for three machine learning algorithms and eight ligands. The source code and web application are freely available for download at http://utprot.net. They are implemented in Python and supported on Linux. Contact: shimizu@bi.a.u-tokyo.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

Distance matrix-based approach to protein structure prediction.

Science.gov (United States)

Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

2009-03-01

Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $C$, which has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed spectral decomposition of the distance matrices, $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$, and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because r is highly correlated with the hydrophobic profile of the sequence. Moreover, r is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the
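The statement that the decomposition has at most five nonzero terms follows from writing each squared distance as |x_i|² + |x_j|² − 2 x_i·x_j, a sum of two rank-one matrices and one matrix of rank at most three. The sketch below checks this numerically; rather than the spectral decomposition used by the authors, it computes the exact rank by Gaussian elimination over rationals, and the random integer coordinates are hypothetical stand-ins for residue positions.

```python
from fractions import Fraction
import random


def squared_distance_matrix(points):
    """D[i][j] = squared Euclidean distance between points i and j."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]


def exact_rank(matrix):
    """Rank of a matrix via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            if rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


# Hypothetical "structure": 30 points with integer 3D coordinates.
random.seed(0)
points = [tuple(random.randint(-20, 20) for _ in range(3)) for _ in range(30)]
D = squared_distance_matrix(points)
print("rank of the 30 x 30 squared distance matrix:", exact_rank(D))
```

Exact arithmetic is used here only to avoid choosing a numerical tolerance for "nonzero"; with floating-point coordinates one would instead count eigenvalues above a small threshold.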
Prediction of Hydrophobic Cores of Proteins Using Wavelet Analysis.

Science.gov (United States)

Hirakawa; Kuhara

1997-01-01

Information concerning the secondary structures, flexibility, epitope and hydrophobic regions of amino acid sequences can be extracted by assigning physicochemical indices to each amino acid residue, and information on structure can be derived using the sliding window averaging technique, which is in wide use for smoothing out raw functions. Wavelet analysis has shown great potential and applicability in many fields, such as astronomy, radar, earthquake prediction, and signal or image processing. This approach is efficient for removing noise from various functions. Here we employed wavelet analysis to smooth out a plot assigned to a hydrophobicity index for amino acid sequences. We then used the resulting function to predict hydrophobic cores in globular proteins. We calculated the prediction accuracy for the hydrophobic cores of a representative set of 88 proteins. Use of wavelet analysis made feasible the prediction of hydrophobic cores at 6.13% greater accuracy than the sliding window averaging technique.
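The sliding window averaging technique that serves as the baseline in this record can be sketched in a few lines. The abstract does not say which hydrophobicity index was used, so the Kyte-Doolittle scale below is an assumption, as are the window width and the example sequence; the wavelet smoothing itself is not reproduced here.

```python
# Kyte-Doolittle hydropathy values (assumed index; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_average(sequence: str, window: int = 7):
    """Mean hydropathy over each run of `window` consecutive residues.

    Returns len(sequence) - window + 1 values, or an empty list if the
    sequence is shorter than the window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical peptide: a hydrophobic stretch followed by a charged one.
profile = sliding_window_average("IIIKKK", window=3)
print([round(v, 2) for v in profile])
```

Positions whose smoothed value stays high over a stretch are the usual candidates for buried, hydrophobic regions; the window width trades noise suppression against blurring of short features, which is the limitation the wavelet approach is meant to address.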
(PS)2: protein structure prediction server version 3.0.

Science.gov (United States)

Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh

2015-07-01

Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)² web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure can be displayed and visualized by Java-based 3D graphics viewers, and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)² server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)² server is freely available at http://ps2v3.life.nctu.edu.tw/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Predicting protein structures with a multiplayer online game.

Science.gov (United States)

Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

2010-08-05

People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.
An ensemble method for predicting subnuclear localizations from primary protein structures.

Directory of Open Access Journals (Sweden)

Guo Sheng Han

BACKGROUND: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information, including Gene Ontology annotations and kernel fusion, have their respective limitations. The aim of this work is twofold: one is to propose a novel individual feature extraction method; the other is to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of a high-dimensional feature vector obtained by 11 feature extraction methods. METHODOLOGY/PRINCIPAL FINDINGS: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. CONCLUSIONS: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method
GRIP: A web-based system for constructing Gold Standard datasets for protein-protein interaction prediction

Directory of Open Access Journals (Sweden)

Zheng Huiru

2009-01-01

Full Text Available Abstract Background: Information about protein interaction networks is fundamental to understanding protein function and cellular processes. Interaction patterns among proteins can suggest new drug targets and aid in the design of new therapeutic interventions. Efforts have been made to map interactions on a proteome-wide scale using both experimental and computational techniques. Reference datasets that contain known interacting proteins (positive cases) and non-interacting proteins (negative cases) are essential to support computational prediction and validation of protein-protein interactions. Information on known interacting and non-interacting proteins is usually stored within databases. Extraction of these data can be both complex and time consuming. Although the automatic construction of reference datasets for classification is a useful resource for researchers, no public resource currently exists to perform this task. Results: GRIP (Gold Reference dataset constructor from Information on Protein complexes) is a web-based system that provides researchers with the functionality to create reference datasets for protein-protein interaction prediction in Saccharomyces cerevisiae. Both positive and negative cases for a reference dataset can be extracted, organised and downloaded by the user. GRIP also provides an upload facility whereby users can submit proteins to determine protein complex membership. A search facility is provided where a user can search for protein complex information in Saccharomyces cerevisiae. Conclusion: GRIP was developed to retrieve information on protein complexes, cellular localisation, and physical and genetic interactions in Saccharomyces cerevisiae. Manual construction of reference datasets can be a time-consuming process requiring programming knowledge. GRIP simplifies and speeds up this process by allowing users to automatically construct reference datasets.
GRIP is free to access at http://rosalind.infj.ulst.ac.uk/GRIP/.
Feature Selection and the Class Imbalance Problem in Predicting Protein Function from Sequence

NARCIS (Netherlands)

Al-Shahib, A.; Breitling, R.; Gilbert, D.

2005-01-01

Abstract: When the standard approach to predict protein function by sequence homology fails, other alternative methods can be used that require only the amino acid sequence for predicting function. One such approach uses machine learning to predict protein function directly from amino acid sequence
PRmePRed: A protein arginine methylation prediction tool.

Directory of Open Access Journals (Sweden)

Pawan Kumar

Full Text Available Protein methylation is an important post-translational modification (PTM) of proteins. Arginine methylation carries out and regulates several important biological functions, including gene regulation and signal transduction. Experimental identification of arginine methylation sites is a daunting task, as it is costly as well as time and labour intensive. Hence, reliable prediction tools play an important role in rapid screening and identification of possible methylation sites in proteomes. Our preliminary assessment using the available prediction methods on collected data yielded unimpressive results. This motivated us to perform a comprehensive data analysis and appraisal of features relevant in the context of biological significance, which led to the development of a prediction tool, PRmePRed, with better performance. PRmePRed performs reasonably well, with an accuracy of 84.10%, 82.38% sensitivity, 83.77% specificity, and a Matthews correlation coefficient of 66.20% in 10-fold cross-validation. PRmePRed is freely available at http://bioinfo.icgeb.res.in/PRmePRed/.
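The four performance figures quoted in the abstract above (accuracy, sensitivity, specificity, Matthews correlation coefficient) all derive from one binary confusion matrix. The following is a minimal sketch of those standard definitions. The function name and the counts are hypothetical and are not taken from the PRmePRed paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Assumes both classes are present (tp + fn > 0, tn + fp > 0).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # fraction of real sites that are predicted
    specificity = tn / (tn + fp)  # fraction of non-sites that are rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is defined as 0 when any marginal total is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Hypothetical counts for illustration only (not data from the paper)
metrics = binary_metrics(tp=80, fp=15, tn=85, fn=20)
```

MCC ranges from -1 to 1. The abstract reports it as a percentage, which corresponds to this value multiplied by 100.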
Electrostatics, structure prediction, and the energy landscapes for protein folding and binding.

Science.gov (United States)

Tsai, Min-Yeh; Zheng, Weihua; Balamurugan, D; Schafer, Nicholas P; Kim, Bobby L; Cheung, Margaret S; Wolynes, Peter G

2016-01-01

While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged; however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, as well as membranes. © 2015 The Protein Society.
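For readers unfamiliar with the Debye-Hückel form mentioned in the abstract above, the sketch below evaluates a generic screened Coulomb pair energy. It is not the AWSEM implementation. The function name, the dielectric constant of 80 and the 10 Å screening length are assumed illustrative defaults, not parameters quoted from the paper.

```python
import math

# Coulomb constant in common molecular-simulation units: kcal*Angstrom/(mol*e^2)
COULOMB_KCAL = 332.0637

def debye_huckel_energy(q_i, q_j, r, dielectric=80.0, screening_length=10.0):
    """Screened Coulomb energy in kcal/mol.

    q_i, q_j: charges in units of the elementary charge.
    r: separation in Angstrom (must be positive).
    dielectric, screening_length: assumed illustrative values (see lead-in).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    coulomb = COULOMB_KCAL * q_i * q_j / (dielectric * r)
    # The exponential factor models ionic screening of the bare interaction
    return coulomb * math.exp(-r / screening_length)

# Opposite unit charges at 10 Angstrom: attractive (negative) energy
pair_energy = debye_huckel_energy(+1.0, -1.0, 10.0)
```

The exponential damping makes the interaction decay faster than a bare 1/r potential, so the effective range is set by the screening length.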
A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

KAUST Repository

Chen, Peng; Hu, ShanShan; Zhang, Jun; Gao, Xin; Li, Jinyan; Xia, Junfeng; Wang, Bing

2015-12-03

Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures
Adjustment of the total excreta collection method for metabolizable energy determination of broiler chicken feedstuffs: consideration of vitamin and micro-mineral levels in the test diet

Directory of Open Access Journals (Sweden)

Valdir Silveira de Avila

2006-08-01

Full Text Available This study evaluated the influence of the vitamin and micromineral content of the test diet on the determination of apparent metabolizable energy (AME) and nitrogen-corrected apparent metabolizable energy (AMEn) values of soybean meal. Test diets that were or were not adjusted for the amounts of choline chloride and vitamin and micromineral premix relative to the reference diet were compared. The traditional total excreta collection method was used with 360 male and female Ross broiler chicks from 15 to 23 days of age, housed in metal battery cages with excreta collection trays. The birds were allotted in a randomized block design, according to battery tier, with two treatments and 12 replicates of ten birds (five males and five females). In one treatment, 40% of the reference diet was replaced with soybean meal; in the other, in addition to this replacement, the amounts of choline chloride and of the vitamin and micromineral premixes were adjusted based on the reference diet. The mean values and respective standard errors for AME and AMEn (kcal/kg) of soybean meal, on an as-fed basis, were 2,462 ± 29.62 and 2,269 ± 25.80 for the adjusted diet and 2,353 ± 26.18 and 2,191 ± 23.88 for the unadjusted diet. Adjusting the amounts of choline chloride and vitamin and micromineral premix in the test diet resulted in higher AME and AMEn values for soybean meal than the unadjusted diet. It is important to adjust the amounts of vitamins and microminerals in test diets in experiments aimed at determining the metabolizable energy of ingredients for poultry.
Energy and protein requirements for male Japanese quails reared for meat production

Directory of Open Access Journals (Sweden)

N.T.E. Oliveira

2002-04-01

Full Text Available The experiment aimed to estimate the crude protein (CP) and metabolizable energy (ME) requirements for maximum performance of male Japanese quails reared for meat production, and to determine the slaughter age resulting in maximum bird weight. Four hundred and fifty quails were used in a randomized block design, with five replicates of six quails per experimental unit. The treatments, diets formulated from the combination of five CP levels (18, 20, 22, 24 and 26%) and three ME levels (2800, 3000 and 3200 kcal/kg of diet), were allocated to the plots and the four experimental periods to the subplots. The variables studied during the four periods (5 to 16, 16 to 27, 27 to 38 and 38 to 49 days of age) were feed intake, weight gain, feed conversion and live weight. For the total period (5 to 49 days of age), cumulative feed intake, cumulative weight gain and cumulative feed conversion were studied. The estimated CP and ME requirements during the first (5 to 16 days) and third (27 to 38 days) periods were 26% and 2800 kcal/kg of diet, and 18% and 3200 kcal/kg of diet, respectively. The CP requirement in the second period (16 to 27 days of age) was 26%; the energy requirement could not be estimated. In the fourth experimental period (38 to 49 days of age), the estimated ME requirement was 3200 kcal/kg of diet; the protein requirement could not be estimated. For the total period, the protein and energy requirements estimated for maximum cumulative weight gain were 26% and 3200 kcal/kg of diet, respectively. The estimated ages resulting in maximum live weight depended on the dietary CP and ME levels, ranging from 57 to 85 and 60 to 74 days, respectively.
Supervised maximum-likelihood weighting of composite protein networks for complex prediction

Directory of Open Access Journals (Sweden)

Yong Chern Han

2012-12-01

Full Text Available Abstract Background: Protein complexes participate in many important cellular functions, so finding the set of existent complexes is essential for understanding the organization and regulation of processes in the cell. With the availability of large amounts of high-throughput protein-protein interaction (PPI) data, many algorithms have been proposed to discover protein complexes from PPI networks. However, such approaches are hindered by the high rate of noise in high-throughput PPI data, including spurious and missing interactions. Furthermore, many transient interactions are detected between proteins that are not from the same complex, while not all proteins from the same complex may actually interact. As a result, predicted complexes often do not match true complexes well, and many true complexes go undetected. Results: We address these challenges by integrating PPI data with other heterogeneous data sources to construct a composite protein network, and using a supervised maximum-likelihood approach to weight each edge based on its posterior probability of belonging to a complex. We then use six different clustering algorithms, and an aggregative clustering strategy, to discover complexes in the weighted network. We test our method on Saccharomyces cerevisiae and Homo sapiens, and show that complex discovery is improved: compared to previously proposed supervised and unsupervised weighting approaches, our method recalls more known complexes, achieves higher precision at all recall levels, and generates novel complexes of greater functional similarity. Furthermore, our maximum-likelihood approach allows learned parameters to be used to visualize and evaluate the evidence of novel predictions, aiding human judgment of their credibility. Conclusions: Our approach integrates multiple data sources with supervised learning to create a weighted composite protein network, and uses six clustering algorithms with an aggregative clustering strategy to
Multi-level machine learning prediction of protein–protein interactions in Saccharomyces cerevisiae

Directory of Open Access Journals (Sweden)

Julian Zubek

2015-07-01

Full Text Available Accurate identification of protein–protein interactions (PPI) is the key step in understanding proteins’ biological functions, which are typically context-dependent. Many existing PPI predictors rely on aggregated features from protein sequences; however, only a few methods exploit local information about specific residue contacts. In this work we present a two-stage machine learning approach for prediction of protein–protein interactions. We start with the carefully filtered data on protein complexes available for Saccharomyces cerevisiae in the Protein Data Bank (PDB) database. First, we build linear descriptions of interacting and non-interacting sequence segment pairs based on their inter-residue distances. Secondly, we train machine learning classifiers to predict binary segment interactions for any two short sequence fragments. The final prediction of the protein–protein interaction is done using the 2D matrix representation of all-against-all possible interacting sequence segments of both analysed proteins. The level-I predictor achieves 0.88 AUC for micro-scale, i.e., residue-level prediction. The level-II predictor improves the results further by a more complex learning paradigm. We perform a 30-fold macro-scale, i.e., protein-level, cross-validation experiment. The level-II predictor using PSIPRED-predicted secondary structure reaches 0.70 precision, 0.68 recall, and 0.70 AUC, whereas other popular methods provide results below the 0.6 threshold (recall, precision, AUC). Our results demonstrate that the multi-scale sequence feature aggregation procedure is able to improve the machine learning results by more than 10% as compared to other sequence representations. Prepared datasets and source code for our experimental pipeline are freely available for download from: http://zubekj.github.io/mlppi/ (open source Python implementation, OS independent).
NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

DEFF Research Database (Denmark)

Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

2010-01-01

β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino...
An Improved Method of Predicting Extinction Coefficients for the Determination of Protein Concentration.

Science.gov (United States)

Hilario, Eric C; Stern, Alan; Wang, Charlie H; Vargas, Yenny W; Morgan, Charles J; Swartz, Trevor E; Patapoff, Thomas W

2017-01-01

Concentration determination is an important method of protein characterization required in the development of protein therapeutics. There are many known methods for determining the concentration of a protein solution, but the easiest to implement in a manufacturing setting is absorption spectroscopy in the ultraviolet region. For typical proteins composed of the standard amino acids, absorption at wavelengths near 280 nm is due to the three amino acid chromophores tryptophan, tyrosine, and phenylalanine in addition to a contribution from disulfide bonds. According to the Beer-Lambert law, absorbance is proportional to concentration and path length, with the proportionality constant being the extinction coefficient. Typically the extinction coefficient of proteins is experimentally determined by measuring a solution absorbance then experimentally determining the concentration, a measurement with some inherent variability depending on the method used. In this study, extinction coefficients were calculated based on the measured absorbance of model compounds of the four amino acid chromophores. These calculated values for an unfolded protein were then compared with an experimental concentration determination based on enzymatic digestion of proteins. The experimentally determined extinction coefficient for the native proteins was consistently found to be 1.05 times the calculated value for the unfolded proteins for a wide range of proteins with good accuracy and precision under well-controlled experimental conditions. The value of 1.05 times the calculated value was termed the predicted extinction coefficient. Statistical analysis shows that the differences between predicted and experimentally determined coefficients are scattered randomly, indicating no systematic bias between the values among the proteins measured. The predicted extinction coefficient was found to be accurate and not subject to the inherent variability of experimental methods. We propose the use of a
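The relationships described in the abstract above can be sketched in a few lines: the predicted native-state extinction coefficient is 1.05 times the value calculated for the unfolded protein, and the Beer-Lambert law then converts a measured absorbance into a concentration. The function names and the numeric inputs are hypothetical; only the 1.05 factor comes from the abstract.

```python
# Ratio of native to calculated (unfolded) extinction coefficient, as reported in the abstract
NATIVE_TO_UNFOLDED_RATIO = 1.05

def predicted_extinction_coefficient(calculated_unfolded):
    """Scale the calculated unfolded-state coefficient to the predicted native-state value."""
    return NATIVE_TO_UNFOLDED_RATIO * calculated_unfolded

def concentration_from_absorbance(absorbance, extinction_coefficient, path_length_cm=1.0):
    """Beer-Lambert law, A = epsilon * c * l, solved for the concentration c."""
    if extinction_coefficient <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (extinction_coefficient * path_length_cm)

# Hypothetical inputs: calculated coefficient 1.40 mL/(mg*cm), measured A280 of 0.735
epsilon = predicted_extinction_coefficient(1.40)
concentration_mg_per_ml = concentration_from_absorbance(0.735, epsilon)
```

The units of the result follow the units of the coefficient, so a coefficient in mL/(mg·cm) gives a concentration in mg/mL.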
Performance and carcass characteristics of steers fed with two levels of metabolizable energy intake during summer and winter season.

Science.gov (United States)

Arias, R A; Keim, J P; Gandarillas, M; Velásquez, A; Alvarado-Gilis, C; Mader, T L

2018-05-22

Climate change is producing an increase in extreme weather events around the world, such as flooding, drought and extreme ambient temperatures, impacting animal production and animal welfare. At present, there is a lack of studies addressing the effects of climatic conditions associated with energy intake in finishing cattle in South American feed yards. Therefore, two experiments were conducted to assess the effects of environmental variables and level of metabolizable energy intake above maintenance requirements (MEI) on performance and carcass quality of steers. In each experiment (winter and summer), steers were fed 1.85 or 2.72 times their metabolizable energy requirement for maintenance. A total of 24 crossbred steers per experiment were used and located in four pens (26.25 m²/head) equipped with a Calan Broadbent Feeding System. Animals were fed the same diet within each season, varying the amount offered to adjust the MEI treatments. Mud depth, mud scores, tympanic temperature (TT), environmental variables, average daily gain, respiration rates and carcass characteristics plus three thermal comfort indices were collected. Data analysis considered a factorial arrangement (Season and MEI). In addition, a repeated measures analysis was performed for TT and respiration rate. Mean values of ambient temperature, solar radiation and thermal comfort indices were greater in the summer experiment, as expected (P [...]). Carcass characteristics were affected by season but not by the level of MEI. Finally, due to the high variability of data as well as the small number of animals assessed in these experiments, more studies on carcass characteristics under similar conditions are required.
Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains

Directory of Open Access Journals (Sweden)

Eils Roland

2006-06-01

Full Text Available Abstract Background: The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is a need for further research to improve the accuracy of prediction. Results: A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six different datasets, including a Gram-negative bacteria dataset, data for discriminating outer membrane proteins, and an apoptosis protein dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with an imbalanced distribution of classes. Conclusion: This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with the previously reported approaches in terms of prediction accuracies, as empirical results indicate. The code for the software is available upon request.
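The abstract above describes base classifiers that are Bayesian classifiers built on Markov chain models. As a rough illustration of that building block only, the sketch below fits one first-order Markov chain per class and assigns a sequence to the class with the highest prior-weighted likelihood. It is not HensBC itself: the hierarchical ensemble and recursion are omitted. The class name, the pseudocount smoothing, the toy sequences and the labels are all assumptions made for illustration.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

class MarkovChainClassifier:
    """Bayes classifier with one first-order Markov chain per class (illustrative sketch).

    Assumes non-empty sequences over the 20 standard amino-acid letters.
    """

    def __init__(self, pseudocount=1.0):
        self.pseudocount = pseudocount  # additive smoothing for unseen residues/transitions
        self.log_prior = {}
        self.log_initial = {}
        self.log_trans = {}

    def fit(self, sequences, labels):
        by_class = defaultdict(list)
        for seq, label in zip(sequences, labels):
            by_class[label].append(seq)
        for label, seqs in by_class.items():
            self.log_prior[label] = math.log(len(seqs) / len(sequences))
            init = {a: self.pseudocount for a in AMINO_ACIDS}
            trans = {a: {b: self.pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
            for seq in seqs:
                init[seq[0]] += 1
                for a, b in zip(seq, seq[1:]):
                    trans[a][b] += 1
            init_total = sum(init.values())
            self.log_initial[label] = {a: math.log(c / init_total) for a, c in init.items()}
            self.log_trans[label] = {}
            for a, row in trans.items():
                row_total = sum(row.values())
                self.log_trans[label][a] = {b: math.log(c / row_total) for b, c in row.items()}
        return self

    def log_score(self, seq, label):
        """Unnormalized log posterior: log prior plus log likelihood under the class chain."""
        score = self.log_prior[label] + self.log_initial[label][seq[0]]
        for a, b in zip(seq, seq[1:]):
            score += self.log_trans[label][a][b]
        return score

    def predict(self, seq):
        return max(self.log_prior, key=lambda label: self.log_score(seq, label))

# Toy training data, invented for illustration
train_sequences = ["AAAAAL", "AALAAA", "KKKKKE", "KEKKKK"]
train_labels = ["membrane", "membrane", "nuclear", "nuclear"]
model = MarkovChainClassifier().fit(train_sequences, train_labels)
prediction_a = model.predict("AAAL")
prediction_k = model.predict("KKEK")
```

Working in log space avoids numerical underflow on long sequences, and the pseudocount keeps unseen transitions from producing a zero likelihood.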
Predicting Protein Secondary Structure with Markov Models

DEFF Research Database (Denmark)

Fischer, Paul; Larsen, Simon; Thomsen, Claus

2004-01-01

...we are considering here, is to predict the secondary structure from the primary one. To this end we train a Markov model on training data and then use it to classify parts of unknown protein sequences as sheets, helices or coils. We show how to exploit the directional information contained in the Markov model for this task. Classifications that are purely based on statistical models might not always be biologically meaningful. We present combinatorial methods to incorporate biological background knowledge to enhance the prediction performance.

Systematic Prediction of Scaffold Proteins Reveals New Design Principles in Scaffold-Mediated Signal Transduction

Science.gov (United States)

Hu, Jianfei; Neiswinger, Johnathan; Zhang, Jin; Zhu, Heng; Qian, Jiang

2015-01-01

Scaffold proteins play a crucial role in facilitating signal transduction in eukaryotes by bringing together multiple signaling components. In this study, we performed a systematic analysis of scaffold proteins in signal transduction by integrating protein-protein interaction and kinase-substrate relationship networks. We predicted 212 scaffold proteins that are involved in 605 distinct signaling pathways. The computational prediction was validated using a protein microarray-based approach. The predicted scaffold proteins showed several interesting characteristics, as expected from the functionality of scaffold proteins. We found that the scaffold proteins are likely to interact with each other, which is consistent with the previous finding that scaffold proteins tend to form homodimers and heterodimers. Interestingly, a single scaffold protein can be involved in multiple signaling pathways by interacting with other scaffold protein partners. Furthermore, we propose two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation. PMID:26393507
Correlation of chemical shifts predicted by molecular dynamics simulations for partially disordered proteins

Energy Technology Data Exchange (ETDEWEB)

Karp, Jerome M.; Erylimaz, Ertan; Cowburn, David, E-mail: cowburn@cowburnlab.org, E-mail: David.cowburn@einstein.yu.edu [Albert Einstein College of Medicine of Yeshiva University, Department of Biochemistry (United States)

2015-01-15

There has been a longstanding interest in being able to accurately predict NMR chemical shifts from structural data. Recent studies have focused on using molecular dynamics (MD) simulation data as input for improved prediction. Here we examine the accuracy of chemical shift prediction for intein systems, which have regions of intrinsic disorder. We find that using MD simulation data as input for chemical shift prediction does not consistently improve prediction accuracy over use of a static X-ray crystal structure. This appears to result from the complex conformational ensemble of the disordered protein segments. We show that using accelerated molecular dynamics (aMD) simulations improves chemical shift prediction, suggesting that methods which better sample the conformational ensemble like aMD are more appropriate tools for use in chemical shift prediction for proteins with disordered regions. Moreover, our study suggests that data accurately reflecting protein dynamics must be used as input for chemical shift prediction in order to correctly predict chemical shifts in systems with disorder.
Age at puberty, ovulation rate, and reproductive tract traits of developing gilts fed two lysine levels and three metabolizable energy levels from 100 to 260 d of age

Science.gov (United States)

The objective of this study was to determine the effect of feeding different lysine and metabolizable energy (ME) levels to developing gilts on age at puberty and reproductive tract measurements. Crossbred Large White × Landrace gilts (n = 1221) housed in groups from 100 d of age until slaughter (ap...
Effects of decreasing metabolizable protein and rumen-undegradable protein on milk production and composition and blood metabolites of Holstein dairy cows in early lactation.

Science.gov (United States)

Bahrami-Yekdangi, H; Khorvash, M; Ghorbani, G R; Alikhani, M; Jahanian, R; Kamalian, E

2014-01-01

This study was conducted to evaluate the effects of decreasing dietary protein and rumen-undegradable protein (RUP) on production performance, nitrogen retention, and nutrient digestibility in high-producing Holstein cows in early lactation. Twelve multiparous Holstein lactating cows (2 lactations; 50 ± 7 d in milk; 47 kg/d of milk production) were used in a Latin square design with 4 treatments and 3 replicates (cows). Treatments 1 to 4 consisted of diets containing 18, 17.2, 16.4, and 15.6% crude protein (CP), respectively, with the 18% CP diet considered the control group. Rumen-degradable protein levels were constant across the treatments (approximately 10.9% on a dry matter basis), whereas RUP was gradually decreased. All diets were calculated to supply a postruminal Lys:Met ratio of about 3:1. Dietary CP had no significant effects on milk production or milk composition. In fact, 16.4% dietary CP compared with 18% dietary CP led to higher milk production; however, this effect was not significant. Feed intake was higher for 16.4% CP than for 18% CP (25.7 vs. 24.3 kg/d). Control cows had greater CP and RUP intakes, which resulted in higher concentrations of plasma urea nitrogen and milk urea nitrogen; cows receiving 16.4 and 15.6% CP, respectively, exhibited lower concentrations of milk urea nitrogen (15.2 and 15.1 vs. 17.3 mg/dL). The control diet had a significant effect on predicted urinary N. Higher CP digestibility was recorded for 18% CP compared with the other diets. Decreasing CP and RUP to 15.6 and 4.6% of dietary dry matter, respectively, had no negative effects on milk production or composition when the amounts of Lys and Met and the Lys:Met ratio were balanced. Furthermore, decreasing CP and RUP to 16.4 and 5.4%, respectively, increased dry matter intake. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Insoluble glycogen, a metabolizable internal adsorbent, decreases the lethality of endotoxin shock in rats

Directory of Open Access Journals (Sweden)

S. Sipka

1997-01-01

Full Text Available Insoluble glycogen is an enzymatically modified form of naturally occurring soluble glycogen with a great adsorbing capacity. It can be metabolized by phagocytes to glucose. In this study we administered insoluble glycogen intravenously in experimental endotoxin shock in rats. Male Wistar rats were sensitized to endotoxin by Pb acetate. Survival was compared between groups of rats in endotoxin shock that were treated or not treated with insoluble glycogen. Furthermore, we determined in vitro the binding capacity of insoluble glycogen for endotoxin, tumour necrosis factor alpha, interleukin-1 and secretable phospholipase A2. A 10 mg/kg dose of insoluble glycogen completely prevented the lethality of shock induced by an LD50 quantity of endotoxin in rats. All treated animals survived. Insoluble glycogen is a form of ‘metabolizable internal adsorbent’. It can potentially be used for treatment of septic shock.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

Science.gov (United States)

Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

2014-11-01

As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3%, and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5%, and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of
Prediction of the metabolizable energy requirements of free-range laying hens.

Science.gov (United States)

Brainer, M M A; Rabello, C B V; Santos, M J B; Lopes, C C; Ludke, J V; Silva, J H V; Lima, R A

2016-01-01

This study was conducted with the aim of estimating the ME requirements of free-range laying hens for maintenance, weight gain, and egg production. The experiments were performed to develop an energy requirement prediction equation by using the comparative slaughter technique and the total excreta collection method. Regression equations were used to relate the energy intake, the energy retained in the body and eggs, and the heat production of the hens. These relationships were used to determine the daily ME requirement for maintenance, the efficiency of energy utilization above the requirements for maintenance, and the NE requirement for maintenance. The requirement for weight gain was estimated from the energy content of the carcass, and the diet's efficiency of energy utilization was determined from the weight gain, which was measured during weekly slaughter. The requirement for egg production was estimated by considering the energy content of the eggs and the efficiency of energy deposition in the eggs. The requirement and efficiency of energy utilization for maintenance were 121.8 kcal ME/(kg∙d) and 0.68, respectively. Similarly, the NE requirement for maintenance was 82.4 kcal ME/(kg∙d), and the efficiency of energy utilization above maintenance was 0.61. Because the carcass body weight and energy did not increase during the trial, the requirement for weight gain could not be estimated. The requirement and efficiency of energy utilization for egg production were 2.48 kcal/g and 0.61, respectively. The following energy prediction equation for free-range laying hens (without weight gain) was developed: ME/(hen ∙ d) = 121.8 × W + 2.48 × EM, in which W = body weight (kg) and EM = egg mass (g/[hen ∙ d]).
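The prediction equation at the end of this abstract is easy to evaluate directly. The sketch below encodes it as stated, with the coefficients 121.8 and 2.48 taken from the abstract; the function name and the example hen are hypothetical.

```python
def me_requirement_kcal_per_day(body_weight_kg, egg_mass_g_per_day):
    """ME/(hen·d) = 121.8 × W + 2.48 × EM (equation without a weight-gain term)."""
    if body_weight_kg < 0 or egg_mass_g_per_day < 0:
        raise ValueError("body weight and egg mass must be non-negative")
    maintenance = 121.8 * body_weight_kg          # kcal ME per kg body weight per day
    egg_production = 2.48 * egg_mass_g_per_day    # kcal ME per g of egg mass
    return maintenance + egg_production


# Hypothetical hen: 1.8 kg body weight, 50 g of egg mass per day.
print(round(me_requirement_kcal_per_day(1.8, 50.0), 2))
```

For this hypothetical hen the equation gives 121.8 × 1.8 + 2.48 × 50 = 343.24 kcal ME per day.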
The Use of Sweet Almond Meal as a Protein Source in Japanese Quails Diets

Directory of Open Access Journals (Sweden)

Arjomandi MA

2015-12-01

In the first experiment, the chemical composition, apparent metabolizable energy (AME), AME corrected for nitrogen (AMEn), true metabolizable energy (TME), and TME corrected for nitrogen (TMEn) values of sweet almond meal were determined in adult Leghorn cockerels. The second experiment was performed to evaluate the effects of different levels of sweet almond meal (0, 100, 200 and 300 g/kg) on the growth performance, some blood metabolites, relative weight of different organs, meat quality and egg yolk cholesterol of Japanese quails in a completely randomized design with 288 Japanese quails, including 4 treatments, 4 replicates and 18 birds per replicate. The metabolizable energy values of sweet almond meal were as follows: AME = 3734, AMEn = 3648, TME = 3908, TMEn = 3746 kcal/kg on an as-fed basis. Feed intake, feed conversion ratio, live weight gain and relative weight of different organs in the birds fed diets with different levels of sweet almond meal were not statistically different from control. Sweet almond meal at the 300 g/kg level resulted in lower serum total cholesterol and low-density lipoprotein (P < 0.05) compared to control and 100 g/kg sweet almond meal. Dietary treatments had no significant effect on the total cholesterol content of quail eggs. Malondialdehyde concentration in breast meat samples after 40 days of freezing decreased as the level of sweet almond meal increased (P < 0.01). In general, sweet almond meal is a good source of energy and protein without any adverse effect on growth performance and can be used at up to 300 g/kg in Japanese quail diets.
MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

Science.gov (United States)

Fang, Chao; Shang, Yi; Xu, Dong

2018-05-01

Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physicochemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids to make accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.
Prediction of human protein function according to Gene Ontology categories

DEFF Research Database (Denmark)

Jensen, Lars Juhl; Gupta, Ramneek; Stærfeldt, Hans Henrik

2003-01-01

developed a method for prediction of protein function for a subset of classes from the Gene Ontology classification scheme. This subset includes several pharmaceutically interesting categories-transcription factors, receptors, ion channels, stress and immune response proteins, hormones and growth factors...
The Phyre2 web portal for protein modeling, prediction and analysis.

Science.gov (United States)

Kelley, Lawrence A; Mezulis, Stefans; Yates, Christopher M; Wass, Mark N; Sternberg, Michael J E

2015-06-01

Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.
Predicting protein structures with a multiplayer online game

OpenAIRE

Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran

2010-01-01

People exert significant amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully crowd-sourced through games [i, ii, iii], but it is not clear if more complex scientific problems can be similarly solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search sp...
Effect of a liquid red dragon fruit (Hylocereus polyrhizus) additive on protein digestibility, metabolizable energy and egg production in quail

Directory of Open Access Journals (Sweden)

Rika Dwi Astuti

2015-12-01

The study aimed to evaluate the effect of a liquid red dragon fruit additive, supplied in the drinking water, on protein digestibility, metabolizable energy and egg production in quail. The experimental animals were 200 female quails, 7 days old, with an average body weight of 13.61 ± 0.49 g. The experiment used a completely randomized design with 4 treatments and 5 replications: T0 (control), T1 (addition of about 5 ml of liquid red dragon fruit additive twice a day), T2 (once a day) and T3 (once every two days). The parameters measured were feed intake, protein digestibility, metabolizable energy and egg production. Data were analyzed by analysis of variance (F test) at the 5% level, followed by Duncan's multiple range test when the treatment effect was significant. The results showed that the liquid red dragon fruit additive had no significant effect (P > 0.05) on protein digestibility, metabolizable energy or egg production. In conclusion, the addition of liquid red dragon fruit additive did not increase protein digestibility, metabolizable energy or egg production in quail. Keywords: digestibility of crude protein, quail, quail egg production, red dragon fruit
System and methods for predicting transmembrane domains in membrane proteins and mining the genome for recognizing G-protein coupled receptors

Science.gov (United States)

Trabanino, Rene J; Vaidehi, Nagarajan; Hall, Spencer E; Goddard, William A; Floriano, Wely

2013-02-05

The invention provides computer-implemented methods and apparatus implementing a hierarchical protocol using multiscale molecular dynamics and molecular modeling methods to predict the presence of transmembrane regions in proteins, such as G-Protein Coupled Receptors (GPCR), and protein structural models generated according to the protocol. The protocol features a coarse grain sampling method, such as hydrophobicity analysis, to provide a fast and accurate procedure for predicting transmembrane regions. Methods and apparatus of the invention are useful to screen protein or polynucleotide databases for encoded proteins with transmembrane regions, such as GPCRs.
Multi-Label Learning via Random Label Selection for Protein Subcellular Multi-Locations Prediction.

Science.gov (United States)

Wang, Xiao; Li, Guo-Zheng

2013-03-12

Prediction of protein subcellular localization is an important but challenging problem, particularly when proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular localization methods deal only with single-location proteins. In the past few years, only a few methods have been proposed to tackle proteins with multiple locations. However, they adopt only a simple strategy, that is, transforming the multi-location proteins into multiple proteins with a single location, which does not take correlations among different subcellular locations into account. In this paper, a novel method named RALS (multi-label learning via RAndom Label Selection) is proposed to learn from multi-location proteins in an effective and efficient way. Through a five-fold cross-validation test on a benchmark dataset, we demonstrate that our proposed method, which considers label correlations, clearly outperforms the baseline BR method, which does not, indicating that correlations among different subcellular locations really exist and contribute to improved prediction performance. Experimental results on two benchmark datasets also show that our proposed methods achieve significantly higher performance than some other state-of-the-art methods in predicting subcellular multi-locations of proteins. The prediction web server is available at http://levis.tongji.edu.cn:8080/bioinfo/MLPred-Euk/ for public use.
MemBrain: An Easy-to-Use Online Webserver for Transmembrane Protein Structure Prediction

Science.gov (United States)

Yin, Xi; Yang, Jing; Xiao, Feng; Yang, Yang; Shen, Hong-Bin

2018-03-01

Membrane proteins are an important class of proteins that are embedded in the membranes of cells and play crucial roles in living organisms, for example as ion channels, transporters, and receptors. Because it is difficult to determine a membrane protein's structure by wet-lab experiments, accurate and fast amino acid sequence-based computational methods are highly desired. In this paper, we report an online prediction tool called MemBrain, whose input is the amino acid sequence. MemBrain consists of specialized modules for predicting transmembrane helices, residue-residue contacts and relative accessible surface area of α-helical membrane proteins. MemBrain achieves a prediction accuracy of 97.9% for A_TMH, 87.1% for A_P, 3.2 ± 3.0 for N-score, and 3.1 ± 2.8 for C-score. MemBrain-Contact obtains 62% and 64.1% accuracy for top L/5 contact prediction on the training and independent datasets, respectively. MemBrain-Rasa achieves a Pearson correlation coefficient of 0.733 and a mean absolute error of 13.593. These prediction results provide valuable hints for revealing the structure and function of membrane proteins. The MemBrain web server is free for academic use and available at www.csbio.sjtu.edu.cn/bioinf/MemBrain/.
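MemBrain-Rasa is evaluated with a Pearson correlation coefficient and a mean absolute error between predicted and observed relative accessible surface area. As a minimal sketch of how those two scores are defined (function names and sample values are hypothetical and unrelated to the MemBrain data):

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical per-residue values in percent (illustration only).
pred = [12.0, 40.0, 55.0, 8.0]
obs = [10.0, 35.0, 60.0, 15.0]
print(pearson_r(pred, obs), mean_absolute_error(pred, obs))
```

The two scores are complementary: the correlation measures whether predictions follow the trend of the observations, while the mean absolute error measures how far off they are in the original units.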
MU-LOC: A Machine-Learning Method for Predicting Mitochondrially Localized Proteins in Plants

DEFF Research Database (Denmark)

Zhang, Ning; Rao, R Shyama Prasad; Salvato, Fernanda

2018-01-01

-sequence or a multitude of internal signals. Compared with experimental approaches, computational predictions provide an efficient way to infer subcellular localization of a protein. However, it is still challenging to predict plant mitochondrially localized proteins accurately due to various limitations. Consequently......, the performance of current tools can be improved with new data and new machine-learning methods. We present MU-LOC, a novel computational approach for large-scale prediction of plant mitochondrial proteins. We collected a comprehensive dataset of plant subcellular localization, extracted features including amino...
Prediction of localization and interactions of apoptotic proteins

Directory of Open Access Journals (Sweden)

Matula Pavel

2009-07-01

During apoptosis several mitochondrial proteins are released. Some of them participate in caspase-independent nuclear DNA degradation, especially apoptosis-inducing factor (AIF) and endonuclease G (endoG). Another interesting protein, which was expected to act similarly to AIF because of its high sequence homology with AIF, is AIF-homologous mitochondrion-associated inducer of death (AMID). We studied the structure, cellular localization, and interactions of several proteins in silico and also in cells using fluorescence microscopy. We found the AMID protein to be cytoplasmic, most probably incorporated into the cytoplasmic side of the lipid membranes. Bioinformatic predictions were conducted to analyze the interactions of the studied proteins with each other and with other possible partners. We conducted molecular modeling of proteins with unknown 3D structures. These models were then refined by the MolProbity server and employed in molecular docking simulations of interactions. Our results show data acquired using a combination of modern in silico methods and image analysis to understand the localization, interactions and functions of the proteins AMID, AIF, endonuclease G, and other apoptosis-related proteins.
Genome-scale prediction of proteins with long intrinsically disordered regions.

Science.gov (United States)

Peng, Zhenling; Mizianty, Marcin J; Kurgan, Lukasz

2014-01-01

Proteins with long disordered regions (LDRs), defined as having 30 or more consecutive disordered residues, are abundant in eukaryotes, and these regions are recognized as a distinct class of biologically functional domains. LDRs facilitate various cellular functions and are important for target selection in structural genomics. Motivated by the lack of methods that directly predict proteins with LDRs, we designed Super-fast predictor of proteins with Long Intrinsically DisordERed regions (SLIDER). SLIDER utilizes logistic regression that takes an empirically chosen set of numerical features, which consider selected physicochemical properties of amino acids, sequence complexity, and amino acid composition, as its inputs. Empirical tests show that SLIDER offers competitive predictive performance combined with low computational cost. It outperforms, by at least a modest margin, a comprehensive set of modern disorder predictors (that can indirectly predict LDRs) and is 16 times faster than the best currently available disorder predictor. Utilizing our time-efficient predictor, we characterized the abundance and functional roles of proteins with LDRs over 110 eukaryotic proteomes. Similar to related studies, we found that eukaryotes have many (on average 30.3%) proteins with LDRs, with the majority of proteomes having between 25 and 40%, where higher abundance is characteristic of proteomes that have larger proteins. Our first-of-its-kind large-scale functional analysis shows that these proteins are enriched in a number of cellular functions and processes including certain binding events, regulation of catalytic activities, cellular component organization, biogenesis, biological regulation, and some metabolic and developmental processes. A webserver that implements SLIDER is available at http://biomine.ece.ualberta.ca/SLIDER/. Copyright © 2013 Wiley Periodicals, Inc.
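The LDR definition used in this record (30 or more consecutive disordered residues) can be checked directly when per-residue disorder annotations are available. The sketch below only illustrates that definition; it is not the SLIDER method, which predicts LDR-containing proteins with logistic regression, and the function names and the boolean input format are assumptions.

```python
def longest_disordered_run(disorder_flags):
    """Length of the longest run of consecutive True (disordered) residues."""
    longest = current = 0
    for is_disordered in disorder_flags:
        current = current + 1 if is_disordered else 0
        longest = max(longest, current)
    return longest


def has_long_disordered_region(disorder_flags, min_length=30):
    """True if the protein has a run of at least min_length disordered residues."""
    return longest_disordered_run(disorder_flags) >= min_length


# Hypothetical annotation: 10 ordered, 30 disordered, 5 ordered residues.
flags = [False] * 10 + [True] * 30 + [False] * 5
print(has_long_disordered_region(flags))
```

Indirect LDR prediction with a per-residue disorder predictor amounts to running such a check on its output, which is the approach SLIDER is compared against.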
Three-dimensional protein structure prediction: Methods and computational strategies.

Science.gov (United States)

Dorn, Márcio; E Silva, Mariel Barbachan; Buriol, Luciana S; Lamb, Luis C

2014-10-12

A long-standing problem in structural bioinformatics is to determine the three-dimensional (3-D) structure of a protein when only a sequence of amino acid residues is given. Many computational methodologies and algorithms have been proposed as a solution to the 3-D Protein Structure Prediction (3-D-PSP) problem. These methods can be divided into four main classes: (a) first principle methods without database information; (b) first principle methods with database information; (c) fold recognition and threading methods; and (d) comparative modeling methods and sequence alignment strategies. Deterministic computational techniques, optimization techniques, data mining and machine learning approaches are typically used in the construction of computational solutions for the PSP problem. Our main goal with this work is to review the methods and computational strategies that are currently used in 3-D protein prediction. Copyright © 2014 Elsevier Ltd. All rights reserved.

CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.

Science.gov (United States)

Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar

2017-09-01

Due to Ca²⁺-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state-of-the-art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.
Prediction of antigenic epitopes on protein surfaces by consensus scoring

Directory of Open Access Journals (Sweden)

Zhang Chi

2009-09-01

Background: Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods focus on protein sequences to predict continuous epitopes linear in sequence. Only a few structure-based epitope prediction algorithms are available and they have not yet shown satisfying performance. Results: We present a new antigen Epitope Prediction method, which uses ConsEnsus Scoring (EPCES) from six different scoring functions: residue epitope propensity, conservation score, side-chain energy score, contact number, surface planarity score, and secondary structure composition. Applied to unbounded antigen structures from an independent test set, EPCES was able to predict antigenic epitopes with 47.8% sensitivity, 69.5% specificity and an AUC value of 0.632. The performance of the method is statistically similar to other published methods. The AUC value of EPCES is about 0.034 higher than the best results of existing algorithms. Conclusion: Our work shows that consensus scoring of multiple features performs better than any single term. The successful prediction is also due to the new score of residue epitope propensity based on atomic solvent accessibility.
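The abstract does not say how the six scoring functions are combined, so the following is only an illustrative sketch of one common form of consensus scoring: each term is min-max normalized across residues and the normalized terms are averaged with equal weight. The normalization, the equal weighting, the function names, and the sample values are all assumptions, not the EPCES procedure.

```python
def min_max_normalize(scores):
    """Rescale a list of scores to the range [0, 1]."""
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        return [0.0] * len(scores)  # a constant term carries no ranking information
    return [(s - lowest) / (highest - lowest) for s in scores]


def consensus_scores(score_table):
    """Average the normalized per-residue scores of several scoring terms.

    score_table maps a term name to a list with one score per residue.
    """
    columns = list(score_table.values())
    if not columns or len({len(c) for c in columns}) != 1:
        raise ValueError("all terms must score the same, non-zero number of residues")
    normalized = [min_max_normalize(c) for c in columns]
    return [sum(per_residue) / len(normalized) for per_residue in zip(*normalized)]


# Two hypothetical terms scored on three residues (illustration only).
table = {"propensity": [0.0, 5.0, 10.0], "contact_number": [1.0, 1.0, 3.0]}
print(consensus_scores(table))
```

Residues would then be ranked by consensus score, and a cutoff on that ranking determines the sensitivity and specificity trade-off summarized by the AUC.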
Foetal life protein restriction in male mink (Neovison vison) kits lowers post-weaning protein oxidation and the relative abundance of hepatic fructose-1,6-bisphosphatase mRNA

DEFF Research Database (Denmark)

Matthiesen, Connie Marianne Frank; Blache, D.; Thomsen, Preben Dybdahl

2012-01-01

Foetal life malnutrition has been studied intensively in a number of animal models. Results show that especially foetal life protein malnutrition can lead to metabolic changes later in life. This might be of particular importance for strict carnivores, for example, cat and mink (Neovison vison...... born to mothers fed either a low-protein diet (LP), that is, 14% of metabolizable energy (ME) from protein (foetal low – FL), n = 16, or an adequate-protein (AP) diet, that is, 29% of ME from protein (foetal adequate – FA), n = 16) in the last 16.3 ± 1.8 days of pregnancy were used. The FL offspring...... had lower birth weight and lower relative abundance of fructose-1,6-bisphosphatase (Fru-1,6-P2ase) and pyruvate kinase mRNA in foetal hepatic tissue than FA kits. The mothers were fed a diet containing adequate protein until weaning. At weaning (7 weeks of age), half of the kits from each foetal...
Defining the predicted protein secretome of the fungal wheat leaf pathogen Mycosphaerella graminicola.

Directory of Open Access Journals (Sweden)

Alexandre Morais do Amaral

The Dothideomycete fungus Mycosphaerella graminicola is the causal agent of Septoria tritici blotch, a devastating disease of wheat leaves that causes dramatic decreases in yield. Infection involves an initial extended period of symptomless intercellular colonisation prior to the development of visible necrotic disease lesions. Previous functional genomics and gene expression profiling studies have implicated the production of secreted virulence effector proteins as key facilitators of the initial symptomless growth phase. In order to identify additional candidate virulence effectors, we re-analysed and catalogued the predicted protein secretome of M. graminicola isolate IPO323, which is currently regarded as the reference strain for this species. We combined several bioinformatic approaches in order to increase the probability of identifying truly secreted proteins with either a predicted enzymatic function or an as yet unknown function. An initial secretome of 970 proteins was predicted, whilst further stringent selection criteria predicted 492 proteins. Of these, 321 possess some functional annotation, the composition of which may reflect the strictly intercellular growth habit of this pathogen, leaving 171 with no functional annotation. This analysis identified a protein family encoding secreted peroxidases/chloroperoxidases (PF01328), which is expanded within all members of the family Mycosphaerellaceae. Further analyses were done on the non-annotated proteins for size and cysteine content (effector protein hallmarks), and then by studying the distribution of homologues in 17 other sequenced Dothideomycete fungi within an overall total of 91 predicted proteomes from fungal, oomycete and nematode species. This detailed M. graminicola secretome analysis provides the basis for further functional and comparative genomics studies.
An automated decision-tree approach to predicting protein interaction hot spots.

Science.gov (United States)

Darnell, Steven J; Page, David; Mitchell, Julie C

2007-09-01

Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface. 2007 Wiley-Liss, Inc.
Changes in predicted protein disorder tendency may contribute to disease risk

Directory of Open Access Journals (Sweden)

Hu Yang

2011-12-01

Background: Recent studies suggest that many proteins or regions of proteins lack 3D structure. Defined as intrinsically disordered proteins, these proteins/peptides are functionally important. Recent advances in next generation sequencing technologies enable genome-wide identification of novel nucleotide variations in a specific population or cohort. Results: Using the exonic single nucleotide variations (SNVs) identified in the 1,000 Genomes Project and distributed by the Genetic Analysis Workshop 17, we systematically analysed the genetic and predicted disorder potential features of the non-synonymous variations. The results suggest that a significant change in the tendency of a protein region to be structured or disordered caused by SNVs may lead to malfunction of such a protein and contribute to disease risk. Conclusions: After validation with functional SNVs on the traits distributed by GAW17, we conclude that it is valuable to consider structure/disorder tendencies while prioritizing and predicting mechanistic effects arising from novel genetic variations.
Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy.

Directory of Open Access Journals (Sweden)

Lina Zhang

Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of the prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. Relief combined with the IFS (Incremental Feature Selection) method is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we developed a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
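The abstract states that the ensemble prediction is an average of the base classifiers' predictions. A minimal sketch of that averaging step follows, assuming each base classifier outputs a probability for the antioxidant class and that a 0.5 cutoff is applied; the cutoff, the function names, and the probabilities are hypothetical.

```python
def ensemble_probability(base_probabilities):
    """Unweighted mean of the base classifiers' positive-class probabilities."""
    if not base_probabilities:
        raise ValueError("at least one base classifier output is required")
    return sum(base_probabilities) / len(base_probabilities)


def ensemble_predict(base_probabilities, threshold=0.5):
    """Label a protein as antioxidant if the averaged probability reaches the cutoff."""
    return ensemble_probability(base_probabilities) >= threshold


# Hypothetical outputs of four base classifiers (e.g. RF, SMO, NNA, J48) for one protein.
outputs = [0.9, 0.6, 0.4, 0.7]
print(ensemble_probability(outputs), ensemble_predict(outputs))
```

Averaging lets a confident majority outweigh a single dissenting classifier, as in this example where one classifier scores the protein below 0.5 but the ensemble still labels it positive.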
QuaBingo: A Prediction System for Protein Quaternary Structure Attributes Using Block Composition

Directory of Open Access Journals (Sweden)

Chi-Hua Tung

2016-01-01

Background. Quaternary structures of proteins are closely relevant to gene regulation, signal transduction, and many other biological functions of proteins. In the current study, a new feature extraction method based on protein-conserved motif composition in block format is proposed, which is termed block composition. Results. QuaBingo, a system for predicting protein quaternary assembly states that combines blocks with functional domain composition, is built from three layers of classifiers that categorize the quaternary structural attributes as monomer, homooligomer, or heterooligomer. The first-layer classifier uses support vector machines (SVM) based on blocks and functional domains of proteins, and a second-layer SVM processes the outputs of the first layer. Finally, the result is determined by the Random Forest of the third layer. We compared the effectiveness of combinations of block composition, functional domain composition, and pseudo-amino acid composition in the model. Across 11 kinds of functional protein families, the Matthews Correlation Coefficient (MCC) of QuaBingo is 23% higher than that of the existing prediction system. The results also revealed the biological characterization of the top five block compositions. Conclusions. QuaBingo provides better predictive ability for predicting the quaternary structural attributes of proteins.
Weight and yield of non-carcass components of Morada Nova lambs fed with different levels of metabolizable energy

Directory of Open Access Journals (Sweden)

Danilo de Araújo Camilo

2012-12-01

Full Text Available The objective of this study was to evaluate the effect of different metabolizable energy (ME) levels on the weight of gastrointestinal content and on the weight and yield of the internal organs and gastrointestinal compartments of growing Morada Nova lambs. Thirty-two non-castrated animals, with an average body weight of 12.12 ± 1.69 kg and approximately two months old, were used. The animals were distributed among four metabolizable energy levels (1.28, 1.72, 2.18 and 2.62 Mcal/kg DM) in a randomized block design with eight replicates per treatment. Tifton 85 hay was used as roughage. There was no effect of energy levels (P > 0.05) on the weight of gastrointestinal content. An increasing linear effect (P < 0.05) of ME levels was observed on the weights of the heart, the lungs, trachea, esophagus and tongue (PTEL), the liver and the spleen, expressed in kg. Among the gastrointestinal tract compartments, an increasing linear effect (P < 0.05) of ME levels was observed only on the rumen-reticulum, in %, and the small intestine, in kg. Perirenal, omental and mesenteric fat were influenced by ME levels (P < 0.05), with a linear increase in weights expressed in kg and %. Increasing the ME levels of the ra...
Prediction of protein post-translational modifications: main trends and methods

Science.gov (United States)

Sobolev, B. N.; Veselovsky, A. V.; Poroikov, V. V.

2014-02-01

The review summarizes the main trends in the development of methods for the prediction of protein post-translational modifications (PTMs) by considering the three most common types of PTMs: phosphorylation, acetylation and glycosylation. Considerable attention is given to general characteristics of regulatory interactions associated with PTMs. Different approaches to the prediction of PTMs are analyzed. Most of the methods are based only on the analysis of the neighbouring environment of modification sites. The related software is characterized by relatively low accuracy of PTM predictions, which may be due both to the incompleteness of training data and to the features of PTM regulation. Advantages and limitations of the phylogenetic approach are considered. The prediction of PTMs using data on regulatory interactions, including the modular organization of interacting proteins, is a promising field, provided that more carefully selected training data are used. The bibliography includes 145 references.
Cascaded bidirectional recurrent neural networks for protein secondary structure prediction.

Science.gov (United States)

Chen, Jinmiao; Chaudhari, Narendra

2007-01-01

Protein secondary structure (PSS) prediction is an important topic in bioinformatics. Our study on a large set of non-homologous proteins shows that long-range interactions commonly exist and negatively affect PSS prediction. We also reveal strong correlations between secondary structure (SS) elements. In order to take into account the long-range interactions and SS-SS correlations, we propose a novel prediction system based on a cascaded bidirectional recurrent neural network (BRNN). We compare the cascaded BRNN against two other BRNN architectures, namely the original BRNN architecture used for speech recognition and Pollastri's BRNN, which was proposed for PSS prediction. Our cascaded BRNN achieves an overall three-state accuracy Q3 of 74.38% and reaches a high Segment OVerlap (SOV) of 66.0455. It outperforms the original BRNN and Pollastri's BRNN in both Q3 and SOV. Specifically, it improves the SOV score by 4-6%.
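For readers unfamiliar with the Q3 measure quoted above, the following Python sketch shows how a three-state per-residue accuracy is commonly computed: the percentage of residues whose predicted state (helix H, strand E or coil C) matches the reference. The function name and the example strings are hypothetical, and the SOV score is deliberately not implemented here because its definition is more involved.

```python
def q3_accuracy(true_ss, pred_ss):
    """Percentage of residues whose predicted three-state label matches the reference."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("reference and prediction must have the same length")
    if not true_ss:
        raise ValueError("sequences must not be empty")
    correct = sum(1 for t, p in zip(true_ss, pred_ss) if t == p)
    return 100.0 * correct / len(true_ss)

# Made-up reference and prediction strings (H = helix, E = strand, C = coil).
true_ss = "HHHHEEEECC"
pred_ss = "HHHCEEEHCC"
q3 = q3_accuracy(true_ss, pred_ss)  # 8 of 10 residues agree
```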
Parametric Bayesian priors and better choice of negative examples improve protein function prediction.

Science.gov (United States)

Youngs, Noah; Penfold-Brown, Duncan; Drew, Kevin; Shasha, Dennis; Bonneau, Richard

2013-05-01

Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction. We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested. Code and Data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html
Nutrient Digestibility and Metabolizable Energy Content of Whole Pods Fed to Growing Pelibuey Lambs

Directory of Open Access Journals (Sweden)

Enrique Loyra-Tzab

2013-07-01

Full Text Available The nutrient digestibility, nitrogen balance and in vivo metabolizable energy supply of Mucuna pruriens whole pods fed to growing Pelibuey lambs were investigated. Eight Pelibuey sheep housed in metabolic crates were fed increasing levels of Mucuna pruriens pods: 0 (control), 100 (Mucuna100), 200 (Mucuna200) and 300 (Mucuna300) g/kg dry matter. A quadratic (p0.05 on DM and GE apparent digestibility (p0.05. The DM, N and GE apparent digestibility coefficients of M. pruriens whole pods, obtained through multiple regression equations, were 0.692, 0.457 and 0.654, respectively. The in vivo DE and ME contents of mucuna whole pods were estimated at 11.0 and 9.7 MJ/kg DM. It was concluded that whole pods from M. pruriens did not affect nutrient utilization when included in a mixed diet at up to 200 g/kg DM. This is the first in vivo estimation of the ME value of mucuna whole pods for ruminants.
Quantitative analysis and prediction of curvature in leucine-rich repeat proteins.

Science.gov (United States)

Hindle, K Lauren; Bella, Jordi; Lovell, Simon C

2009-11-01

Leucine-rich repeat (LRR) proteins form a large and diverse family. They have a wide range of functions most of which involve the formation of protein-protein interactions. All known LRR structures form curved solenoids, although there is large variation in their curvature. It is this curvature that determines the shape and dimensions of the inner space available for ligand binding. Unfortunately, large-scale parameters such as the overall curvature of a protein domain are extremely difficult to predict. Here, we present a quantitative analysis of determinants of curvature of this family. Individual repeats typically range in length between 20 and 30 residues and have a variety of secondary structures on their convex side. The observed curvature of the LRR domains correlates poorly with the lengths of their individual repeats. We have, therefore, developed a scoring function based on the secondary structure of the convex side of the protein that allows prediction of the overall curvature with a high degree of accuracy. We also demonstrate the effectiveness of this method in selecting a suitable template for comparative modeling. We have developed an automated, quantitative protocol that can be used to predict accurately the curvature of leucine-rich repeat proteins of unknown structure from sequence alone. This protocol is available as an online resource at http://www.bioinf.manchester.ac.uk/curlrr/.
A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize.

Directory of Open Access Journals (Sweden)

Matt Geisler

2015-06-01

Full Text Available Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM) is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs) that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where maize orthologs occurred for both partners of an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6,004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait and prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize.
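The interolog idea in the abstract above (transfer an interaction to maize when both partners of a reference interaction have maize orthologs) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the PiZeaM pipeline: the identifiers are invented, and the ortholog table is assumed to be a plain dictionary that may map one reference protein to several maize proteins (many-to-many orthology).

```python
from itertools import product

def predict_interologs(reference_interactions, orthologs):
    """Transfer reference-species interactions to the target species.

    An interaction is predicted only when BOTH reference partners have at
    least one ortholog; many-to-many orthology yields every pairing.
    """
    predicted = set()
    for a, b in reference_interactions:
        for target_a, target_b in product(orthologs.get(a, ()), orthologs.get(b, ())):
            # Store each pair in sorted order so A-B and B-A count once.
            predicted.add(tuple(sorted((target_a, target_b))))
    return predicted

# Hypothetical identifiers, for illustration only.
reference_interactions = [("AT1", "AT2"), ("AT2", "AT3")]
orthologs = {"AT1": ["Zm1a", "Zm1b"], "AT2": ["Zm2"]}  # AT3 has no maize ortholog
predicted = predict_interologs(reference_interactions, orthologs)
```

Storing pairs as sorted tuples in a set is what makes the resulting interactions "unique" in the sense of an undirected network.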
Improved hybrid optimization algorithm for 3D protein structure prediction.

Science.gov (United States)

Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

2014-07-01

A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on the toy off-lattice model, is presented for three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies. A stochastic disturbance factor is added to the particle swarm optimization to improve its search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; and the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is NP-hard, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of a single algorithm and makes full use of the advantages of each algorithm. The method is tested on the commonly used standard sequences, namely Fibonacci sequences and real protein sequences. Experiments show that the proposed method outperforms single algorithms in the accuracy of the calculated protein sequence energy value, indicating that it is an effective way to predict the structure of proteins.
A novel representation for apoptosis protein subcellular localization prediction using support vector machine.

Science.gov (United States)

Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen

2009-07-21

Apoptosis, or programmed cell death, plays an important role in development of an organism. Obtaining information on subcellular location of apoptosis proteins is very helpful to understand the apoptosis mechanism. In this paper, based on the concept that the position distribution information of amino acids is closely related with the structure and function of proteins, we introduce the concept of distance frequency [Matsuda, S., Vert, J.P., Ueda, N., Toh, H., Akutsu, T., 2005. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci. 14, 2804-2813] and propose a novel way to calculate distance frequencies. In order to calculate the local features, each protein sequence is separated into p parts with the same length in our paper. Then we use the novel representation of protein sequences and adopt support vector machine to predict subcellular location. The overall prediction accuracy is significantly improved by jackknife test.
Predicting binding within disordered protein regions to structurally characterised peptide-binding domains.

Directory of Open Access Journals (Sweden)

Waqasuddin Khan

Full Text Available Disordered regions of proteins often bind to structured domains, mediating interactions within and between proteins. However, it is difficult to identify a priori the short disordered regions involved in binding. We set out to determine if docking such peptide regions to peptide binding domains would assist in these predictions. We assembled a redundancy-reduced dataset of SLiM (Short Linear Motif) containing proteins from the ELM database. We selected 84 sequences which had an associated PDB structure showing the SLiM bound to a protein receptor, where the SLiM was found within a 50-residue region of the protein sequence which was predicted to be disordered. First, we investigated the Vina docking scores of overlapping tripeptides from the 50-residue SLiM-containing disordered regions of the protein sequence to the corresponding PDB domain. We found only weak discrimination of docking scores between peptides involved in binding and adjacent non-binding peptides in this context (AUC 0.58). Next, we trained a bidirectional recurrent neural network (BRNN) using as input the protein sequence, predicted secondary structure, Vina docking score and predicted disorder score. The results were very promising (AUC 0.72), showing that multiple sources of information can be combined to produce results which are clearly superior to any single source. We conclude that the Vina docking score alone has only modest power to define the location of a peptide within a larger protein region known to contain it. However, combining this information with other knowledge (using machine learning methods) clearly improves the identification of peptide binding regions within a protein sequence. This approach combining docking with machine learning is primarily a predictor of binding to peptide-binding sites, and is not intended as a predictor of specificity of binding to particular receptors.
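Two steps in the abstract above lend themselves to a short Python illustration: cutting a disordered region into overlapping tripeptides, and summarizing how well a score separates binding from non-binding peptides with an AUC. The sketch below is an assumption-laden illustration, not the authors' code. The scores are made up, and the AUC helper treats higher scores as "more likely binding", so docking energies where lower is better would need to be negated first.

```python
def overlapping_tripeptides(region):
    """All overlapping windows of length 3, stepping one residue at a time."""
    return [region[i:i + 3] for i in range(len(region) - 2)]

def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the rank-based, Mann-Whitney form of AUC).
    """
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

windows = overlapping_tripeptides("ACDEFGH")
# Made-up scores for binding and non-binding peptides (higher = more likely binding).
example_auc = auc([0.9, 0.6], [0.7, 0.2, 0.1])
```

A 50-residue region yields 48 overlapping tripeptides under this windowing, and an AUC of 0.5 corresponds to no discrimination, which is the baseline against which the reported 0.58 and 0.72 can be read.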
Prediction and analysis of beta-turns in proteins by support vector machine.

Science.gov (United States)

Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

2003-01-01

The tight turn has long been recognized as one of the three important features of proteins, after the alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns. Analysis and prediction of beta-turns in particular and tight turns in general are very useful for the design of new molecules such as drugs, pesticides, and antigens. In this paper, we introduce a support vector machine (SVM) approach to prediction and analysis of beta-turns. We have investigated two aspects of applying SVM to the prediction and analysis of beta-turns. First, we developed a new SVM method, called BTSVM, which predicts beta-turns of a protein from its sequence. The prediction results on a dataset of 426 non-homologous protein chains, obtained with sevenfold cross-validation, showed that our method is superior to previous methods. Second, we analyzed how amino acid positions support (or prevent) the formation of beta-turns based on the "multivariable" classification model of a linear SVM. This model is more general than those of previous statistical methods. Our analysis results are more comprehensive and easier to use than previously published analysis results.
Multi-label learning with fuzzy hypergraph regularization for protein subcellular location prediction.

Science.gov (United States)

Chen, Jing; Tang, Yuan Yan; Chen, C L Philip; Fang, Bin; Lin, Yuewei; Shang, Zhaowei

2014-12-01

Protein subcellular location prediction aims to predict the location where a protein resides within a cell using computational methods. Considering the main limitations of the existing methods, we propose a hierarchical multi-label learning model FHML for both single-location proteins and multi-location proteins. The latent concepts are extracted through feature space decomposition and label space decomposition under the nonnegative data factorization framework. The extracted latent concepts are used as the codebook to indirectly connect the protein features to their annotations. We construct dual fuzzy hypergraphs to capture the intrinsic high-order relations embedded in not only feature space, but also label space. Finally, the subcellular location annotation information is propagated from the labeled proteins to the unlabeled proteins by performing dual fuzzy hypergraph Laplacian regularization. The experimental results on the six protein benchmark datasets demonstrate the superiority of our proposed method by comparing it with the state-of-the-art methods, and illustrate the benefit of exploiting both feature correlations and label correlations.

Characterization and Prediction of Protein Phosphorylation Hotspots in Arabidopsis thaliana.

Science.gov (United States)

Christian, Jan-Ole; Braginets, Rostyslav; Schulze, Waltraud X; Walther, Dirk

2012-01-01

The regulation of protein function by modulating the surface charge status via sequence-locally enriched phosphorylation sites (P-sites) in so-called phosphorylation "hotspots" has gained increased attention in recent years. We set out to identify P-hotspots in the model plant Arabidopsis thaliana. We analyzed the spacing of experimentally detected P-sites within peptide-covered regions along Arabidopsis protein sequences as available from the PhosPhAt database. Confirming earlier reports (Schweiger and Linial, 2010), we found that P-sites indeed tend to cluster and that the distributions of distances from serine and threonine P-sites to their respective closest next P-site differ significantly from those for tyrosine P-sites. The ability to predict P-hotspots by applying available computational P-site prediction programs that focus on identifying single P-sites was observed to be severely compromised by the inevitable interference of nearby P-sites. We devised a new approach, named HotSPotter, for the prediction of phosphorylation hotspots. HotSPotter is based primarily on local amino acid compositional preferences rather than sequence position-specific motifs and uses support vector machines as the underlying classification engine. HotSPotter correctly identified experimentally determined phosphorylation hotspots in A. thaliana with high accuracy. Applied to the Arabidopsis proteome, HotSPotter predicted 13,677 candidate P-hotspots in 9,599 proteins corresponding to 7,847 unique genes. Hotspot-containing proteins are involved predominantly in signaling processes, confirming the surmised modulating role of hotspots in signaling and interaction events. Our study provides new bioinformatics means to identify phosphorylation hotspots and lays the basis for further investigating novel candidate P-hotspots. All phosphorylation hotspot annotations and predictions have been made available as part of the PhosPhAt database at http://phosphat.mpimp-golm.mpg.de.
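The spacing analysis mentioned in the abstract above (the distance from each P-site to its closest next P-site, and the tendency of sites to cluster) can be illustrated with a small Python sketch. This is not HotSPotter, which the abstract describes as an SVM trained on local amino acid composition. The gap-based grouping rule, the `max_gap` parameter and the positions below are assumptions made purely for illustration.

```python
def distance_to_next_site(positions):
    """Distance from each site to the next site along the sequence."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def cluster_sites(positions, max_gap):
    """Group sites into runs in which consecutive sites are at most max_gap apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)  # extends the current run
        else:
            clusters.append([pos])    # starts a new run
    return clusters

# Made-up residue positions of phosphorylation sites in one protein.
p_sites = [44, 12, 40, 15, 43]
gaps = distance_to_next_site(p_sites)
clusters = cluster_sites(p_sites, max_gap=5)
```

Comparing the distribution of such gaps between residue types is the kind of analysis the abstract refers to; any real hotspot definition would need a justified gap threshold and minimum cluster size.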
Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design.

Directory of Open Access Journals (Sweden)

Colin A Smith

Full Text Available Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.
HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species

OpenAIRE

López, Yosvany; Nakai, Kenta; Patil, Ashwini

2015-01-01

HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of p...
InterProSurf: a web server for predicting interacting sites on protein surfaces

Science.gov (United States)

Negi, Surendra S.; Schein, Catherine H.; Oezguen, Numan; Power, Trevor D.; Braun, Werner

2009-01-01

Summary A new web server, InterProSurf, predicts interacting amino acid residues in proteins that are most likely to interact with other proteins, given the 3D structures of subunits of a protein complex. The prediction method is based on solvent accessible surface area of residues in the isolated subunits, a propensity scale for interface residues and a clustering algorithm to identify surface regions with residues of high interface propensities. Here we illustrate the application of InterProSurf to determine which areas of Bacillus anthracis toxins and measles virus hemagglutinin protein interact with their respective cell surface receptors. The computationally predicted regions overlap with those regions previously identified as interface regions by sequence analysis and mutagenesis experiments. PMID:17933856
Chemical Composition, Digestibility and Prediction of the Energy Values of Meat and Bone Meal for Swine - DOI: 10.4025/actascianimsci.v30i1.3597

Directory of Open Access Journals (Sweden)

Horácio Santiago Rostagno

2008-06-01

Full Text Available The objective of this study was to determine the chemical and energetic composition of six different meat and bone meals, and to develop prediction equations for digestible and metabolizable energy based on the chemical composition of the feeds. To determine the digestible and metabolizable energy values, 28 crossbred castrated male swine, averaging 25.90 ± 1.95 kg initial weight, were allotted in a randomized block design with seven treatments, four replicates and one animal per experimental unit. The treatments consisted of a basal diet and six meat and bone meals, each of which replaced 20% of the basal diet. The digestible and metabolizable energy values varied from 1717 to 2908 kcal/kg and from 1519 to 2608 kcal/kg, respectively. The prediction equations for digestible energy (DE) and metabolizable energy (ME) that presented the highest R² for meat and bone meal were: DE = 1196.11 + 44.18 CP - 121.55 P and ME = 2103.35 + 22.56 CP - 164.02 P.
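As a worked illustration, the two prediction equations reported above can be evaluated directly. The Python sketch below simply encodes the published coefficients; everything else is an assumption. In particular, CP is taken to be crude protein and P to be phosphorus, both expressed as percentages of the feed, which the abstract does not state, and the sample composition is invented rather than taken from the study.

```python
def digestible_energy(cp, p):
    """DE (kcal/kg) from the reported equation: 1196.11 + 44.18 CP - 121.55 P."""
    return 1196.11 + 44.18 * cp - 121.55 * p

def metabolizable_energy(cp, p):
    """ME (kcal/kg) from the reported equation: 2103.35 + 22.56 CP - 164.02 P."""
    return 2103.35 + 22.56 * cp - 164.02 * p

# Hypothetical meat and bone meal: 45% CP and 6% P (assumed units, not study data).
de = digestible_energy(45.0, 6.0)
me = metabolizable_energy(45.0, 6.0)
```

For this made-up sample the equations give roughly 2455 kcal/kg DE and 2134 kcal/kg ME, both inside the ranges reported in the abstract, which is at least consistent with the assumed units.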
A web server for analysis, comparison and prediction of protein ligand binding sites.

Science.gov (United States)

Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S

2016-03-25

One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to help users understand protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify the residues preferred in interaction and the binding motif for a given ligand; for example, the residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding sites. This module indicates that the ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to facilitate users in creating web logo and two-sample between ligand interacting and non-interacting residues. In summary, this manuscript presents a web server for analysis of ligand-interacting residues. This server is available for public use at http://crdd.osdd.net/raghava/lpicom .
Prediction of Carbohydrate-Binding Proteins from Sequences Using Support Vector Machines

Directory of Open Access Journals (Sweden)

Seizi Someya

2010-01-01

Full Text Available Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions, and we have developed a method for predicting them from their amino acid sequences. Our method is based on support vector machines (SVMs). We first clarified the definition of carbohydrate-binding proteins and then constructed positive and negative datasets with which the SVMs were trained. In a leave-one-out test on these datasets, our method achieved an area under the receiver operating characteristic (ROC) curve of 0.92. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated the performance of these methods. When we applied our method in combination with the homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction was improved.
Computational methods using weighed-extreme learning machine to predict protein self-interactions with protein evolutionary information.

Science.gov (United States)

An, Ji-Yong; Zhang, Lei; Zhou, Yong; Zhao, Yu-Jun; Wang, Da-Fu

2017-08-18

Self-interacting proteins (SIPs) are important for biological activity owing to the inherent interaction amongst their secondary structures or domains. However, due to the limitations of experimental self-interaction detection, one major challenge in SIP prediction is how to exploit computational approaches for SIP detection based on the evolutionary information contained in protein sequences. In this work, we present a novel computational approach named WELM-LAG, which combines the Weighed-Extreme Learning Machine (WELM) classifier with Local Average Group (LAG) to predict SIPs based on protein sequence. The major improvement of our method lies in an effective feature extraction method that represents candidate self-interacting proteins by exploring the evolutionary information embedded in the PSI-BLAST-constructed position specific scoring matrix (PSSM), followed by a reliable and robust WELM classifier to carry out classification. In addition, the Principal Component Analysis (PCA) approach is used to reduce the impact of noise. The WELM-LAG method gave very high average accuracies of 92.94% and 96.74% on yeast and human datasets, respectively. We also compared it with the state-of-the-art support vector machine (SVM) classifier and other existing methods on the human and yeast datasets. Comparative results indicated that our approach is very promising and may provide a cost-effective alternative for predicting SIPs. In addition, we developed a freely available web server called WELM-LAG-SIPs to predict SIPs. The web server is available at http://219.219.62.123:8888/WELMLAG/ .
Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

Directory of Open Access Journals (Sweden)

Borodovsky Mark

2006-03-01

Full Text Available Abstract. Background: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence prediction algorithms assume that information about other (homologous) proteins is not available, while algorithms of the second type assume that information about homologous proteins is available, and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. Results: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. Conclusions: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable
Sequence-based feature prediction and annotation of proteins

DEFF Research Database (Denmark)

Juncker, Agnieszka; Jensen, Lars J.; Pierleoni, Andrea

2009-01-01

A recent trend in computational methods for annotation of protein function is that many prediction tools are combined in complex workflows and pipelines to facilitate the analysis of feature combinations, for example, the entire repertoire of kinase-binding motifs in the human proteome....
Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms

Science.gov (United States)

Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei

2016-01-01

Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851
A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction.

Science.gov (United States)

Spencer, Matt; Eickholt, Jesse; Cheng, Jianlin

2015-01-01

Ab initio protein secondary structure (SS) predictions are utilized to generate tertiary structure predictions, which are increasingly demanded due to the rapid discovery of proteins. Although recent developments have slightly exceeded previous methods of SS prediction, accuracy has stagnated around 80 percent and many wonder if prediction cannot be advanced beyond this ceiling. Disciplines that have traditionally employed neural networks are experimenting with novel deep learning techniques in attempts to stimulate progress. Since neural networks have historically played an important role in SS prediction, we wanted to determine whether deep learning could contribute to the advancement of this field as well. We developed an SS predictor that makes use of the position-specific scoring matrix generated by PSI-BLAST and deep learning network architectures, which we call DNSS. Graphical processing units and CUDA software optimize the deep network architecture and efficiently train the deep networks. Optimal parameters for the training process were determined, and a workflow comprising three separately trained deep networks was constructed in order to make refined predictions. This deep learning network approach was used to predict SS for a fully independent test dataset of 198 proteins, achieving a Q3 accuracy of 80.7 percent and a Sov accuracy of 74.2 percent.
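The Q3 accuracy quoted in this record is the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch of that calculation follows; the function name and the two ten-residue strings are hypothetical illustrations, not data or code from the study.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label (H, E or C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical example strings (assumed for illustration only).
observed = "HHHHCCEEEC"
predicted = "HHHCCCEEEE"
print(f"Q3 = {q3_accuracy(predicted, observed):.1%}")
```

In practice the score is usually accumulated over every residue of every protein in the test set rather than averaged per protein, so long chains weigh more than short ones.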
Prediction of membrane transport proteins and their substrate specificities using primary sequence information.

Directory of Open Access Journals (Sweden)

Nitish K Mishra

Membrane transport proteins (transporters) move hydrophilic substrates across hydrophobic membranes and play vital roles in most cellular functions. Transporters represent a diverse group of proteins that differ in topology, energy coupling mechanism, and substrate specificity as well as sequence similarity. Among the functional annotations of transporters, information about their transporting substrates is especially important. The experimental identification and characterization of transporters is currently costly and time-consuming. The development of robust bioinformatics-based methods for the prediction of membrane transport proteins and their substrate specificities is therefore an important and urgent task. Support vector machine (SVM)-based computational models, which comprehensively utilize integrative protein sequence features such as amino acid composition, dipeptide composition, physico-chemical composition, biochemical composition, and position-specific scoring matrices (PSSM), were developed to predict the substrate specificity of seven transporter classes: amino acid, anion, cation, electron, protein/mRNA, sugar, and other transporters. An additional model to differentiate transporters from non-transporters was also developed. Among the developed models, the biochemical composition and PSSM hybrid model outperformed other models and achieved an overall average prediction accuracy of 76.69% with a Matthews correlation coefficient (MCC) of 0.49 and a receiver operating characteristic area under the curve (AUC) of 0.833 on our main dataset. This model also achieved an overall average prediction accuracy of 78.88% and an MCC of 0.41 on an independent dataset. Our analyses suggest that evolutionary information (i.e., the PSSM) and the AAIndex are key features for the substrate specificity prediction of transport proteins. In comparison, similarity-based methods such as BLAST, PSI-BLAST, and hidden Markov models do not provide accurate predictions
Rationally designed synthetic protein hydrogels with predictable mechanical properties.

Science.gov (United States)

Wu, Junhua; Li, Pengfei; Dong, Chenling; Jiang, Heting; Xue, Bin; Gao, Xiang; Qin, Meng; Wang, Wei; Chen, Bin; Cao, Yi

2018-02-12

Designing synthetic protein hydrogels with tailored mechanical properties similar to naturally occurring tissues is an eternal pursuit in tissue engineering and stem cell and cancer research. However, it remains challenging to correlate the mechanical properties of protein hydrogels with the nanomechanics of individual building blocks. Here we use single-molecule force spectroscopy, protein engineering and theoretical modeling to prove that the mechanical properties of protein hydrogels are predictable based on the mechanical hierarchy of the cross-linkers and the load-bearing modules at the molecular level. These findings provide a framework for rationally designing protein hydrogels with independently tunable elasticity, extensibility, toughness and self-healing. Using this principle, we demonstrate the engineering of self-healable muscle-mimicking hydrogels that can significantly dissipate energy through protein unfolding. We expect that this principle can be generalized for the construction of protein hydrogels with customized mechanical properties for biomedical applications.
NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features.

Directory of Open Access Journals (Sweden)

Bent Petersen

β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features.

Science.gov (United States)

Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

2010-11-30

β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
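The two-class measures listed in this record (MCC, Qtotal, sensitivity, PPV) can all be derived from the four cells of a confusion matrix. The sketch below shows the standard formulas; the function name and the counts are hypothetical and are not taken from the BT426 evaluation.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard two-class measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    # MCC denominator: geometric mean of the four marginal totals.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "q_total": (tp + tn) / total,      # overall accuracy
        "sensitivity": tp / (tp + fn),     # recall for the turn class
        "ppv": tp / (tp + fp),             # positive predictive value
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts (assumed for illustration only).
for name, value in binary_metrics(tp=60, fp=20, tn=100, fn=20).items():
    print(f"{name}: {value:.3f}")
```

MCC is often preferred over overall accuracy for β-turn prediction because the two classes are unbalanced: a predictor that labels every residue as non-turn would score a high accuracy but an MCC of zero.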
Predictive and comparative analysis of Ebolavirus proteins

Science.gov (United States)

Cong, Qian; Pei, Jimin; Grishin, Nick V

2015-01-01

Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395
Metabolizable energy of feedstuffs determined in Japanese quails (Coturnix coturnix japonica)

Directory of Open Access Journals (Sweden)

José Humberto Vilar da Silva

2003-12-01

Experiment one was carried out to determine the apparent metabolizable energy (AME) and nitrogen-corrected apparent metabolizable energy (AMEn) of nine feedstuffs using growing Japanese quails. The objective of experiment two was to compare diet formulations using the AMEn of corn and soybean meal determined for broilers and laying hens with those using values determined with quails at 22 to 27 days of age and at 65 days of age. In experiment one, 400 growing quails received a basal diet (BD) and nine mixtures composed of 70% BD + 30% of the test feedstuffs, totaling ten treatments, each with four replicates of ten birds. In experiment two, 160 laying European quails received three treatments over three periods of 15 days each, with twelve replicates of five birds. The AME and AMEn values (kcal/kg) determined for the feedstuffs of plant origin were, respectively, 3,340 and 3,354 for ground corn, 2,718 and 2,456 for soybean meal, 3,453 and 3,084 for extruded whole soybean, 1,624 and 1,593 for wheat bran, 4,558 and 3,992 for corn gluten meal, 3,329 and 3,378 for cassava flour, and 1,238 and 1,223 for whole mesquite pod meal; for the feedstuffs of animal origin they were, respectively, 2,874 and 2,453 for fish meal and 3,090 and 2,791 for viscera meal. The AMEn of corn and soybean meal estimated with quails did not improve feed intake, egg production, egg weight or feed conversion per egg mass, validating the use of the energy values of these ingredients determined with broilers and laying hens to formulate quail diets.
Estimates of nutritional requirements and use of Small Ruminant Nutrition System model for hair sheep in semiarid conditions

Directory of Open Access Journals (Sweden)

Alessandra Pinto de Oliveira

2014-09-01

The objective was to determine the efficiency of utilization of metabolizable energy for maintenance (km) and weight gain (kf) and the dietary requirements of total digestible nutrients (TDN) and metabolizable protein (MP), as well as to evaluate the Small Ruminant Nutrition System (SRNS) model for predicting the dry matter intake (DMI) and the average daily gain (ADG) of Santa Ines lambs fed diets containing different levels of metabolizable energy (ME). Thirty-five non-castrated lambs, with an initial body weight (BW) of 14.77 ± 1.26 kg at approximately two months of age, were used. At the beginning of the experiment, five animals were slaughtered to serve as a reference for estimating the empty body weight (EBW) and initial body composition of the 30 remaining animals, which were distributed in a randomized block design with five treatments (1.13, 1.40, 1.73, 2.22 and 2.60 Mcal/kg DM) and six replicates. The requirement of metabolizable energy for maintenance was 78.53 kcal/kg EBW^0.75/day, with a utilization efficiency of 66%. The average efficiency of metabolizable energy utilization for weight gain was 48%. The dietary requirements of TDN and MP increased with the increase in BW and ADG of the animals. The SRNS model underestimated the DMI and ADG of the animals by 6.2% and 24.6%, respectively. It is concluded that the values of km and kf are consistent with those observed in several studies with lambs raised in the tropics. The dietary requirements of TDN and MP of Santa Ines lambs for different BW and ADG are approximately 42% and 24% lower, respectively, than those suggested by the American system of evaluation of feed and nutrient requirements of small ruminants. The SRNS model was sensitive in predicting the DMI of Santa Ines lambs; however, for the variable ADG more studies are needed, since the model underestimated the response of the animals in this study.
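The maintenance figures in this record are expressed per kilogram of metabolic weight (EBW raised to the power 0.75), so a daily requirement for a given animal follows by simple scaling. The sketch below uses only the two values reported in the abstract; the function names and the 16 kg example animal are assumptions, and treating net energy for maintenance as km times the ME requirement is the conventional definition of km rather than a formula stated in the record.

```python
ME_MAINTENANCE = 78.53  # kcal ME per kg EBW^0.75 per day (from the abstract)
KM = 0.66               # efficiency of ME use for maintenance (from the abstract)


def metabolic_weight(ebw_kg: float) -> float:
    """Metabolic body size: empty body weight raised to the power 0.75."""
    return ebw_kg ** 0.75


def me_for_maintenance(ebw_kg: float) -> float:
    """Daily metabolizable energy needed for maintenance, in kcal."""
    return ME_MAINTENANCE * metabolic_weight(ebw_kg)


def ne_for_maintenance(ebw_kg: float) -> float:
    """Daily net energy for maintenance, assuming NEm = km * MEm."""
    return KM * me_for_maintenance(ebw_kg)


# Hypothetical lamb with a 16 kg empty body weight (16 ** 0.75 == 8).
ebw = 16.0
print(f"MEm = {me_for_maintenance(ebw):.1f} kcal/day")
print(f"NEm = {ne_for_maintenance(ebw):.1f} kcal/day")
```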

C-Reactive Protein, Fibrinogen, and Cardiovascular Disease Prediction

NARCIS (Netherlands)

Kaptoge, Stephen; Di Angelantonio, Emanuele; Pennells, Lisa; Wood, Angela M.; White, Ian R.; Gao, Pei; Walker, Matthew; Thompson, Alexander; Sarwar, Nadeem; Caslake, Muriel; Butterworth, Adam S.; Amouyel, Philippe; Assmann, Gerd; Bakker, Stephan J. L.; Barr, Elizabeth L. M.; Barrett-Connor, Elizabeth; Benjamin, Emelia J.; Bjorkelund, Cecilia; Brenner, Hermann; Brunner, Eric; Clarke, Robert; Cooper, Jackie A.; Cremer, Peter; Cushman, Mary; Dagenais, Gilles R.; D'Agostino, Ralph B.; Dankner, Rachel; Davey-Smith, George; Deeg, Dorly; Dekker, Jacqueline M.; Engstrom, Gunnar; Folsom, Aaron R.; Fowkes, F. Gerry R.; Gallacher, John; Gaziano, J. Michael; Giampaoli, Simona; Gillum, Richard F.; Hofman, Albert; Howard, Barbara V.; Ingelsson, Erik; Iso, Hiroyasu; Jorgensen, Torben; Kiechl, Stefan; Kitamura, Akihiko; Kiyohara, Yutaka; Koenig, Wolfgang; Kromhout, Daan; Kuller, Lewis H.; Lawlor, Debbie A.; Meade, Tom W.

2012-01-01

Background There is debate about the value of assessing levels of C-reactive protein (CRP) and other biomarkers of inflammation for the prediction of first cardiovascular events. Methods We analyzed data from 52 prospective studies that included 246,669 participants without a history of
Prediction of thermodynamic instabilities of protein solutions from simple protein–protein interactions

International Nuclear Information System (INIS)

D’Agostino, Tommaso; Solana, José Ramón; Emanuele, Antonio

2013-01-01

Highlights: ► We propose a model of effective protein–protein interaction embedding solvent effects. ► A previous square-well model is enhanced by giving the interaction a free energy character. ► The temperature dependence of the interaction is due to entropic effects of the solvent. ► The validity of the original SW model is extended to entropy driven phase transitions. ► We get good fits for lysozyme and haemoglobin spinodal data taken from literature. - Abstract: Statistical thermodynamics of protein solutions is often studied in terms of simple, microscopic models of particles interacting via pairwise potentials. Such modelling can reproduce the short range structure of protein solutions at equilibrium and predict thermodynamic instabilities of these systems. We introduce a square well model of effective protein–protein interaction that embeds the solvent’s action. We modify an existing model [45] by considering a well depth having an explicit dependence on temperature, i.e. an explicit free energy character, thus encompassing the statistically relevant configurations of solvent molecules around proteins. We choose protein solutions exhibiting demixing upon temperature decrease (lysozyme, enthalpy driven) and upon temperature increase (haemoglobin, entropy driven). We obtain satisfactory fits of spinodal curves for both proteins without adding any mean field term, thus extending the validity of the original model. Our results underline the solvent role in modulating or stretching the interaction potential
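For readers unfamiliar with the model class, a square-well potential with a temperature-dependent well can be written as below. This is only a sketch: the hard-core diameter σ, the range parameter λ, and in particular the linear form of the well free energy in terms of an enthalpic part Δh and an entropic part Δs are assumptions made for illustration, since the record does not give the functional form used by the authors.

```latex
u(r;T) =
\begin{cases}
  +\infty, & r < \sigma \\
  \Delta g(T), & \sigma \le r < \lambda\sigma \\
  0, & r \ge \lambda\sigma
\end{cases}
\qquad
\Delta g(T) = \Delta h - T\,\Delta s
```

Under this assumed form, a well with Δh < 0 and Δs < 0 deepens on cooling, which would correspond to enthalpy driven demixing upon temperature decrease, while a well with Δs > 0 deepens on heating, which would correspond to entropy driven demixing upon temperature increase.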
Metabolizable energy of corn hybrids for broiler chickens at different ages

Directory of Open Access Journals (Sweden)

Reinaldo Kanji Kato

2011-12-01

We determined the values of apparent (AME), apparent corrected (AMEn), true (TME) and true corrected (TMEn) metabolizable energy of six corn hybrids for broiler chickens in the phases of 1-7, 8-14, 15-21, 22-28, 29-35 and 36-42 days of age, using the substitution method (40% of the reference diet replaced by the test ingredient). Ross-308 male chicks (1,134) were allotted to metabolism cages, and the number of birds per experimental unit was adjusted in each phase to suit the bird density in the cage, with six replicates per treatment. Simultaneously, birds were kept fasting for the determination of metabolic and endogenous losses in each study phase. The birds received water and feed ad libitum during the experimental period. The birds were maintained in metabolism cages for seven days: four days for adaptation to the cage and feed, and three days for excreta collection. The corn energy values were significantly lower only in the pre-initial phase (1-7 days). Thus, AMEn values for corn of 3563 kcal/kg DM for birds aged 1 to 7 days and 3778 kcal/kg DM for birds older than 7 days are recommended for broiler feed formulation. The agronomic characteristics of the corn had no influence on the energy values for the birds.
Limitations of polyethylene glycol-induced precipitation as predictive tool for protein solubility during formulation development.

Science.gov (United States)

Hofmann, Melanie; Winzer, Matthias; Weber, Christian; Gieseler, Henning

2018-05-01

Polyethylene glycol (PEG)-induced protein precipitation is often used to extrapolate apparent protein solubility at specific formulation compositions. The procedure was used for several fields of application such as protein crystal growth but also protein formulation development. Nevertheless, most studies focused on applicability in protein crystal growth. In contrast, this study focuses on applicability of PEG-induced precipitation during high-concentration protein formulation development. In this study, solubility of three different model proteins was investigated over a broad range of pH. Solubility values predicted by PEG-induced precipitation were compared to real solubility behaviour determined by either turbidity or content measurements. Predicted solubility by PEG-induced precipitation was confirmed for an Fc fusion protein and a monoclonal antibody. In contrast, PEG-induced precipitation failed to predict solubility of a single-domain antibody construct. Applicability of PEG-induced precipitation as indicator of protein solubility during formulation development was found to be not valid for one of three model molecules. Under certain conditions, PEG-induced protein precipitation is not valid for prediction of real protein solubility behaviour. The procedure should be used carefully as tool for formulation development, and the results obtained should be validated by additional investigations. © 2017 Royal Pharmaceutical Society.
PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein-Protein Interactions from Protein Sequences.

Science.gov (United States)

Wang, Yanbin; You, Zhuhong; Li, Xiao; Chen, Xing; Jiang, Tonghai; Zhang, Jingting

2017-05-11

Protein-protein interactions (PPIs) are essential for the processes of most living organisms. Thus, detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although many PPI data have been generated by high-throughput technologies for a variety of organisms, the whole interactome is still far from complete. In addition, the high-throughput technologies for detecting PPIs have some unavoidable defects, including time consumption, high cost, and high error rate. In recent years, with the development of machine learning, computational methods have been broadly used to predict PPIs and can achieve good prediction rates. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, a Zernike moments (ZM) descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). Then, a PCVM classifier is used to infer the interactions among proteins. When applied to PPI datasets of Yeast and H. pylori, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the performance of the proposed method, the state-of-the-art support vector machines (SVM) classifier is used and compared with the PCVM model. Experimental results on the Yeast dataset show that the performance of the PCVM classifier is better than that of the SVM classifier. The experimental results indicate that our proposed method is robust, powerful and feasible, and can be used as a helpful tool for proteomics research.
Combining neural networks for protein secondary structure prediction

DEFF Research Database (Denmark)

Riis, Søren Kamaric

1995-01-01

In this paper structured neural networks are applied to the problem of predicting the secondary structure of proteins. A hierarchical approach is used where specialized neural networks are designed for each structural class and then combined using another neural network. The submodels are designed...... by using a priori knowledge of the mapping between protein building blocks and the secondary structure and by using weight sharing. Since none of the individual networks have more than 600 adjustable weights over-fitting is avoided. When ensembles of specialized experts are combined the performance...
Prediction of residue-residue contact matrix for protein-protein interaction with Fisher score features and deep learning.

Science.gov (United States)

Du, Tianchuan; Liao, Li; Wu, Cathy H; Sun, Bilin

2016-11-01

Protein-protein interactions play essential roles in many biological processes. Acquiring knowledge of the residue-residue contact information of two interacting proteins is not only helpful in annotating functions for proteins, but also critical for structure-based drug design. The prediction of the protein residue-residue contact matrix of the interfacial regions is challenging. In this work, we introduced deep learning techniques (specifically, stacked autoencoders) to build deep neural network models to tackle the residue-residue contact prediction problem. In tandem with interaction profile Hidden Markov Models, which were used first to extract Fisher score features from protein sequences, stacked autoencoders were deployed to extract and learn hidden abstract features. The deep learning model showed significant improvement over the traditional machine learning model, Support Vector Machines (SVM), with the overall accuracy increasing by about 15 percentage points, from 65.40% to 80.82%. We showed that the stacked autoencoders could extract novel features out of the Fisher score features, which can be utilized by deep neural networks and other classifiers to enhance learning. It is further shown that deep neural networks have significant advantages over SVM in making use of the newly extracted features. Copyright © 2016. Published by Elsevier Inc.
Prediction of interactions between viral and host proteins using supervised machine learning methods.

Directory of Open Access Journals (Sweden)

Ranjan Kumar Barman

BACKGROUND: Viral-host protein-protein interaction plays a vital role in pathogenesis, since it defines viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has great implications for therapeutics. METHODS: In this study, a systematic attempt has been made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information, using viral-host PPIs from VirusMINT. Three well-known supervised machine learning methods that are commonly used in the prediction of PPIs (SVM, Naïve Bayes and Random Forest) were employed, and their performance was evaluated by five-fold cross-validation. RESULTS: Out of 44 descriptors, the best features were found to be domain-domain association and the methionine, serine and valine amino acid composition of viral proteins. In this study, the SVM-based method achieved a better sensitivity of 67% compared with Naïve Bayes (37.49%) and Random Forest (55.66%). However, the specificity of Naïve Bayes was the highest (99.52%) as compared with SVM (74%) and Random Forest (89.08%). Overall, the SVM and Random Forest achieved accuracies of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on a blind dataset and attained a sensitivity of 64%, specificity of 83%, and accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs have been predicted through the proposed SVM model and validated by gene ontology enrichment analysis. Our proposed model shows that the hepatitis B virus "C protein" binds to membrane docking protein, while the "X protein" and "P protein" interact with cell-killing and metabolic process proteins, respectively. CONCLUSION: The proposed method can predict large scale interspecies viral-human PPIs. The nature and function of unknown viral proteins (HBV and HEV, interacting partners of host
Topology and weights in a protein domain interaction network--a novel way to predict protein interactions.

Science.gov (United States)

Wuchty, Stefan

2006-05-23

While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations between the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins whenever we find a high scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and the weights of domain interactions as separate entities. In particular, this finding confirms earlier observations that a protein interaction is a matter of a single interaction event on the domain level. As an immediate application, we show a simple way to predict potential protein interactions
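The idea that a protein interaction is governed by its single highest-scoring domain interaction can be sketched as follows. This is an illustrative reading of the abstract, not the author's code; the domain identifiers, scores and threshold are hypothetical.

```python
from itertools import product


def protein_interaction_score(domains_a, domains_b, domain_scores):
    """Score a protein pair by its best-scoring domain pair (0.0 if none)."""
    best = 0.0
    for da, db in product(domains_a, domains_b):
        # frozenset makes the lookup independent of domain order
        best = max(best, domain_scores.get(frozenset((da, db)), 0.0))
    return best


# Hypothetical domain-domain probability scores.
domain_scores = {
    frozenset(("D1", "D2")): 0.90,
    frozenset(("D1", "D3")): 0.20,
    frozenset(("D4", "D4")): 0.55,   # a domain interacting with itself
}

score = protein_interaction_score(["D1", "D4"], ["D2", "D3"], domain_scores)
predicted = score >= 0.5             # hypothetical cut-off for "high scoring"
print(score, predicted)
```

Taking the maximum rather than a sum or product of scores encodes the single-interaction-event assumption directly: extra weak domain pairs neither help nor hurt a prediction.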
DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

KAUST Repository

Kulmanov, Maxat

2017-09-27

Motivation A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. Results We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein–protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations.
AptRank: an adaptive PageRank model for protein function prediction on bi-relational graphs.

Science.gov (United States)

Jiang, Biaobin; Kloster, Kyle; Gleich, David F; Gribskov, Michael

2017-06-15

Diffusion-based network models are widely used for protein function prediction using protein network data and have been shown to outperform neighborhood-based and module-based methods. Recent studies have shown that integrating the hierarchical structure of the Gene Ontology (GO) data dramatically improves prediction accuracy. However, previous methods usually either used the GO hierarchy to refine the prediction results of multiple classifiers, or flattened the hierarchy into a function-function similarity kernel. No study has taken the GO hierarchy into account together with the protein network as a two-layer network model. We first construct a Bi-relational graph (Birg) model comprised of both protein-protein association and function-function hierarchical networks. We then propose two diffusion-based methods, BirgRank and AptRank, both of which use PageRank to diffuse information on this two-layer graph model. BirgRank is a direct application of traditional PageRank with fixed decay parameters. In contrast, AptRank utilizes an adaptive diffusion mechanism to improve the performance of BirgRank. We evaluate the ability of both methods to predict protein function on yeast, fly and human protein datasets, and compare with four previous methods: GeneMANIA, TMC, ProteinRank and clusDCA. We design four different validation strategies: missing function prediction, de novo function prediction, guided function prediction and newly discovered function prediction to comprehensively evaluate predictability of all six methods. We find that both BirgRank and AptRank outperform the previous methods, especially in missing function prediction when using only 10% of the data for training. The MATLAB code is available at https://github.rcac.purdue.edu/mgribsko/aptrank . gribskov@purdue.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. 
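To make the phrase "use PageRank to diffuse information on this two-layer graph" concrete, here is a minimal personalized PageRank sketch with a fixed decay parameter, in the spirit of BirgRank as described above. It is not the authors' MATLAB implementation: the toy graph, the node names and the choice to treat every edge as undirected are assumptions made for illustration, and the adaptive mechanism of AptRank is not shown.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration for PageRank with restart to the seed nodes."""
    nodes = sorted(adj)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            nbrs = adj[n]
            if not nbrs:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += alpha * rank[n] / len(seeds)
                continue
            share = alpha * rank[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        rank = nxt
    return rank


# Toy two-layer graph (hypothetical): proteins P1-P3, GO terms F1-F2.
# P1-P2 and P2-P3 are protein associations, P1-F1 is an annotation,
# and F1-F2 is a hierarchy edge between functions.
graph = {
    "P1": ["P2", "F1"],
    "P2": ["P1", "P3"],
    "P3": ["P2"],
    "F1": ["P1", "F2"],
    "F2": ["F1"],
}

rank = personalized_pagerank(graph, seeds={"P3"})
functions_ranked = sorted(["F1", "F2"], key=rank.get, reverse=True)
print(functions_ranked)
```

Restarting at the unannotated protein P3 and reading off the scores of the function nodes gives a ranked list of candidate annotations; the more distant term F2 receives less diffused mass than F1.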
Accurate microRNA target prediction correlates with protein repression levels

Directory of Open Access Journals (Sweden)

Simossis Victor A

2009-09-01

Full Text Available Abstract Background MicroRNAs are small endogenously expressed non-coding RNA molecules that regulate target gene expression through translation repression or messenger RNA degradation. MicroRNA regulation is performed through pairing of the microRNA to sites in the messenger RNA of protein coding genes. Since experimental identification of miRNA target genes poses difficulties, computational microRNA target prediction is one of the key means in deciphering the role of microRNAs in development and disease. Results DIANA-microT 3.0 is an algorithm for microRNA target prediction which is based on several parameters calculated individually for each microRNA and combines conserved and non-conserved microRNA recognition elements into a final prediction score, which correlates with protein production fold change. Specifically, for each predicted interaction the program reports a signal to noise ratio and a precision score which can be used as an indication of the false positive rate of the prediction. Conclusion Recently, several computational target prediction programs were benchmarked based on a set of microRNA target genes identified by the pSILAC method. In this assessment DIANA-microT 3.0 was found to achieve the highest precision among the most widely used microRNA target prediction programs reaching approximately 66%. The DIANA-microT 3.0 prediction results are available online in a user friendly web server at http://www.microrna.gr/microT
Coefficients of metabolizability of gross energy of different ingredients for broiler chickens

Directory of Open Access Journals (Sweden)

Ricardo Vianna Nunes

2008-01-01

Full Text Available To determine the apparent metabolizable energy (AME), corrected apparent metabolizable energy (AMEn), true metabolizable energy (TME) and corrected true metabolizable energy (TMEn) values of eight feedstuffs and their respective coefficients of metabolizability, 300 male Ross broiler chickens, 21 days old, were assigned to eight treatments (feedstuffs) and one reference diet, in three blocks with two replicates per block and five birds per experimental unit. The feedstuffs evaluated were two samples of wheat grain (TI), two of wheat bran (FT), two of corn grain (MI), one of sorghum grain (SO) and one of 21% corn gluten meal (21% FGM), each of which replaced 30% of the reference diet. The birds were fed ad libitum for 12 days, with the final five days used for excreta collection. During the five collection days, 30 birds distributed across six cages were fasted for 72 hours (the first 24 hours to empty the gastrointestinal tract and the remaining 48 hours for excreta collection), and the excreta were quantified and extrapolated to five days. The AME and AMEn values, in kcal/kg DM, averaged 3,391 and 3,275 for TI, 2,076 and 1,996 for FT, 3,862 and 3,768 for MI, 3,551 and 3,464 for SO, and 1,992 and 1,901 for 21% FGM. The TME and TMEn values, in kcal/kg DM, averaged 3,495 and 3,496 for TI, 2,195 and 2,146 for FT, 3,981 and 4,040 for MI, 3,652 and 3,680 for SO, and 2,117 and 1,961 for 21% FGM. The coefficients of metabolizability of gross energy averaged 68.94%.
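A coefficient of metabolizability of gross energy is conventionally the metabolizable energy expressed as a percentage of the gross energy (GE) of the feedstuff. The sketch below assumes that definition; the abstract does not report GE values, so the GE figure used here is hypothetical and the result should not be read as a value from the study.

```python
def metabolizability_coefficient(me_kcal_per_kg, ge_kcal_per_kg):
    """Metabolizable energy as a percentage of gross energy."""
    return 100.0 * me_kcal_per_kg / ge_kcal_per_kg


ame_wheat_grain = 3391.0   # AME for wheat grain (TI), kcal/kg DM, from the abstract
ge_assumed = 4500.0        # hypothetical GE, kcal/kg DM (not reported above)

coef = metabolizability_coefficient(ame_wheat_grain, ge_assumed)
print(f"{coef:.2f}%")
```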
Net efficiency of metabolizable energy utilization of purebred and crossbred Nellore young bulls fed diets with different concentrate levels

Directory of Open Access Journals (Sweden)

José Antônio de Freitas

2006-06-01

Full Text Available The objective of this study was to estimate the efficiencies of metabolizable energy (ME) utilization for maintenance (Km) and weight gain (Kg) in purebred and crossbred Nellore cattle. Seventy-two intact male cattle were used, with an initial age of 10 to 11 months (18 Nellore, 18 F1 Nellore x Angus, 18 F1 Nellore x Brown Swiss and 18 F1 Nellore x Simmental) and mean initial weights of 286, 309, 333 and 310 kg, respectively. A completely randomized design in a 4 x 4 factorial arrangement was adopted, with three animals per genetic group and four levels of concentrate inclusion (30, 40, 60 and 70% of DM). Three animals of each genetic group were allocated to the maintenance group and three were slaughtered at the beginning of the experiment. The ME intake for maintenance (MEIm), in kcal/kg^0.75, corresponded to the point at which the ratio between fasting heat production (FHP) and ME intake (MEI) was closest to 1. The efficiencies of ME utilization for maintenance (Km) were estimated by dividing fasting heat production by MEIm. The efficiency of ME utilization for weight gain (Kg) was estimated by regressing retained energy (kcal/kg^0.75) on the ME intake for gain (MEIg). ME requirements were obtained by dividing net requirements by the value of Km. Genetic group and dietary concentrate level had no significant effect on Km and Kg, which were 0.67 and 0.40, respectively. The ME requirements for gain (MEg) and total ME requirements (MEt) increased with increasing live weight (LW). On the other hand, MEt and MEg per unit of empty body weight (EBW) decreased with increasing LW, indicating greater efficiency of ME utilization as the live weight of the animals increased.
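The two calculations described in this record (Km as fasting heat production divided by ME intake at maintenance, and an ME requirement as a net requirement divided by an efficiency) can be sketched as below. The input energies are hypothetical numbers picked so that Km reproduces the reported 0.67; only the efficiencies 0.67 and 0.40 come from the abstract.

```python
def maintenance_efficiency(fasting_heat_production, me_intake_maintenance):
    """Km: fasting heat production divided by ME intake at maintenance."""
    return fasting_heat_production / me_intake_maintenance


def me_requirement(net_requirement, efficiency):
    """ME needed to cover a net energy requirement at a given efficiency."""
    return net_requirement / efficiency


# Hypothetical inputs in kcal/kg^0.75 per day (not from the study).
fhp = 67.0
meim = 100.0
km = maintenance_efficiency(fhp, meim)

# Efficiency for gain reported in the abstract.
kg_gain = 0.40
me_for_gain = me_requirement(20.0, kg_gain)   # 20.0 is a hypothetical net requirement
print(km, me_for_gain)
```

Because Kg is lower than Km, a unit of net energy deposited as gain costs more metabolizable energy than a unit spent on maintenance.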
Predicting beta-turns in proteins using support vector machines with fractional polynomials.

Science.gov (United States)

Elbashir, Murtada; Wang, Jianxin; Wu, Fang-Xiang; Wang, Lusheng

2013-11-07

β-turns are a secondary structure type that plays an essential role in molecular recognition, protein folding, and stability. They are the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, and can provide valuable insights and inputs for fold recognition and drug design. We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves a Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets, respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance than other competing prediction methods.
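Qtotal and MCC, the two measures named in this record, are both functions of per-residue confusion counts. The following sketch uses the standard definitions; the residue counts are hypothetical and are not results from the paper.

```python
import math


def q_total(tp, tn, fp, fn):
    """Percentage of residues whose turn / non-turn state is predicted correctly."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical per-residue counts with roughly 25% turn residues.
counts = dict(tp=150, tn=680, fp=70, fn=100)
print(round(q_total(**counts), 2), round(mcc(**counts), 3))
```

MCC is reported alongside Qtotal because β-turn residues are the minority class: a predictor that labels every residue "non-turn" would still reach a Qtotal near 75% but an MCC of 0.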
LoopIng: a template-based tool for predicting the structure of protein loops.

KAUST Repository

Messih, Mario Abdel

2015-08-06

Predicting the structure of protein loops is very challenging, mainly because they are not necessarily subject to strong evolutionary pressure. This implies that, unlike the rest of the protein, standard homology modeling techniques are not very effective in modeling their structure. However, loops are often involved in protein function, hence inferring their structure is important for predicting protein structure as well as function. We describe a method, LoopIng, based on the Random Forest automated learning technique, which, given a target loop, selects a structural template for it from a database of loop candidates. Compared to the most recently available methods, LoopIng is able to achieve similar accuracy for short loops (4-10 residues) and significant enhancements for long loops (11-20 residues). The quality of the predictions is robust to errors that unavoidably affect the stem regions when these are modeled. The method returns a confidence score for the predicted template loops and has the advantage of being very fast (on average: 1 min/loop). www.biocomputing.it/looping. anna.tramontano@uniroma1.it. Supplementary data are available at Bioinformatics online.
In silico platform for predicting and initiating β-turns in a protein at desired locations.

Science.gov (United States)

Singh, Harinder; Singh, Sandeep; Raghava, Gajendra P S

2015-05-01

Numerous studies have been performed for analysis and prediction of β-turns in a protein. This study focuses on analyzing, predicting, and designing of β-turns to understand the preference of amino acids in β-turn formation. We analyzed around 20,000 PDB chains to understand the preference of residues or pair of residues at different positions in β-turns. Based on the results, a propensity-based method has been developed for predicting β-turns with an accuracy of 82%. We introduced a new approach entitled "Turn level prediction method," which predicts the complete β-turn rather than focusing on the residues in a β-turn. Finally, we developed BetaTPred3, a Random forest based method for predicting β-turns by utilizing various features of four residues present in β-turns. The BetaTPred3 achieved an accuracy of 79% with 0.51 MCC that is comparable or better than existing methods on BT426 dataset. Additionally, models were developed to predict β-turn types with better performance than other methods available in the literature. In order to improve the quality of prediction of turns, we developed prediction models on a large and latest dataset of 6376 nonredundant protein chains. Based on this study, a web server has been developed for prediction of β-turns and their types in proteins. This web server also predicts minimum number of mutations required to initiate or break a β-turn in a protein at specified location of a protein. © 2015 Wiley Periodicals, Inc.
BLProt: Prediction of bioluminescent proteins based on support vector machine and relieff feature selection

KAUST Repository

Kandaswamy, Krishna Kumar

2011-08-17

Background: Bioluminescence is a process in which light is emitted by a living organism. Most creatures that emit light are sea creatures, but some insects, plants, fungi, etc., also emit light. The biotechnological application of bioluminescence has become routine and is considered essential for many medical and general technological advances. Identification of bioluminescent proteins is more challenging due to their poor similarity in sequence. So far, no specific method has been reported to identify bioluminescent proteins from primary sequence. Results: In this paper, we propose a novel predictive method that uses a Support Vector Machine (SVM) and physicochemical properties to predict bioluminescent proteins. BLProt was trained using a dataset consisting of 300 bioluminescent proteins and 300 non-bioluminescent proteins, and evaluated by an independent set of 141 bioluminescent proteins and 18202 non-bioluminescent proteins. To identify the most prominent features, we carried out feature selection with three different filter approaches, ReliefF, infogain, and mRMR. We selected five different feature subsets by decreasing the number of features, and the performance of each feature subset was evaluated. Conclusion: BLProt achieves 80% accuracy from training (5-fold cross-validation) and 80.06% accuracy from testing. The performance of BLProt was compared with BLAST and HMM. High prediction accuracy and successful prediction of hypothetical proteins suggest that BLProt can be a useful approach to identify bioluminescent proteins from sequence information, irrespective of their sequence similarity. 2011 Kandaswamy et al; licensee BioMed Central Ltd.
Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome

Directory of Open Access Journals (Sweden)

McCarthy Fiona M

2007-11-01

Full Text Available Abstract Background The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. Results We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology), we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO) functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. Conclusion We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. Moreover, information from one genome can be used to improve the annotation of other genomes and

Sequence- and interactome-based prediction of viral protein hotspots targeting host proteins: a case study for HIV Nef.

Directory of Open Access Journals (Sweden)

Mahdi Sarmady

Full Text Available Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance but is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear crowding the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk.
Aspects of Energy Metabolism in Mangalitsa Pigs Exposed at Thermic Neutral Temperature

Directory of Open Access Journals (Sweden)

Monica Pârvu

2011-10-01

Full Text Available The study aimed to determine energy metabolism in Mangalitsa pigs kept at thermoneutral temperature, compared with Large White pigs. The experimental period covered 80 to 100 kg liveweight. The animals had free access to standard isoproteic and isocaloric diets with 13.5% crude protein (CP) and 3100 kcal/kg metabolizable energy. Feed intake was measured daily. The energy-protein balance was calculated by comparative slaughter at the beginning and end of the experiment. The metabolizable energy (MEc) was estimated by chemical analysis (feed and excreta) using mathematical modelling and Whittemore's formula. The metabolizable energy utilization efficiency was 0.61 in Large White and 0.53 in Mangalitsa.
Exploration of the omics evidence landscape: adding qualitative labels to predicted protein-protein interactions.

NARCIS (Netherlands)

Noort, V. van; Snel, B.; Huynen, M.A.

2007-01-01

BACKGROUND: In the post-genomic era various functional genomics, proteomics and computational techniques have been developed to elucidate the protein interaction network. While some of these techniques are specific for a certain type of interaction, most predict a mixture of interactions.
ngLOC: software and web server for predicting protein subcellular localization in prokaryotes and eukaryotes

Directory of Open Access Journals (Sweden)

King Brian R

2012-07-01

Full Text Available Abstract Background Understanding protein subcellular localization is a necessary component toward understanding the overall function of a protein. Numerous computational methods have been published over the past decade, with varying degrees of success. Despite the large number of published methods in this area, only a small fraction of them are available for researchers to use in their own studies. Of those that are available, many are limited by predicting only a small number of organelles in the cell. Additionally, the majority of methods predict only a single location for a sequence, even though it is known that a large fraction of the proteins in eukaryotic species shuttle between locations to carry out their function. Findings We present a software package and a web server for predicting the subcellular localization of protein sequences based on the ngLOC method. ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins both in prokaryotes and eukaryotes. The overall prediction accuracy varies from 89.8% to 91.4% across species. This program can predict 11 distinct locations each in plant and animal species. ngLOC also predicts 4 and 5 distinct locations on gram-positive and gram-negative bacterial datasets, respectively. Conclusions ngLOC is a generic method that can be trained by data from a variety of species or classes for predicting protein subcellular localization. The standalone software is freely available for academic use under GNU GPL, and the ngLOC web server is also accessible at http://ngloc.unmc.edu.
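An n-gram-based Bayesian classifier of the kind described here can be illustrated with a small multinomial naive Bayes model over overlapping sequence n-grams. This is a generic sketch, not the ngLOC software: the class name, the add-one smoothing and the toy sequences and labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n):
    """All overlapping substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramBayes:
    """Multinomial naive Bayes over n-grams with add-one smoothing."""

    def __init__(self, n=2, alphabet_size=20):
        self.n = n
        self.vocab = alphabet_size ** n      # number of possible n-grams
        self.counts = defaultdict(Counter)   # label -> n-gram counts
        self.totals = Counter()              # label -> total n-grams seen
        self.class_counts = Counter()        # label -> training sequences

    def fit(self, sequences, labels):
        for seq, lab in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[lab].update(grams)
            self.totals[lab] += len(grams)
            self.class_counts[lab] += 1
        return self

    def log_scores(self, seq):
        n_train = sum(self.class_counts.values())
        scores = {}
        for lab in self.class_counts:
            s = math.log(self.class_counts[lab] / n_train)   # class prior
            for g in ngrams(seq, self.n):
                s += math.log((self.counts[lab][g] + 1)
                              / (self.totals[lab] + self.vocab))
            scores[lab] = s
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy training data (hypothetical sequences and location labels).
model = NGramBayes(n=2).fit(
    ["KKRKKRKK", "LLVLLAVL"],
    ["nucleus", "membrane"],
)
prediction = model.predict("KRKKRK")
print(prediction)
```

Returning the full dictionary from `log_scores` rather than only the top label leaves room for reporting more than one likely location, which matters for proteins that shuttle between compartments.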
De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

Science.gov (United States)

Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

2011-08-01

Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.
Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction.

Science.gov (United States)

de Oliveira, Saulo H P; Law, Eleanor C; Shi, Jiye; Deane, Charlotte M

2018-04-01

Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally. We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20 000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5-2.5 times faster than non-sequential prediction. When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score > 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy. Data are available for download from: http://opig.stats.ox.ac.uk/resources. SAINT2 is available for download from: https://github.com/sauloho/SAINT2. saulo.deoliveira@dtc.ox.ac.uk. Supplementary data are available at Bioinformatics online.
StaRProtein, A Web Server for Prediction of the Stability of Repeat Proteins

Science.gov (United States)

Xu, Yongtao; Zhou, Xu; Huang, Meilan

2015-01-01

Repeat proteins have become increasingly important due to their capability to bind to almost any protein and their potential as an alternative therapy to monoclonal antibodies. In the past decade repeat proteins have been designed to mediate specific protein-protein interactions. The tetratricopeptide and ankyrin repeat proteins are two classes of helical repeat proteins that form different binding pockets to accommodate various partners. It is important to understand the factors that define folding and stability of repeat proteins in order to prioritize the most stable designed repeat proteins to further explore their potential binding affinities. Here we developed distance-dependent statistical potentials using two classes of alpha-helical repeat proteins, tetratricopeptide and ankyrin repeat proteins respectively, and evaluated their efficiency in predicting the stability of repeat proteins. We demonstrated that the repeat-specific statistical potentials based on these two classes of repeat proteins showed superior accuracy compared with non-specific statistical potentials in 1) discriminating correct from incorrect models and 2) ranking the stability of designed repeat proteins. In particular, the statistical scores correlate closely with the equilibrium unfolding free energies of repeat proteins and therefore would serve as a novel tool for quickly prioritizing designed repeat proteins with high stability. The StaRProtein web server was developed for predicting the stability of repeat proteins. PMID:25807112
Disorder Prediction Methods, Their Applicability to Different Protein Targets and Their Usefulness for Guiding Experimental Studies

Directory of Open Access Journals (Sweden)

Jennifer D. Atkins

2015-08-01

The role and function of a given protein is dependent on its structure. In recent years, however, numerous studies have highlighted the importance of unstructured, or disordered, regions in governing a protein’s function. Disordered proteins have been found to play important roles in pivotal cellular functions, such as DNA binding and signalling cascades. Studying proteins with extended disordered regions is often problematic as they can be challenging to express, purify and crystallise. This means that interpretable experimental data on protein disorder are hard to generate. As a result, predictive computational tools have been developed with the aim of predicting the level and location of disorder within a protein. Currently, over 60 prediction servers exist, utilizing different methods for classifying disorder and different training sets. Here we review several well-performing, publicly available prediction methods, comparing their application and discussing how disorder prediction servers can be used to aid the experimental solution of protein structure. The use of disorder prediction methods allows us to adopt a more targeted approach to experimental studies by accurately identifying the boundaries of ordered protein domains so that they may be investigated separately, thereby increasing the likelihood of their successful experimental solution.
BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins

Directory of Open Access Journals (Sweden)

MuthuKrishnan Selvaraj

2016-01-01

The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL-encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM) models were developed for predicting HbL proteins based upon amino acid composition (AC), dipeptide composition (DC), a hybrid method (AC + DC), and position specific scoring matrix (PSSM). In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM) profiles. The average accuracy, standard deviation (SD), false positive rate (FPR), confusion matrix, and receiver operating characteristic (ROC) were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a promising predictor for the determination of HbL-related proteins. BacHbpred, a web tool, has been developed for HbL prediction.
Exploration of the omics evidence landscape: adding qualitative labels to predicted protein-protein interactions

NARCIS (Netherlands)

Noort, V. van; Snel, B.; Huynen, M.A.

2007-01-01

ABSTRACT: BACKGROUND: In the post-genomic era various functional genomics, proteomics and computational techniques have been developed to elucidate the protein interaction network. While some of these techniques are specific for a certain type of interaction, most predict a mixture of interactions.
Prediction of flexible/rigid regions from protein sequences using k-spaced amino acid pairs

Directory of Open Access Journals (Sweden)

Ruan Jishou

2007-04-01

Background: Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role with respect to the protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and the 3D structure prediction. Results: The flexible/rigid regions were defined based on a dataset, which includes protein sequences that have multiple experimental structures, and which was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented based on feature sets that were proposed in prior research, such as PSI-BLAST profiles, composition vector and binary sequence encoding, and a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine learning methods for the prediction of flexible/rigid regions and two recently proposed methods for the prediction of conformational changes and unstructured regions were compared with the proposed method. The FlexRP method, which applies Logistic Regression and collocation-based representation with 95 features, obtained 79.5% accuracy. The two runner-up methods, which apply the same sequence representation and Support Vector Machines (SVM) and Naïve Bayes classifiers, obtained 79.2% and 78.4% accuracy, respectively. The remaining considered methods are
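The k-spaced amino acid pair representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a "k-spaced pair" means two residues separated by exactly k intervening positions (so k = 0 gives ordinary dipeptides) and that frequencies are normalized by the number of pairs counted; the function name and these conventions are illustrative assumptions, not taken from the FlexRP paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def k_spaced_pair_frequencies(sequence, k):
    """Frequency of each residue pair separated by k intervening positions.

    Hypothetical helper: the normalization (divide by the number of counted
    pairs) is an assumption made for this sketch.
    """
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in counts}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    features = k_spaced_pair_frequencies("ACAC", 1)
    print(features["AA"], features["CC"])
```

Each value of k yields 400 features, which is why a feature selection step such as the one described in the abstract is useful before training a classifier.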
Detrended cross-correlation coefficient: Application to predict apoptosis protein subcellular localization.

Science.gov (United States)

Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

2016-12-01

Apoptosis, or programmed cell death, plays a central role in the development and homeostasis of an organism. Obtaining information on the subcellular location of apoptosis proteins is very helpful for understanding the apoptosis mechanism. The prediction of subcellular localization of an apoptosis protein is still a challenging task, and existing methods are mainly based on protein primary sequences. In this paper, we introduce a new position-specific scoring matrix (PSSM)-based method by using the detrended cross-correlation (DCCA) coefficient of non-overlapping windows. Then a 190-dimensional (190D) feature vector is constructed on two widely used datasets, CL317 and ZD98, and a support vector machine is adopted as the classifier. To evaluate the proposed method, objective and rigorous jackknife cross-validation tests are performed on the two datasets. The results show that our approach offers a novel and reliable PSSM-based tool for prediction of apoptosis protein subcellular localization. Copyright © 2016 Elsevier Inc. All rights reserved.
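A detrended cross-correlation coefficient over non-overlapping windows can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: it integrates each series into a profile, removes a linear trend inside each non-overlapping window, and divides the detrended covariance by the product of the detrended standard deviations. The function names and the choice of a linear trend are hypothetical.

```python
def _profile(series):
    """Cumulative sum of deviations from the mean (the integrated profile)."""
    mean = sum(series) / len(series)
    out, running = [], 0.0
    for value in series:
        running += value - mean
        out.append(running)
    return out


def _linear_residuals(window):
    """Residuals of a least-squares straight-line fit to one window."""
    n = len(window)
    t_mean = (n - 1) / 2
    y_mean = sum(window) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(window)]


def dcca_coefficient(x, y, window):
    """DCCA coefficient of two equal-length series (window must be >= 3)."""
    px, py = _profile(x), _profile(y)
    f_xy = f_xx = f_yy = 0.0
    for start in range(0, len(px) - window + 1, window):
        rx = _linear_residuals(px[start:start + window])
        ry = _linear_residuals(py[start:start + window])
        f_xy += sum(a * b for a, b in zip(rx, ry))
        f_xx += sum(a * a for a in rx)
        f_yy += sum(b * b for b in ry)
    # Common normalization factors cancel in the ratio.
    return f_xy / (f_xx * f_yy) ** 0.5


if __name__ == "__main__":
    col_a = [1, 3, 2, 5, 4, 6, 2, 8]  # made-up stand-ins for two PSSM columns
    col_b = [2, 2, 4, 4, 5, 7, 3, 9]
    print(dcca_coefficient(col_a, col_b, window=4))
```

Applied to every unordered pair of the 20 PSSM columns, such a coefficient would give 20 × 19 / 2 = 190 values, which matches the dimension of the feature vector, although the abstract does not spell out this construction.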
Contingency Table Browser - prediction of early stage protein structure.

Science.gov (United States)

Kalinowska, Barbara; Krzykalski, Artur; Roterman, Irena

2015-01-01

The Early Stage (ES) intermediate represents the starting structure in protein folding simulations based on the Fuzzy Oil Drop (FOD) model. The accuracy of FOD predictions is greatly dependent on the accuracy of the chosen intermediate. A suitable intermediate can be constructed using the sequence-structure relationship information contained in the so-called contingency table; this table expresses the likelihood of encountering various structural motifs for each tetrapeptide fragment in the amino acid sequence. The limited accuracy with which such structures could previously be predicted provided the motivation for a more in-depth study of the contingency table itself. The Contingency Table Browser is a tool which can visualize, search and analyze the table. Our work presents possible applications of the Contingency Table Browser, among them the analysis of specific protein sequences from the point of view of their structural ambiguity.
Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools.

Directory of Open Access Journals (Sweden)

Lei Jia

Thermostability of protein point mutations is a common issue in protein engineering. An application which predicts the thermostability of mutants can be helpful for guiding the decision making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find "hot spots" in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database that consists of thousands of protein mutants' experimentally measured thermostability. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and melting temperature change (dTm), were obtained from this database. Folding free energy change calculations from Rosetta, structural information of the point mutations, and amino acid physical properties were obtained for building thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression are used for building the prediction models. Binary and ternary classifications as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods were discussed. The Rosetta-calculated folding free energy change ranked as the most influential feature in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models.
Prediction of protein subcellular localization using support vector machine with the choice of proper kernel

Directory of Open Access Journals (Sweden)

Al Mehedi Hasan

2017-07-01

The prediction of subcellular locations of proteins can provide useful hints for revealing their functions as well as for understanding the mechanisms of some diseases and, finally, for developing novel drugs. As the number of newly discovered proteins has been growing exponentially, laboratory-based experiments to determine the location of an uncharacterized protein in a living cell have become both expensive and time-consuming. Consequently, to tackle these challenges, computational methods are being developed as an alternative to help biologists in selecting target proteins and designing related experiments. However, protein subcellular localization prediction remains a complicated and challenging problem, particularly when query proteins have multi-label characteristics, i.e. they exist simultaneously in more than one subcellular location, or they move between two or more different subcellular locations. To address this problem, several types of subcellular localization prediction methods with different levels of accuracy have been proposed. The support vector machine (SVM) has been employed to provide potential solutions for problems connected with the prediction of protein subcellular localization. However, the practicability of SVM is affected by difficulties in selecting its appropriate kernel as well as in selecting the parameters of that selected kernel. The literature survey has shown that most researchers apply the radial basis function (RBF) kernel to build an SVM-based subcellular localization prediction system. Surprisingly, there are still many other kernel functions which have not yet been applied in the prediction of protein subcellular localization. However, the nature of this classification problem requires the application of different kernels for SVM to ensure an optimal result. From this viewpoint, this paper presents the work to apply different kernels for SVM in protein
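To make the kernel choice discussed in this abstract concrete, the sketch below writes out three standard kernel functions (linear, polynomial, RBF) and builds the Gram matrix an SVM would consume. The parameter defaults and the toy feature vectors are hypothetical; the abstract does not state which kernels or parameter values the authors compared.

```python
import math


def linear_kernel(u, v):
    """Plain dot product."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, degree=3, coef0=1.0):
    """(u . v + coef0) ** degree; degree and coef0 are tunable parameters."""
    return (linear_kernel(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    """exp(-gamma * ||u - v||^2); gamma controls the kernel width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of feature vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Made-up 3-dimensional feature vectors standing in for protein descriptors.
    proteins = [[0.1, 0.4, 0.5], [0.2, 0.2, 0.6], [0.7, 0.1, 0.2]]
    for name, kernel in [("linear", linear_kernel),
                         ("polynomial", polynomial_kernel),
                         ("rbf", rbf_kernel)]:
        print(name, gram_matrix(proteins, kernel)[0])
```

In practice both the kernel and its parameters (for example gamma for the RBF kernel) would be selected by cross-validation, which is the difficulty the abstract points to.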
MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction

Directory of Open Access Journals (Sweden)

Kohlbacher Oliver

2009-09-01

Background: Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy. Results: We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations. Conclusion: MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 yields higher accuracies compared to its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a user-friendly and free web service at: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.
A sparse autoencoder-based deep neural network for protein solvent accessibility and contact number prediction.

Science.gov (United States)

Deng, Lei; Fan, Chao; Zeng, Zhiwen

2017-12-28

Direct prediction of the three-dimensional (3D) structures of proteins from one-dimensional (1D) sequences is a challenging problem. Significant structural characteristics such as solvent accessibility and contact number are essential for deriving restraints in modeling protein folding and protein 3D structure. Thus, accurately predicting these features is a critical step for 3D protein structure building. In this study, we present DeepSacon, a computational method that can effectively predict protein solvent accessibility and contact number by using a deep neural network, which is built on a stacked autoencoder and a dropout method. The results demonstrate that our proposed DeepSacon achieves a significant improvement in prediction quality compared with the state-of-the-art methods. We obtain 0.70 three-state accuracy for solvent accessibility, and 0.33 15-state accuracy and 0.74 Pearson Correlation Coefficient (PCC) for the contact number, on the 5729 monomeric soluble globular protein dataset. We also evaluate the performance on the CASP11 benchmark dataset, where DeepSacon achieves 0.68 three-state accuracy and 0.69 PCC for solvent accessibility and contact number, respectively. We have shown that DeepSacon can reliably predict solvent accessibility and contact number with a stacked sparse autoencoder and a dropout approach.
ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network.

Science.gov (United States)

Cao, Renzhi; Freitas, Colton; Chan, Leong; Sun, Miao; Jiang, Haiqing; Chen, Zhangxin

2017-10-17

With the development of next generation sequencing techniques, it is fast and cheap to determine protein sequences but relatively slow and expensive to extract useful information from protein sequences because of limitations of traditional biological experimental techniques. Protein function prediction has been a long-standing challenge in filling the gap between the huge number of protein sequences and the known functions. In this paper, we propose a novel method that converts the protein function problem into a language translation problem, from the newly proposed protein sequence language "ProLan" to the protein function language "GOLan", and build a neural machine translation model based on recurrent neural networks to translate the "ProLan" language to the "GOLan" language. We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated the performance of our method on selected proteins whose function was released after the CAFA competition. The good performance on the training and testing datasets demonstrates that our newly proposed method is a promising direction for protein function prediction. In summary, we propose for the first time a method which converts the protein function prediction problem to a language translation problem and applies a neural machine translation model for protein function prediction.
Predicting binding affinities of protein ligands from three-dimensional models: application to peptide binding to class I major histocompatibility proteins

DEFF Research Database (Denmark)

Rognan, D; Lauemoller, S L; Holm, A

1999-01-01

A simple and fast free energy scoring function (Fresno) has been developed to predict the binding free energy of peptides to class I major histocompatibility (MHC) proteins. It differs from existing scoring functions mainly by the explicit treatment of ligand desolvation and of unfavorable protein...... coordinates of the MHC-bound peptide have first been determined with an accuracy of about 1-1.5 A. Furthermore, it may be easily recalibrated for any protein-ligand complex.......) and of a series of 16 peptides to H-2K(k). Predictions were more accurate for HLA-A2-binding peptides as the training set had been built from experimentally determined structures. The average error in predicting the binding free energy of the test peptides was 3.1 kJ/mol. For the homology model-derived equation...
osFP: a web server for predicting the oligomeric states of fluorescent proteins.

Science.gov (United States)

Simeon, Saw; Shoombuatong, Watshara; Anuwongcharoen, Nuttapat; Preeyanon, Likit; Prachayasittikul, Virapong; Wikberg, Jarl E S; Nantasenamat, Chanin

2016-01-01

Currently, monomeric fluorescent proteins (FP) are ideal markers for protein tagging. The prediction of oligomeric states is helpful for enhancing live biomedical imaging. Computational prediction of FP oligomeric states can accelerate protein engineering efforts to create monomeric FPs. To the best of our knowledge, this study represents the first computational model for predicting and analyzing FP oligomerization directly from the amino acid sequence. After data curation, an exhaustive data set consisting of 397 non-redundant FP oligomeric states was compiled from the literature. Results from benchmarking of the protein descriptors revealed that the model built with amino acid composition descriptors was the top performing model, with accuracy, sensitivity and specificity in excess of 80% and MCC greater than 0.6 for all three data subsets (i.e. training, tenfold cross-validation and external sets). The model provided insights on the important residues governing the oligomerization of FP. To maximize the benefit of the generated predictive model, it was implemented as a web server under the R programming environment. osFP affords a user-friendly interface that can be used to predict the oligomeric state of FP using the protein sequence. The advantage of osFP is that it is platform-independent, meaning that it can be accessed via a web browser on any operating system and device. osFP is freely accessible at http://codes.bio/osfp/ while the source code and data set are provided on GitHub at https://github.com/chaninn/osFP/.
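The four performance measures quoted in this abstract (accuracy, sensitivity, specificity and the Matthews correlation coefficient, MCC) all derive from a binary confusion matrix. The sketch below uses the standard definitions; the counts in the usage example are invented for illustration and are not results from the osFP study.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, e.g. monomer (positive) vs oligomer (negative).
    print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```

Unlike accuracy, MCC stays informative when the two classes are imbalanced, which is one reason it is often reported alongside the other three measures.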

Graphical analysis of pH-dependent properties of proteins predicted using PROPKA.

Science.gov (United States)

Rostkowski, Michał; Olsson, Mats H M; Søndergaard, Chresten R; Jensen, Jan H

2011-01-26

Charge states of ionizable residues in proteins determine their pH-dependent properties through their pKa values. Thus, various theoretical methods to determine ionization constants of residues in biological systems have been developed. One of the more widely used approaches for predicting pKa values in proteins is the PROPKA program, which provides convenient structural rationalization of the predicted pKa values without any additional calculations. The PROPKA Graphical User Interface (GUI) is a new tool for studying the pH-dependent properties of proteins such as charge and stabilization energy. It facilitates a quantitative analysis of pKa values of ionizable residues together with their structural determinants by providing a direct link between the pKa data, predicted by the PROPKA calculations, and the structure via the Visual Molecular Dynamics (VMD) program. The GUI also calculates contributions to the pH-dependent unfolding free energy at a given pH for each ionizable group in the protein. Moreover, the PROPKA-computed pKa values or energy contributions of the ionizable residues in question can be displayed interactively. The PROPKA GUI can also be used for comparing pH-dependent properties of more than one structure at the same time. The GUI considerably extends the analysis and validation possibilities of the PROPKA approach. The PROPKA GUI can conveniently be used to investigate ionizable groups, and their interactions, of residues with significantly perturbed pKa values or residues that contribute to the stabilization energy the most. Charge-dependent properties can be studied either for a single protein or simultaneously with other homologous structures, which makes it a helpful tool, for instance, in protein design studies or structure-based function predictions. The GUI is implemented as a Tcl/Tk plug-in for VMD, and can be obtained online at http://propka.ki.ku.dk/~luca/wiki/index.php/GUI_Web.
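The link between pKa values and pH-dependent charge described in this abstract can be illustrated with the Henderson-Hasselbalch relation. The sketch assumes independently titrating groups and takes a list of (pKa, kind) pairs as input; it does not reproduce PROPKA's own calculations or file formats, and the pKa values in the usage example are invented.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group that is protonated at a pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


def net_charge(groups, ph):
    """Net charge of independent ionizable groups.

    groups: iterable of (pka, kind), where kind is "acid" (neutral when
    protonated, -1 when deprotonated) or "base" (+1 when protonated).
    """
    charge = 0.0
    for pka, kind in groups:
        protonated = fraction_protonated(pka, ph)
        if kind == "base":
            charge += protonated
        else:
            charge -= 1.0 - protonated
    return charge


if __name__ == "__main__":
    # Hypothetical pKa values for a few side chains and the two termini.
    groups = [(3.9, "acid"), (4.3, "acid"), (6.5, "base"),
              (10.4, "base"), (12.0, "base"), (3.3, "acid"), (8.0, "base")]
    for ph in (2.0, 7.0, 12.0):
        print(ph, round(net_charge(groups, ph), 2))
```

Replacing model pKa values with structure-specific predicted ones shifts the whole charge-versus-pH curve, which is the kind of property the GUI is described as displaying.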
Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence

DEFF Research Database (Denmark)

Blom, Nikolaj; Sicheritz-Pontén, Thomas; Gupta, Ramneek

2004-01-01

Post-translational modifications (PTMs) occur on almost all proteins analyzed to date. The function of a modified protein is often strongly affected by these modifications and therefore increased knowledge about the potential PTMs of a target protein may increase our understanding of the molecular...... steps by integrating computational approaches into the validation procedures. Many advanced methods for the prediction of PTMs exist and many are made publicly available. We describe our experiences with the development of prediction methods for phosphorylation and glycosylation sites...... and the development of PTM-specific databases. In addition, we discuss novel ideas for PTM visualization (exemplified by kinase landscapes) and improvements for prediction specificity (by using ESS-evolutionary stable sites). As an example, we present a new method for kinase-specific prediction of phosphorylation...
Predicting the binding patterns of hub proteins: a study using yeast protein interaction networks.

Directory of Open Access Journals (Sweden)

Carson M Andorf

Protein-protein interactions are critical to elucidating the role played by individual proteins in important biological pathways. Of particular interest are hub proteins that can interact with large numbers of partners and often play essential roles in cellular control. Depending on the number of binding sites, protein hubs can be classified at a structural level as singlish-interface hubs (SIH) with one or two binding sites, or multiple-interface hubs (MIH) with three or more binding sites. In terms of kinetics, hub proteins can be classified as date hubs (i.e., interact with different partners at different times or locations) or party hubs (i.e., simultaneously interact with multiple partners). Our approach works in 3 phases: Phase I classifies if a protein is likely to bind with another protein. Phase II determines if a protein-binding (PB) protein is a hub. Phase III classifies PB proteins as singlish-interface versus multiple-interface hubs and date versus party hubs. At each stage, we use sequence-based predictors trained using several standard machine learning techniques. Our method is able to predict whether a protein is a protein-binding protein with an accuracy of 94% and a correlation coefficient of 0.87; identify hubs from non-hubs with 100% accuracy for 30% of the data; distinguish date hubs/party hubs with 69% accuracy and area under ROC curve of 0.68; and SIH/MIH with 89% accuracy and area under ROC curve of 0.84. Because our method is based on sequence information alone, it can be used even in settings where reliable protein-protein interaction data or structures of protein-protein complexes are unavailable to obtain useful insights into the functional and evolutionary characteristics of proteins and their interactions. We provide a web server for our three-phase approach: http://hybsvm.gdcb.iastate.edu.
Protein requirements for Blue-fronted Amazon (Amazona aestiva) growth.

Science.gov (United States)

Carciofi, A C; Sanfilippo, L F; de-Oliveira, L D; do Amaral, P P; Prada, F

2008-06-01

The objective of this study was to evaluate the protein requirements for hand-rearing Blue-fronted Amazon parrots (Amazona aestiva). Forty hatchlings were fed semi-purified diets containing one of four (as-fed basis) protein levels: 13%, 18%, 23% and 28%. The experiment was carried out in a randomized block design with the initial weight of the nestling as the blocking factor and 10 parrots per protein level. Regression analysis was used to determine relationships between protein level and biometric measurements. The data indicated that 13% crude protein supported nestling growth with 18% being the minimum tested level required for maximum development. The optimal protein concentration for maximum weight gain was 24.4% (p = 0.08; r(2) = 0.25), tail length 23.7% (p = 0.09; r(2) = 0.19), wing length 23.0% (p = 0.07; r(2) = 0.17), tarsus length 21.3% (p = 0.06; r(2) = 0.10) and tarsus width 21.4% (p = 0.07; r(2) = 0.09). Tarsus measurements were larger in males (p < 0.05), indicating that sex must be considered when studying developing psittacines. These results were obtained using a highly digestible protein and a diet with moderate metabolizable energy levels.
3dRPC: a web server for 3D RNA-protein structure prediction.

Science.gov (United States)

Huang, Yangyu; Li, Haotian; Xiao, Yi

2018-04-01

RNA-protein interactions occur in many biological processes. To understand the mechanism of these interactions one needs to know three-dimensional (3D) structures of RNA-protein complexes. 3dRPC is an algorithm for prediction of 3D RNA-protein complex structures and consists of a docking algorithm RPDOCK and a scoring function 3dRPC-Score. RPDOCK is used to sample possible complex conformations of an RNA and a protein by calculating the geometric and electrostatic complementarities and stacking interactions at the RNA-protein interface according to the features of atom packing of the interface. 3dRPC-Score is a knowledge-based potential that uses the conformations of nucleotide-amino-acid pairs as statistical variables and that is used to choose the near-native complex-conformations obtained from the docking method above. Recently, we built a web server for 3dRPC. The users can easily use 3dRPC without installing it locally. RNA and protein structures in PDB (Protein Data Bank) format are the only needed input files. It can also incorporate the information of interface residues or residue-pairs obtained from experiments or theoretical predictions to improve the prediction. The address of 3dRPC web server is http://biophy.hust.edu.cn/3dRPC. Contact: yxiao@hust.edu.cn.
A novel Multi-Agent Ada-Boost algorithm for predicting protein structural class with the information of protein secondary structure.

Science.gov (United States)

Fan, Ming; Zheng, Bin; Li, Lihua

2015-10-01

Knowledge of the structural class of a given protein is important for understanding its folding patterns. Although much effort has been made, predicting protein structural class solely from protein sequences remains a challenging problem. Feature extraction and classification are the main problems in prediction. In this research, we extended our earlier work regarding these two aspects. For protein feature extraction, we proposed a scheme that calculates the word frequency and word position from sequences of amino acids, reduced amino acids, and secondary structure. For accurate classification of the structural class of a protein, we developed a novel Multi-Agent Ada-Boost (MA-Ada) method by integrating the features of a Multi-Agent system into the Ada-Boost algorithm. Extensive experiments were carried out to test and compare the proposed method using four low-homology benchmark datasets. The results showed classification accuracies of 88.5%, 96.0%, 88.4%, and 85.5%, respectively, which are much better than those of the existing methods. The source code and dataset are available on request.
Topology and weights in a protein domain interaction network – a novel way to predict protein interactions

Directory of Open Access Journals (Sweden)

Wuchty Stefan

2006-05-01

Background: While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. Results: We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins if we find a high scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Conclusion: Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and weights of domain interactions as separate entities. In particular, these findings confirm earlier observations that a protein interaction is a matter of a single interaction event on domain level. As an immediate application, we
Variability in amino acid digestibility and metabolizable energy of corn studied in cecectomized laying hens.

Science.gov (United States)

Zuber, T; Rodehutscord, M

2017-06-01

To optimize the use of corn grain in diets for laying hens, differences in amino acid (AA) digestibility and metabolizable energy among different corn samples should be considered in feed formulation. The present study investigated the variability of AA digestibility and AMEn concentration of 20 corn samples in cecectomized laying hens. Corn grains were characterized based on their physical properties (thousand seed weight, test weight, grain density, and extract viscoelasticity), chemical composition (proximate nutrients, AA, minerals, and inositol phosphates), gross energy concentration, and in vitro solubility of nitrogen to study any relationship with AA digestibility or AMEn. The animal study comprised 4 Latin squares (6 × 6) distributed between 2 subsequent runs. Cecectomized LSL-Classic hens were individually housed in metabolism cages and fed either a basal diet containing 500 g/kg cornstarch or one of 20 corn diets, each replacing the cornstarch with one corn batch, for 8 days. During the last 4 d, feed intake was recorded and excreta were collected quantitatively. A linear regression approach was used to calculate AA digestibility of the corn. The digestibility of all AA differed significantly between the 20 corn batches, including Lys (digestibility range 64 to 85%), Met (86 to 94%), Thr (72 to 89%), and Trp (21 to 88%). The AMEn of the corn batches ranged between 15.7 and 17.1 MJ/kg DM. However, consistent correlations between AA digestibility or AMEn and the physical and chemical characteristics of the grains were not detected. Equations to predict AA digestibility or AMEn based on the grain's physical and chemical characteristics were calculated by multiple linear regressions. The explanatory power (adjusted R²) of prediction equations was below 0.6 for the majority of AA and AMEn, and, thus, was not sufficiently precise for practical use. Possible explanations for the variation in AA digestibility and AMEn beyond the determined characteristics
A Physiologically Based Pharmacokinetic Model to Predict the Pharmacokinetics of Highly Protein-Bound Drugs and Impact of Errors in Plasma Protein Binding

Science.gov (United States)

Ye, Min; Nagar, Swati; Korzekwa, Ken

2015-01-01

Predicting the pharmacokinetics of highly protein-bound drugs is difficult. Also, since historical plasma protein binding data was often collected using unbuffered plasma, the resulting inaccurate binding data could contribute to incorrect predictions. This study uses a generic physiologically based pharmacokinetic (PBPK) model to predict human plasma concentration-time profiles for 22 highly protein-bound drugs. Tissue distribution was estimated from in vitro drug lipophilicity data, plasma protein binding, and blood:plasma ratio. Clearance was predicted with a well-stirred liver model. Underestimated hepatic clearance for acidic and neutral compounds was corrected by an empirical scaling factor. Predicted values (pharmacokinetic parameters, plasma concentration-time profile) were compared with observed data to evaluate model accuracy. Of the 22 drugs, less than a 2-fold error was obtained for terminal elimination half-life (t1/2, 100% of drugs), peak plasma concentration (Cmax, 100%), area under the plasma concentration-time curve (AUC0–t, 95.4%), clearance (CLh, 95.4%), mean retention time (MRT, 95.4%), and steady state volume (Vss, 90.9%). The impact of fup errors on CLh and Vss prediction was evaluated. Errors in fup resulted in proportional errors in clearance prediction for low-clearance compounds, and in Vss prediction for high-volume neutral drugs. For high-volume basic drugs, errors in fup did not propagate to errors in Vss prediction. This is due to the cancellation of errors in the calculations for tissue partitioning of basic drugs. Overall, plasma profiles were well simulated with the present PBPK model. PMID:26531057
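The link between fup errors and clearance errors reported above can be illustrated with the textbook form of the well-stirred liver model, CLh = Qh · fup · CLint / (Qh + fup · CLint). The sketch below is not the authors' PBPK code: the function name, the hepatic blood flow of 90 L/h, and the intrinsic clearance values are illustrative assumptions, and the blood:plasma ratio and the empirical scaling factor mentioned in the abstract are omitted.

```python
def well_stirred_clearance(q_h, fup, cl_int):
    """Hepatic clearance from the textbook well-stirred liver model.

    q_h    : hepatic blood flow (L/h), assumed value supplied by the caller
    fup    : fraction unbound in plasma (0..1)
    cl_int : intrinsic clearance (L/h)
    """
    return q_h * fup * cl_int / (q_h + fup * cl_int)


Q_H = 90.0  # illustrative hepatic blood flow, not taken from the study

# Low-clearance compound: fup * cl_int is much smaller than q_h,
# so clearance is roughly fup * cl_int and a 2-fold fup error
# gives a nearly 2-fold clearance error.
low_ratio = (well_stirred_clearance(Q_H, 0.02, 10.0)
             / well_stirred_clearance(Q_H, 0.01, 10.0))

# High-clearance compound: fup * cl_int is much larger than q_h,
# so clearance approaches q_h and is insensitive to fup.
high_ratio = (well_stirred_clearance(Q_H, 0.02, 1.0e6)
              / well_stirred_clearance(Q_H, 0.01, 1.0e6))

print(f"2-fold fup error, low clearance:  {low_ratio:.3f}-fold CLh change")
print(f"2-fold fup error, high clearance: {high_ratio:.3f}-fold CLh change")
```

Under these assumed values the low-clearance case behaves almost proportionally, which matches the direction of the finding stated in the abstract; the exact magnitudes depend entirely on the chosen parameters.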
The predictive nature of transcript expression levels on protein expression in adult human brain.

Science.gov (United States)

Bauernfeind, Amy L; Babbitt, Courtney C

2017-04-24

Next generation sequencing methods are the gold standard for evaluating expression of the transcriptome. When determining the biological implications of such studies, the assumption is often made that transcript expression levels correspond to protein levels in a meaningful way. However, the strength of the overall correlation between transcript and protein expression is inconsistent, particularly in brain samples. Following high-throughput transcriptomic (RNA-Seq) and proteomic (liquid chromatography coupled with tandem mass spectrometry) analyses of adult human brain samples, we compared the correlation in the expression of transcripts and proteins that support various biological processes, molecular functions, and that are located in different areas of the cell. Although most categories of transcripts have extremely weak predictive value for the expression of their associated proteins (R² values of < 10%), transcripts coding for protein kinases and membrane-associated proteins, including those that are part of receptors or ion transporters, are among those that are most predictive of downstream protein expression levels. The predictive value of transcript expression for corresponding proteins is variable in human brain samples, reflecting the complex regulation of protein expression. However, we found that transcriptomic analyses are appropriate for assessing the expression levels of certain classes of proteins, including those that modify proteins, such as kinases and phosphatases, regulate metabolic and synaptic activity, or are associated with a cellular membrane. These findings can be used to guide the interpretation of gene expression results from primate brain samples.
Critical assessment of methods of protein structure prediction (CASP)-round IX

KAUST Repository

Moult, John; Fidelis, Krzysztof; Kryshtafovych, Andriy; Tramontano, Anna

2011-01-01

This article is an introduction to the special issue of the journal PROTEINS, dedicated to the ninth Critical Assessment of Structure Prediction (CASP) experiment to assess the state of the art in protein structure modeling. The article describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. Methods for modeling protein structure continue to advance, although at a more modest pace than in the early CASP experiments. CASP developments of note are indications of improvement in model accuracy for some classes of target, an improved ability to choose the most accurate of a set of generated models, and evidence of improvement in accuracy for short "new fold" models. In addition, a new analysis of regions of models not derivable from the most obvious template structure has revealed better performance than expected.
NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

Science.gov (United States)

Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

2010-01-01

β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17 – 0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences. PMID:21152409
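The two-class measures quoted above (MCC, Qtotal, sensitivity, PPV) all derive from a confusion matrix of per-residue predictions. A minimal sketch of the standard definitions follows; the function name and the example counts are hypothetical and are not data from the NetTurnP study.

```python
import math


def two_class_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts.

    tp/fp/tn/fn are counts of residues predicted as turn or not-turn.
    Returns a dict with MCC, Qtotal (overall accuracy), sensitivity and PPV.
    """
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        # MCC is conventionally reported as 0 when any marginal sum is zero
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
        "q_total": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "ppv": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical counts, chosen only to exercise the formulas
example = two_class_metrics(tp=60, fp=20, tn=100, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC is useful here because β-turn residues are a minority class (about a quarter of residues per the abstract), so overall accuracy alone can look high for a predictor that rarely predicts turns.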
DNCON2: improved protein contact prediction using two-level deep convolutional neural networks.

Science.gov (United States)

Adhikari, Badri; Hou, Jie; Cheng, Jianlin

2018-05-01

Significant improvements in the prediction of protein residue-residue contacts have been observed in recent years. These contacts, predicted using a variety of coevolution-based and machine learning methods, are the key contributors to the recent progress in ab initio protein structure prediction, as demonstrated in the recent CASP experiments. Continuing the development of new methods to reliably predict contact maps is essential to further improve ab initio structure prediction. In this paper we discuss DNCON2, an improved protein contact map predictor based on two-level deep convolutional neural networks. It consists of six convolutional neural networks: the first five predict contacts at 6, 7.5, 8, 8.5 and 10 Å distance thresholds, and the last one uses these five predictions as additional features to predict final contact maps. On the free-modeling datasets in CASP10, 11 and 12 experiments, DNCON2 achieves mean precisions of 35, 50 and 53.4%, respectively, higher than 30.6% by MetaPSICOV on CASP10 dataset, 34% by MetaPSICOV on CASP11 dataset and 46.3% by Raptor-X on CASP12 dataset, when top L/5 long-range contacts are evaluated. We attribute the improved performance of DNCON2 to the inclusion of short- and medium-range contacts into training, two-level approach to prediction, use of the state-of-the-art optimization and activation functions, and a novel deep learning architecture that allows each filter in a convolutional layer to access all the input features of a protein of arbitrary length. The web server of DNCON2 is at http://sysbio.rnet.missouri.edu/dncon2/ where training and testing datasets as well as the predictions for CASP10, 11 and 12 free-modeling datasets can also be downloaded. Its source code is available at https://github.com/multicom-toolbox/DNCON2/. Supplementary data are available at Bioinformatics online.
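The "top L/5 long-range" precision used in the comparison above can be sketched as follows. This is not DNCON2's evaluation script: the function name and the input format are hypothetical, and the sequence-separation cutoff of 24 residues for "long-range" is the convention commonly used in CASP-style evaluations, assumed here rather than stated in the abstract.

```python
def top_l_over_k_precision(predictions, true_contacts, length, k=5, min_sep=24):
    """Precision of the top L/k highest-scoring long-range predicted contacts.

    predictions   : iterable of (i, j, score) residue-pair predictions
    true_contacts : set of (i, j) pairs with i < j that are real contacts
    length        : sequence length L
    min_sep       : minimum |i - j| for a pair to count as long-range (assumed)
    """
    long_range = [(i, j, s) for i, j, s in predictions if abs(i - j) >= min_sep]
    long_range.sort(key=lambda pair: pair[2], reverse=True)
    top = long_range[: max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(top)


# Toy example with a shortened separation cutoff so it fits a 10-residue chain
toy_predictions = [(0, 5, 0.9), (1, 6, 0.8), (2, 3, 0.99), (0, 9, 0.1)]
toy_truth = {(0, 5)}
toy_precision = top_l_over_k_precision(
    toy_predictions, toy_truth, length=10, k=5, min_sep=3
)
print(toy_precision)
```

In the toy call, L/5 is 2, the pair (2, 3) is excluded as too close in sequence, and one of the two top-ranked long-range pairs is a true contact.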
Evaluating a variety of text-mined features for automatic protein function prediction with GOstruct.

Science.gov (United States)

Funk, Christopher S; Kahanda, Indika; Ben-Hur, Asa; Verspoor, Karin M

2015-01-01

Most computational methods that predict protein function do not take advantage of the large amount of information contained in the biomedical literature. In this work we evaluate both ontology term co-mention and bag-of-words features mined from the biomedical literature and analyze their impact in the context of a structured output support vector machine model, GOstruct. We find that even simple literature based features are useful for predicting human protein function (F-max: Molecular Function =0.408, Biological Process =0.461, Cellular Component =0.608). One advantage of using literature features is their ability to offer easy verification of automated predictions. We find through manual inspection of misclassifications that some false positive predictions could be biologically valid predictions based upon support extracted from the literature. Additionally, we present a "medium-throughput" pipeline that was used to annotate a large subset of co-mentions; we suggest that this strategy could help to speed up the rate at which proteins are curated.
Prediction of host - pathogen protein interactions between Mycobacterium tuberculosis and Homo sapiens using sequence motifs.

Science.gov (United States)

Huo, Tong; Liu, Wei; Guo, Yu; Yang, Cheng; Lin, Jianping; Rao, Zihe

2015-03-26

Emergence of multiple drug resistant strains of M. tuberculosis (MDR-TB) threatens to derail global efforts aimed at reining in the pathogen. Co-infections of M. tuberculosis with HIV are difficult to treat. To counter these new challenges, it is essential to study the interactions between M. tuberculosis and the host to learn how these bacteria cause disease. We report a systematic flow to predict the host pathogen interactions (HPIs) between M. tuberculosis and Homo sapiens based on sequence motifs. First, protein sequences were used as initial input for identifying the HPIs by 'interolog' method. HPIs were further filtered by prediction of domain-domain interactions (DDIs). Functional annotations of protein and publicly available experimental results were applied to filter the remaining HPIs. Using such a strategy, 118 pairs of HPIs were identified, which involve 43 proteins from M. tuberculosis and 48 proteins from Homo sapiens. A biological interaction network between M. tuberculosis and Homo sapiens was then constructed using the predicted inter- and intra-species interactions based on the 118 pairs of HPIs. Finally, a web accessible database named PATH (Protein interactions of M. tuberculosis and Human) was constructed to store these predicted interactions and proteins. This interaction network will facilitate the research on host-pathogen protein-protein interactions, and may throw light on how M. tuberculosis interacts with its host.
DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

Science.gov (United States)

Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

2018-02-15

A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net, Source code: https://github.com/bio-ontology-research-group/deepgo. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Reduced Fragment Diversity for Alpha and Alpha-Beta Protein Structure Prediction using Rosetta.

Science.gov (United States)

Abbass, Jad; Nebel, Jean-Christophe

2017-01-01

Protein structure prediction is considered a main challenge in computational biology. The biennial international competition, Critical Assessment of protein Structure Prediction (CASP), has shown in its eleventh experiment that free modelling target predictions are still beyond reliable accuracy; therefore, much effort should be made to improve ab initio methods. Arguably, Rosetta is considered the most competitive method when it comes to targets with no homologues. Relying on fragments of length 9 and 3 from known structures, Rosetta creates putative structures by assembling candidate fragments. Generally, the structure with the lowest energy score, also known as first model, is chosen to be the "predicted one". A thorough study has been conducted on the role and diversity of 3-mers involved in Rosetta's model "refinement" phase. Usage of the standard number of 3-mers (i.e. 200) has been shown to degrade alpha and alpha-beta protein conformations initially achieved by assembling 9-mers. Therefore, a new prediction pipeline is proposed for Rosetta where the "refinement" phase is customised according to a target's structural class prediction. Over 8% improvement in terms of first model structure accuracy is reported for alpha and alpha-beta classes when decreasing the number of 3-mers. Copyright © Bentham Science Publishers.
PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

Science.gov (United States)

Hawkins, Troy; Chitale, Meghana; Luban, Stanislav; Kihara, Daisuke

2009-02-15

Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments. PFP is available as a web service at http
Predicting adverse drug reaction profiles by integrating protein interaction networks with drug structures.

Science.gov (United States)

Huang, Liang-Chin; Wu, Xiaogang; Chen, Jake Y

2013-01-01

The prediction of adverse drug reactions (ADRs) has become increasingly important, due to the rising concern on serious ADRs that can cause drugs to fail to reach or stay in the market. We proposed a framework for predicting ADR profiles by integrating protein-protein interaction (PPI) networks with drug structures. We compared ADR prediction performances over 18 ADR categories through four feature groups: only drug targets, drug targets with PPI networks, drug structures, and drug targets with PPI networks plus drug structures. The results showed that the integration of PPI networks and drug structures can significantly improve the ADR prediction performance. The median AUC values for the four groups were 0.59, 0.61, 0.65, and 0.70. We used the protein features in the best two models, "Cardiac disorders" (median-AUC: 0.82) and "Psychiatric disorders" (median-AUC: 0.76), to build ADR-specific PPI networks with literature supports. For validation, we examined 30 drugs withdrawn from the U.S. market to see if our approach can predict their ADR profiles and explain why they were withdrawn. Except for three drugs having ADRs in the categories we did not predict, 25 out of 27 withdrawn drugs (92.6%) having severe ADRs were successfully predicted by our approach. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Prediction of endoplasmic reticulum resident proteins using fragmented amino acid composition and support vector machine

Directory of Open Access Journals (Sweden)

Ravindra Kumar

2017-09-01

Background: The endoplasmic reticulum plays an important role in many cellular processes, including protein synthesis, folding and post-translational processing of newly synthesized proteins. It is also the site for quality control of misfolded proteins and the entry point of extracellular proteins to the secretory pathway. Hence at any given point of time, the endoplasmic reticulum contains two different cohorts of proteins: (i) proteins involved in endoplasmic reticulum-specific function, which reside in the lumen of the endoplasmic reticulum, called endoplasmic reticulum resident proteins, and (ii) proteins which are in the process of moving to the extracellular space. Thus, endoplasmic reticulum resident proteins must somehow be distinguished from newly synthesized secretory proteins, which pass through the endoplasmic reticulum on their way out of the cell. Only approximately 50% of the proteins used in this study as training data had an endoplasmic reticulum retention signal, which shows that these signals are not necessarily present in all endoplasmic reticulum resident proteins. This also strongly indicates the role of additional factors in retention of endoplasmic reticulum-specific proteins inside the endoplasmic reticulum. Methods: This is a support vector machine based method, where we used different forms of protein features as inputs for the support vector machine to develop the prediction models. During training, the leave-one-out approach of cross-validation was used. Maximum performance was obtained with a combination of amino acid compositions of different parts of proteins. Results: In this study, we have reported a novel support vector machine based method for predicting endoplasmic reticulum resident proteins, named ERPred. During training we achieved a maximum accuracy of 81.42% with the leave-one-out approach of cross-validation. When evaluated on an independent dataset, ERPred predicted with a sensitivity of 72.31% and specificity of 83

Predicting highly-connected hubs in protein interaction networks by QSAR and biological data descriptors

Science.gov (United States)

Hsing, Michael; Byler, Kendall; Cherkasov, Artem

2009-01-01

Hub proteins (those engaged in most physical interactions in a protein interaction network (PIN)) have recently gained much research interest due to their essential role in mediating cellular processes and their potential therapeutic value. It is straightforward to identify hubs if the underlying PIN is experimentally determined; however, theoretical hub prediction remains a very challenging task, as physicochemical properties that differentiate hubs from less connected proteins remain mostly uncharacterized. To adequately distinguish hubs from non-hub proteins we have utilized over 1300 protein descriptors, some of which represent QSAR (quantitative structure-activity relationship) parameters, and some reflect sequence-derived characteristics of proteins including domain composition and functional annotations. Those protein descriptors, together with available protein interaction data, have been processed by a machine learning method (boosting trees) and resulted in the development of hub classifiers that are capable of predicting highly interacting proteins for four model organisms: Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. More importantly, through the analyses of the most relevant protein descriptors, we are able to demonstrate that hub proteins not only share certain common physicochemical and structural characteristics that make them different from non-hub counterparts, but they also exhibit species-specific characteristics that should be taken into account when analyzing different PINs. The developed prediction models can be used for determining highly interacting proteins in the four studied species to assist future proteomics experiments and PIN analyses. Availability: The source code and executable program of the hub classifier are available for download at: http://www.cnbi2.ca/hub-analysis/ PMID:20198194
Analysis of deep learning methods for blind protein contact prediction in CASP12.

Science.gov (United States)

Wang, Sheng; Sun, Siqi; Xu, Jinbo

2018-03-01

Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationship including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strength and weakness of our method. © 2017 Wiley Periodicals, Inc.
Effects of shortening the close-up period length coupled with increased supply of metabolizable protein on performance and metabolic status of multiparous Holstein cows.

Science.gov (United States)

Farahani, T Amirabadi; Amanlou, H; Kazemi-Bonchenari, M

2017-08-01

This experiment was conducted to compare conventional (CON; 21 d) and shortened (SH; 10 d) close-up period, and evaluate the effect of shortened close-up period combined with feeding different metabolizable protein (MP) levels on dry matter (DM) intake, metabolic status, and performance of dairy cows. Forty-eight multiparous Holstein cows with similar parity, body weight (BW), and previous lactation milk yield were divided into 2 groups. The first group (n = 24) received the far-off diet from -60 to -21 d (CON), and the second group (n = 24) received the same far-off diet from -60 to -10 d (SH) relative to expected parturition. Cows were then moved to individual stalls and randomly allocated to 1 of 3 close-up diets: low MP diet (LMP; MP = 79 g/kg of DM), medium MP diet (MMP; MP = 101 g/kg of DM), or high MP diet (HMP; MP = 118 g/kg of DM). Treatments were used in a 2 × 3 factorial arrangement with 2 lengths of close-up period (CON and SH) and 3 levels of MP (LMP, MMP, and HMP). All diets were fed for ad libitum intake during the close-up period. After calving, all cows received the same fresh cow diet. We found no interaction between close-up period length and MP levels for traits, except for postpartum serum fatty acids and β-hydroxybutyrate (BHB). The concentrations of postpartum serum fatty acids and BHB were higher on LMP than MMP and HMP diets in the SH group. The cows of the SH group tended to produce less colostrum in the first milking than cows in the CON group. The length of close-up period did not affect pre- and postpartum DM intake or energy balance of cows during the last week of prepartum, but cows of the CON group had greater BW changes during the last 3 wk before parturition than cows in the SH group. Cows fed MMP and HMP diets consumed 1.2 and 1 kg more DM, respectively, than those fed LMP prepartum. The concentrations of prepartum BHB and Ca were higher for SH cows than CON group cows. Except for blood urea N concentration, no other blood metabolite in
Structural similarity-based predictions of protein interactions between HIV-1 and Homo sapiens

Directory of Open Access Journals (Sweden)

Gomez Shawn M

2010-04-01

Background: In the course of infection, viruses such as HIV-1 must enter a cell, travel to sites where they can hijack host machinery to transcribe their genes and translate their proteins, assemble, and then leave the cell again, all while evading the host immune system. Thus, successful infection depends on the pathogen's ability to manipulate the biological pathways and processes of the organism it infects. Interactions between HIV-encoded and human proteins provide one means by which HIV-1 can connect into cellular pathways to carry out these survival processes. Results: We developed and applied a computational approach to predict interactions between HIV and human proteins based on structural similarity of 9 HIV-1 proteins to human proteins having known interactions. Using functional data from RNAi studies as a filter, we generated over 2000 interaction predictions between HIV proteins and 406 unique human proteins. Additional filtering based on Gene Ontology cellular component annotation reduced the number of predictions to 502 interactions involving 137 human proteins. We find numerous known interactions as well as novel interactions showing significant functional relevance based on supporting Gene Ontology and literature evidence. Conclusions: Understanding the interplay between HIV-1 and its human host will help in understanding the viral lifecycle and the ways in which this virus is able to manipulate its host. The results shown here provide a potential set of interactions that are amenable to further experimental manipulation as well as potential targets for therapeutic intervention.
Artificial Intelligence in Prediction of Secondary Protein Structure Using CB513 Database

Science.gov (United States)

Avdagic, Zikrija; Purisevic, Elvir; Omanovic, Samir; Coralic, Zlatan

2009-01-01

In this paper we describe CB513, a non-redundant dataset suitable for development of algorithms for prediction of secondary protein structure. A program was written in Borland Delphi to transform data from our dataset into a form suitable for training a neural network for prediction of secondary protein structure, implemented in MATLAB Neural-Network Toolbox. Learning (training and testing) of the neural network is studied with different window sizes, different numbers of neurons in the hidden layer and different numbers of training epochs, using the CB513 dataset. PMID:21347158
RaptorX-Property: a web server for protein structure property prediction.

Science.gov (United States)

Wang, Sheng; Li, Wei; Liu, Shiwang; Xu, Jinbo

2016-07-08

RaptorX Property (http://raptorx2.uchicago.edu/StructurePropertyPred/predict/) is a web server predicting structure property of a protein sequence without using any templates. It outperforms other servers, especially for proteins without close homologs in PDB or with very sparse sequence profile (i.e. carries little evolutionary information). This server employs a powerful in-house deep learning model DeepCNF (Deep Convolutional Neural Fields) to predict secondary structure (SS), solvent accessibility (ACC) and disorder regions (DISO). DeepCNF not only models complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent property labels. Our experimental results show that, tested on CASP10, CASP11 and the other benchmarks, this server can obtain ∼84% Q3 accuracy for 3-state SS, ∼72% Q8 accuracy for 8-state SS, ∼66% Q3 accuracy for 3-state solvent accessibility, and ∼0.89 area under the ROC curve (AUC) for disorder prediction. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements

Directory of Open Access Journals (Sweden)

Sze Sing-Hoi

2008-07-01

Background: Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins. Results: By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures in which each of them contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states. Conclusion: Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.
MFPred: Rapid and accurate prediction of protein-peptide recognition multispecificity using self-consistent mean field theory.

Directory of Open Access Journals (Sweden)

Aliza B Rubenstein

2017-06-01

Full Text Available Multispecificity, the ability of a single receptor protein molecule to interact with multiple substrates, is a hallmark of molecular recognition at protein-protein and protein-peptide interfaces, including enzyme-substrate complexes. The ability to perform structure-based prediction of multispecificity would aid in the identification of novel enzyme substrates and protein interaction partners, and enable design of novel enzymes targeted towards alternative substrates. The relatively slow speed of current biophysical, structure-based methods limits their use for prediction and, especially, design of multispecificity. Here, we develop a rapid, flexible-backbone self-consistent mean field theory-based technique, MFPred, for multispecificity modeling at protein-peptide interfaces. We benchmark our method by predicting experimentally determined peptide specificity profiles for a range of receptors: protease and kinase enzymes, and protein recognition modules including SH2, SH3, MHC Class I and PDZ domains. We observe robust recapitulation of known specificities for all receptor-peptide complexes, and comparison with other methods shows that MFPred results in equivalent or better prediction accuracy with a ~10-1000-fold decrease in computational expense. We find that modeling bound peptide backbone flexibility is key to the observed accuracy of the method. We used MFPred for predicting with high accuracy the impact of receptor-side mutations on experimentally determined multispecificity of a protease enzyme. Our approach should enable the design of a wide range of altered receptor proteins with programmed multispecificities.
TRUE METABOLIZABLE ENERGY AND DIGESTIBILITY OF FIVE Vigna unguiculata VARIETIES IN CHICKENS

Directory of Open Access Journals (Sweden)

Luis Armando Sarmiento-Franco

2010-11-01

Full Text Available This study was carried out to evaluate the effect of heat treatment on grain true metabolizable energy (TME), dry matter and gross energy digestibilities of five Vigna unguiculata varieties: H82, T782, TM97, C666 and XL. The grain of the former three varieties was heat-treated, and offered raw or cooked, whereas grain of the latter two varieties was used only raw, resulting in a total of eight treatments. The heat treatment consisted of watering the grains with boiling water for 30 minutes and drying at 60°C. Forty-five Hubbard male chickens (2.1 ± 0.2 kg) housed in individual wire pens were used to evaluate the treatments. Five chickens from each treatment were fed 40 g of treated grain in mash form, using the force-feeding technique. Additionally, five fasted chickens were used to calculate the endogenous energy and DM losses. The data were submitted to an analysis of variance according to the randomized statistical model; to evaluate the effect of heat treatment, orthogonal contrasts were performed. There were no significant differences in any of the variables, either among varieties or between heat treatments (P>0.05). TME values in this study were similar to those found in the literature and equivalent to the TME value of soybean meal, a conventional feedstuff used in the poultry industry.
Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

Directory of Open Access Journals (Sweden)

Fábio R de Moraes

Full Text Available Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study
Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages.

Science.gov (United States)

de Moraes, Fábio R; Neshich, Izabella A P; Mazoni, Ivan; Yano, Inácio H; Pereira, José G C; Salim, José A; Jardine, José G; Neshich, Goran

2014-01-01

Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. The IFR classifier developed in this study is now
CNNH_PSS: protein 8-class secondary structure prediction by convolutional neural network with highway.

Science.gov (United States)

Zhou, Jiyun; Wang, Hongpeng; Zhao, Zhishan; Xu, Ruifeng; Lu, Qin

2018-05-08

Protein secondary structure is the three-dimensional form of local segments of proteins, and its prediction is an important problem in protein tertiary structure prediction. Developing computational approaches for protein secondary structure prediction is becoming increasingly urgent. We present a novel deep learning based model, referred to as CNNH_PSS, which uses a multi-scale CNN with highway. In CNNH_PSS, any two neighboring convolutional layers have a highway that delivers information from the current layer to the output of the next one to keep local contexts. As lower layers extract local context while higher layers extract long-range interdependencies, the highways between neighboring layers allow CNNH_PSS to extract both local contexts and long-range interdependencies. We evaluate CNNH_PSS on two commonly used datasets: CB6133 and CB513. CNNH_PSS outperforms the multi-scale CNN without highway by at least 0.010 Q8 accuracy and also performs better than CNF, DeepCNF and SSpro8, which cannot extract long-range interdependencies, by at least 0.020 Q8 accuracy, demonstrating that both local contexts and long-range interdependencies are indeed useful for prediction. Furthermore, CNNH_PSS also performs better than GSM and DCRNN, which need an extra complex model to extract long-range interdependencies. This demonstrates that CNNH_PSS not only costs fewer computing resources, but also achieves better prediction performance. CNNH_PSS is able to extract both local contexts and long-range interdependencies by combining a multi-scale CNN and a highway network. The evaluations on common datasets and comparisons with state-of-the-art methods indicate that CNNH_PSS is a useful and efficient tool for protein secondary structure prediction.
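
The abstract above does not state the formula used for the highway between layers. As a hedged illustration, the sketch below uses the standard highway-network gating, in which a gate t in (0, 1) mixes a layer's transformed output h with the input x that is carried forward. Whether CNNH_PSS uses exactly this form is an assumption; the function and argument names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, used here as the highway gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def highway_combine(x, h, gate_logits):
    """Element-wise highway mixing (assumed standard form):

        y_i = t_i * h_i + (1 - t_i) * x_i,  with  t_i = sigmoid(gate_logits_i)

    x           -- values carried forward from the current layer
    h           -- transformed values produced by the next layer
    gate_logits -- pre-activation gate values (learned in a real network)
    """
    if not (len(x) == len(h) == len(gate_logits)):
        raise ValueError("x, h and gate_logits must have the same length")
    mixed = []
    for x_i, h_i, g_i in zip(x, h, gate_logits):
        t_i = sigmoid(g_i)
        mixed.append(t_i * h_i + (1.0 - t_i) * x_i)
    return mixed


# A gate logit of 0 gives t = 0.5, i.e. an equal mix of carried and
# transformed features.
y = highway_combine([1.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

A strongly negative gate logit passes the lower-layer (local) features through almost unchanged, while a strongly positive one favors the transformed features.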
Computational tools for experimental determination and theoretical prediction of protein structure

Energy Technology Data Exchange (ETDEWEB)

O'Donoghue, S.; Rost, B.

1995-12-31

This tutorial was one of eight tutorials selected to be presented at the Third International Conference on Intelligent Systems for Molecular Biology, which was held in the United Kingdom from July 16 to 19, 1995. The authors intend to review the state of the art in the experimental determination of protein 3D structure (focus on nuclear magnetic resonance), and in the theoretical prediction of protein function and of protein structure in 1D, 2D and 3D from sequence. All the atomic resolution structures determined so far have been derived from either X-ray crystallography (the majority so far) or Nuclear Magnetic Resonance (NMR) Spectroscopy (becoming increasingly important). The authors briefly describe the physical methods behind both of these techniques; the major computational methods involved will be covered in some detail. They highlight parallels and differences between the methods, and also the current limitations. Special emphasis will be given to techniques which have application to ab initio structure prediction. Large scale sequencing techniques increase the gap between the number of known protein sequences and that of known protein structures. They describe the scope and principles of methods that contribute successfully to closing that gap. Emphasis will be placed on the specification of adequate testing procedures to validate such methods.
Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

Directory of Open Access Journals (Sweden)

Yunyun Liang

2015-01-01

Full Text Available Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
Nutritional value and metabolizable energy of wheat by-products used for feeding growing pigs

OpenAIRE

Wesendonck, William Rui; Kessler, Alexandre de Mello; Ribeiro, Andréa Machado Leal; Somensi, Marcelo Luiz; Bockor, Luciane; Dadalt, Julio Cezar; Monteiro, Alessandra Nardina Trícia Rigo; Marx, Fábio Ritter

2013-01-01

The objective of this work was to evaluate the nutritional and energy value of wheat by-products in diets for growing pigs, and to obtain prediction equations for metabolizable energy. Thirty-six castrated male pigs, housed in individual metabolism cages, were used. Total collection of feces and urine was carried out in two ten-day periods: five days for adaptation and five for collection. A randomized block design was used, with the collection period considered as the block, ...
Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling : A CASP-CAPRI experiment

NARCIS (Netherlands)

Lensink, Marc F.; Velankar, Sameer; Kryshtafovych, Andriy; Huang, Shen You; Schneidman-Duhovny, Dina; Sali, Andrej; Segura, Joan; Fernandez-Fuentes, Narcis; Viswanath, Shruthi; Elber, Ron; Grudinin, Sergei; Popov, Petr; Neveu, Emilie; Lee, Hasup; Baek, Minkyung; Park, Sangwoo; Heo, Lim; Rie Lee, Gyu; Seok, Chaok; Qin, Sanbo; Zhou, Huan Xiang; Ritchie, David W.; Maigret, Bernard; Devignes, Marie Dominique; Ghoorah, Anisah; Torchala, Mieczyslaw; Chaleil, Raphaël A G; Bates, Paul A.; Ben-Zeev, Efrat; Eisenstein, Miriam; Negi, Surendra S.; Weng, Zhiping; Vreven, Thom; Pierce, Brian G.; Borrman, Tyler M.; Yu, Jinchao; Ochsenbein, Françoise; Guerois, Raphaël; Vangone, Anna; Garcia Lopes Maia Rodrigues, João; van Zundert, Gydo; Nellen, Mehdi; Xue, Li; Karaca, Ezgi; Melquiond, Adrien S J; Visscher, Koen; Kastritis, Panagiotis L.; Bonvin, Alexandre M J J; Xu, Xianjin; Qiu, Liming; Yan, Chengfei; Li, Jilong; Ma, Zhiwei; Cheng, Jianlin; Zou, Xiaoqin; Shen, Yang; Peterson, Lenna X.; Kim, Hyung Rae; Roy, Amit; Han, Xusi; Esquivel-Rodriguez, Juan; Kihara, Daisuke; Yu, Xiaofeng; Bruce, Neil J.; Fuller, Jonathan C.; Wade, Rebecca C.; Anishchenko, Ivan; Kundrotas, Petras J.; Vakser, Ilya A.; Imai, Kenichiro; Yamada, Kazunori; Oda, Toshiyuki; Nakamura, Tsukasa; Tomii, Kentaro; Pallara, Chiara; Romero-Durana, Miguel; Jiménez-García, Brian; Moal, Iain H.; Férnandez-Recio, Juan; Joung, Jong Young; Kim, Jong Yun; Joo, Keehyoung; Lee, Jooyoung; Kozakov, Dima; Vajda, Sandor; Mottarella, Scott; Hall, David R.; Beglov, Dmitri; Mamonov, Artem; Xia, Bing; Bohnuud, Tanggis; Del Carpio, Carlos A.; Ichiishi, Eichiro; Marze, Nicholas; Kuroda, Daisuke; Roy Burman, Shourya S.; Gray, Jeffrey J.; Chermak, Edrisse; Cavallo, Luigi; Oliva, Romina; Tovchigrechko, Andrey; Wodak, Shoshana J.

2016-01-01

We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014.
Computational Approaches for Prediction of Pathogen-Host Protein-Protein Interactions

Directory of Open Access Journals (Sweden)

Esmaeil Nourani

2015-02-01

Full Text Available Infectious diseases are still among the major and prevalent health problems, mostly because of the drug resistance of novel variants of pathogens. Molecular interactions between pathogens and their hosts are the key part of the infection mechanisms. Novel antimicrobial therapeutics to fight drug resistance are only possible with a thorough understanding of pathogen-host interaction (PHI) systems. Existing databases, which contain experimentally verified PHI data, suffer from scarcity of reported interactions due to the technically challenging and time-consuming process of experiments. This has motivated many researchers to address the problem by proposing computational approaches for analysis and prediction of PHIs. The computational methods primarily utilize sequence information, protein structure and known interactions. Classic machine learning techniques are used when there are sufficient known interactions to be used as training data. In the opposite case, transfer and multi-task learning methods are preferred. Here, we present an overview of these computational approaches for PHI prediction, discussing their weaknesses and strengths, with future directions.
Protein structure predictions with Monte Carlo simulated annealing: Case for the β-sheet

Science.gov (United States)

Okamoto, Y.; Fukugita, M.; Kawai, H.; Nakazawa, T.

Work is continued for a prediction of three-dimensional structure of peptides and proteins with Monte Carlo simulated annealing using only a generic energy function and amino acid sequence as input. We report that β-sheet like structure is successfully predicted for a fragment of bovine pancreatic trypsin inhibitor which is known to have the β-sheet structure in nature. Together with the results for α-helix structure reported earlier, this means that a successful prediction can be made, at least at a qualitative level, for two dominant building blocks of proteins, α-helix and β-sheet, from the information of amino acid sequence alone.
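
Monte Carlo simulated annealing, as named in the record above, proposes random changes to a conformation, accepts them with the Metropolis criterion, and lowers the temperature gradually. The following is a minimal sketch of that loop, not the authors' program. The energy function, the geometric cooling schedule, and all parameter values are hypothetical stand-ins chosen only to make the example run.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical energy: each angle (degrees) prefers -60. Illustrative only."""
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)


def simulated_annealing(energy, state, t_start=2.0, t_end=0.01,
                        steps=5000, step_size=30.0, seed=0):
    """Metropolis Monte Carlo with an assumed geometric cooling schedule."""
    rng = random.Random(seed)
    current = list(state)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for k in range(steps):
        # Temperature decreases geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        # Propose a random change to one randomly chosen angle.
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


start = [180.0] * 5  # arbitrary starting conformation
best_angles, best_energy = simulated_annealing(toy_energy, start)
```

The early high-temperature phase lets the search cross energy barriers, and the late low-temperature phase settles into a nearby minimum. A real application would replace `toy_energy` with a physical energy function over backbone and side-chain torsion angles.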

SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity.

Science.gov (United States)

Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong

2016-01-01

Knowledge of protein function is important for biological, medical and therapeutic studies, but many proteins are still unknown in function. There is a need for more improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented those similarity-based and other methods in predicting diverse classes of proteins including the distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors protein representation, (3) improved predictive performances due to the use of more enriched training datasets and more variety of protein descriptors, (4) newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) newly added batch submission option for supporting the classification of multiple proteins. Moreover, 2 more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added for facilitating collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.
Characteristics and yields of carcass and cuts in Santa Ines sheep fed with different concentrations of metabolizable energy - doi: 10.4025/actascianimsci.v32i4.9684

Directory of Open Access Journals (Sweden)

José Gilson Louzada Regadas Filho

2010-10-01

Full Text Available This study evaluated the weight gain (ADG), feed conversion (FC), feed efficiency (FE), and characteristics of carcass and retail cuts of Santa Inês sheep fed different levels of metabolizable energy (2.08, 2.28, 2.47 and 2.69 Mcal kg-1 of DM). Twenty lambs, with age and mean body weight of 50 days and 13 ± 0.56 kg, respectively, were distributed in a randomized block design with five replications. We verified a linear increase effect (p 0.05 by the energy levels of the rations. Nevertheless, the weights of hot and cold carcass and the empty body, expressed in kg, presented a quadratic effect (p < 0.05) as the levels of metabolizable energy in the experimental diets increased. The energy levels influenced the yield of rib and shoulder, as well as the loin eye area (p < 0.05). In conclusion, the manipulation of the energy level of the ration changes the ADG, the hot and cold carcass weight, the shoulder yield, the rib weight and the loin eye area of Santa Inês sheep.
Optimal neural networks for protein-structure prediction

International Nuclear Information System (INIS)

Head-Gordon, T.; Stillinger, F.H.

1993-01-01

The successful application of neural-network algorithms for prediction of protein structure is stymied by three problem areas: the sparsity of the database of known protein structures, poorly devised network architectures which make the input-output mapping opaque, and a global optimization problem in the multiple-minima space of the network variables. We present a simplified polypeptide model residing in two dimensions with only two amino-acid types, A and B, which allows the determination of the global energy structure for all possible sequences of pentamer, hexamer, and heptamer lengths. This model simplicity allows us to compile a complete structural database and to devise neural networks that reproduce the tertiary structure of all sequences with absolute accuracy and with the smallest number of network variables. These optimal networks reveal that the three problem areas are convoluted, but that thoughtful network designs can actually deconvolute these detrimental traits to provide network algorithms that genuinely impact on the ability of the network to generalize or learn the desired mappings. Furthermore, the two-dimensional polypeptide model shows sufficient chemical complexity so that transfer of neural-network technology to more realistic three-dimensional proteins is evident.
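
To give a sense of why a complete database is feasible for the two-letter model described above, the sketch below enumerates every A/B sequence at the three chain lengths mentioned. It only counts raw sequences; it does not compute energies or structures, and it makes the simplifying assumption that a sequence and its reverse are counted separately.

```python
from itertools import product

# Chain lengths named in the abstract: pentamer, hexamer, heptamer.
CHAIN_LENGTHS = {"pentamer": 5, "hexamer": 6, "heptamer": 7}


def all_sequences(length: int, alphabet: str = "AB"):
    """Return every sequence of the given length over the alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


# Number of distinct raw sequences per chain length (2 ** length).
sequence_counts = {name: len(all_sequences(n))
                   for name, n in CHAIN_LENGTHS.items()}
total_sequences = sum(sequence_counts.values())
```

With only a few hundred sequences in total, exhaustive determination of the lowest-energy structure for each one is tractable, which is what allows the networks to be tested against a database with no missing entries.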
Composition, Shell Strength, and Metabolizable Energy of Mulinia lateralis and Ischadium recurvum as Food for Wintering Surf Scoters (Melanitta perspicillata).

Directory of Open Access Journals (Sweden)

Alicia M Wells-Berlin

Full Text Available Decline in surf scoter (Melanitta perspicillata) waterfowl populations wintering in the Chesapeake Bay has been associated with changes in the availability of benthic bivalves. The Bay has become more eutrophic, causing changes in the benthos available to surf scoters. The subsequent decline in oyster beds (Crassostrea virginica) has reduced the hard substrate needed by the hooked mussel (Ischadium recurvum), one of the primary prey items for surf scoters, causing the surf scoter to switch to a more opportune species, the dwarf surfclam (Mulinia lateralis). The composition (macronutrients, minerals, and amino acids), shell strength (N), and metabolizable energy (kJ) of these prey items were quantified to determine the relative foraging values for wintering scoters. Pooled samples of each prey item were analyzed to determine composition. Shell strength (N) was measured using a shell crack compression test. Total collection digestibility trials were conducted on eight captive surf scoters. For the prey size range commonly consumed by surf scoters (6-12 mm for M. lateralis and 18-24 mm for I. recurvum), I. recurvum contained higher ash, protein, lipid, and energy per individual organism than M. lateralis. I. recurvum required significantly greater force to crack the shell relative to M. lateralis. No difference in metabolized energy was observed for these prey items in wintering surf scoters, despite I. recurvum's higher ash content and harder shell than M. lateralis. Therefore, wintering surf scoters were able to obtain the same amount of energy from each prey item, implying that they can sustain themselves if forced to switch prey.
Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

Directory of Open Access Journals (Sweden)

Jinjian Jiang

2017-07-01

Full Text Available Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., they use the information of solved protein structures. However, the number of solved protein structures is far smaller than that of sequences. Moreover, almost all of the predictors identify hotspots from the interfaces of protein complexes, and seldom from whole protein sequences. Therefore, determining hotspots from whole protein sequences by sequence information alone is urgent. To address the issue of hotspot prediction from whole protein sequences, we proposed an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique is adopted to project the encoded instances into a reduced space. Next, several better random projections are obtained by training an IBk classifier on the training dataset, and these are then applied to the test dataset. The ensemble of random projection classifiers is thereby obtained. Experimental results showed that although the performance of our method is not good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences.
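
The pipeline described above (random projection into a reduced space, a nearest-neighbor classifier per projection, then an ensemble) can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: it uses Gaussian projection matrices, a plain 1-nearest-neighbor rule in place of Weka's IBk, and an unweighted majority vote, and it omits the step of selecting the better-performing projections. The feature vectors and labels are made up.

```python
import random
from collections import Counter


def random_projection_matrix(n_in, n_out, rng):
    """Gaussian random matrix (n_out x n_in), scaled by 1/sqrt(n_out)."""
    scale = 1.0 / (n_out ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)]
            for _ in range(n_out)]


def project(matrix, x):
    """Project feature vector x into the reduced space."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def nearest_neighbor_label(train_x, train_y, query):
    """1-nearest-neighbor rule using squared Euclidean distance."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = min(range(len(train_x)), key=lambda i: sqdist(train_x[i], query))
    return train_y[nearest]


def ensemble_predict(train_x, train_y, query, n_members=5, n_out=3, seed=0):
    """Majority vote over classifiers built on independent random projections."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_members):
        matrix = random_projection_matrix(len(query), n_out, rng)
        projected_train = [project(matrix, x) for x in train_x]
        votes.append(nearest_neighbor_label(projected_train, train_y,
                                            project(matrix, query)))
    return Counter(votes).most_common(1)[0][0]


# Hypothetical 6-dimensional residue encodings.
train_x = [
    [0.1, 0.2, 0.0, 0.3, 0.1, 0.2],
    [0.0, 0.1, 0.2, 0.1, 0.3, 0.0],
    [2.1, 1.9, 2.2, 2.0, 1.8, 2.3],
    [1.9, 2.2, 2.0, 2.1, 2.2, 1.8],
]
train_y = ["non-hotspot", "non-hotspot", "hotspot", "hotspot"]
query = list(train_x[2])  # a query identical to a known hotspot encoding
predicted = ensemble_predict(train_x, train_y, query)
```

Each member sees the data through a different low-dimensional view, so errors made under one projection can be outvoted by the others.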
Why Is There a Glass Ceiling for Threading Based Protein Structure Prediction Methods?

Science.gov (United States)

Skolnick, Jeffrey; Zhou, Hongyi

2017-04-20

Despite their different implementations, comparison of the best threading approaches to the prediction of evolutionary distant protein structures reveals that they tend to succeed or fail on the same protein targets. This is true despite the fact that the structural template library has good templates for all cases. Thus, a key question is why are certain protein structures threadable while others are not. Comparison with threading results on a set of artificial sequences selected for stability further argues that the failure of threading is due to the nature of the protein structures themselves. Using a new contact map based alignment algorithm, we demonstrate that certain folds are highly degenerate in that they can have very similar coarse grained fractions of native contacts aligned and yet differ significantly from the native structure. For threadable proteins, this is not the case. Thus, contemporary threading approaches appear to have reached a plateau, and new approaches to structure prediction are required.
Exploiting the Past and the Future in Protein Secondary Structure Prediction

DEFF Research Database (Denmark)

Baldi, Pierre; Brunak, Søren; Frasconi, P

1999-01-01

Motivation: Predicting the secondary structure of a protein (alpha-helix, beta-sheet, coil) is an important step towards elucidating its three-dimensional structure, as well as its function. Presently, the best predictors are based on machine learning approaches, in particular neural network... predictions based on variable ranges of dependencies. These architectures extend recurrent neural networks, introducing non-causal bidirectional dynamics to capture both upstream and downstream information. The prediction algorithm is completed by the use of mixtures of estimators that leverage evolutionary...
Prediction of human protein function from post-translational modifications and localization features

DEFF Research Database (Denmark)

Jensen, Lars Juhl; Gupta, Ramneek; Blom, Nikolaj

2002-01-01

We have developed an entirely sequence-based method that identifies and integrates relevant features that can be used to assign proteins of unknown function to functional classes, and enzyme categories for enzymes. We show that strategies for the elucidation of protein function may benefit from... a number of functional attributes that are more directly related to the linear sequence of amino acids, and hence easier to predict, than protein structure. These attributes include features associated with post-translational modifications and protein sorting, but also much simpler aspects...
Large-scale prediction of drug–target interactions using protein sequences and drug topological structures

International Nuclear Information System (INIS)

Cao Dongsheng; Liu Shao; Xu Qingsong; Lu Hongmei; Huang Jianhua; Hu Qiannan; Liang Yizeng

2012-01-01

Highlights: ► Drug–target interactions are predicted using an extended SAR methodology. ► A drug–target interaction is regarded as an event triggered by many factors. ► Molecular fingerprint and CTD descriptors are used to represent drugs and proteins. ► Our approach shows compatibility between the new scheme and current SAR methodology. - Abstract: The identification of interactions between drugs and target proteins plays a key role in the process of genomic drug discovery. It is both time-consuming and costly to determine drug–target interactions by experiments alone. Therefore, there is an urgent need to develop new in silico prediction approaches capable of identifying these potential drug–target interactions in a timely manner. In this article, we aim at extending current structure–activity relationship (SAR) methodology to fulfill such requirements. In some sense, a drug–target interaction can be regarded as an event or property triggered by many influence factors from drugs and target proteins. Thus, each interaction pair can be represented theoretically by using these factors, which are based on the structural and physicochemical properties of both drugs and proteins. To realize this, drug molecules are encoded with MACCS substructure fingerprints representing the existence of certain functional groups or fragments, and proteins are encoded with some biochemical and physicochemical properties. Four classes of drug–target interaction networks in humans, involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are independently used for establishing predictive models with support vector machines (SVMs). The SVM models gave prediction accuracies of 90.31%, 88.91%, 84.68% and 83.74% for the four datasets, respectively. In conclusion, the results demonstrate the ability of our proposed method to predict the drug–target interactions, and show a general compatibility between the new scheme and current SAR
Large-scale prediction of drug-target interactions using protein sequences and drug topological structures

Energy Technology Data Exchange (ETDEWEB)

Cao Dongsheng [Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083 (China); Liu Shao [Xiangya Hospital, Central South University, Changsha 410008 (China); Xu Qingsong [School of Mathematical Sciences and Computing Technology, Central South University, Changsha 410083 (China); Lu Hongmei; Huang Jianhua [Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083 (China); Hu Qiannan [Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan 430071 (China); Liang Yizeng, E-mail: yizeng_liang@263.net [Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083 (China)

2012-11-08

Highlights: ► Drug-target interactions are predicted using an extended SAR methodology. ► A drug-target interaction is regarded as an event triggered by many factors. ► Molecular fingerprint and CTD descriptors are used to represent drugs and proteins. ► Our approach shows compatibility between the new scheme and current SAR methodology. - Abstract: The identification of interactions between drugs and target proteins plays a key role in the process of genomic drug discovery. It is both time-consuming and costly to determine drug-target interactions by experiments alone. Therefore, there is an urgent need to develop new in silico prediction approaches capable of identifying these potential drug-target interactions in a timely manner. In this article, we aim at extending current structure-activity relationship (SAR) methodology to fulfill such requirements. In some sense, a drug-target interaction can be regarded as an event or property triggered by many influence factors from drugs and target proteins. Thus, each interaction pair can be represented theoretically by using these factors, which are based on the structural and physicochemical properties of both drugs and proteins. To realize this, drug molecules are encoded with MACCS substructure fingerprints representing the existence of certain functional groups or fragments, and proteins are encoded with some biochemical and physicochemical properties. Four classes of drug-target interaction networks in humans, involving enzymes, ion channels, G-protein-coupled receptors (GPCRs) and nuclear receptors, are independently used for establishing predictive models with support vector machines (SVMs). The SVM models gave prediction accuracies of 90.31%, 88.91%, 84.68% and 83.74% for the four datasets, respectively. In conclusion, the results demonstrate the ability of our proposed method to predict the drug
Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation.

Science.gov (United States)

Yang, Jian-Yi; Peng, Zhen-Ling; Yu, Zu-Guo; Zhang, Rui-Jie; Anh, Vo; Wang, Desheng

2009-04-21

In this paper, we intend to predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. Two widely used data sets were employed, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into two kinds of time series. Then, a novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is applied to analyze these time series. For a given protein sequence, a total of 16 characteristic parameters can be calculated with RQA, which are treated as the feature representation of the protein sequence. Based on this feature representation, the structural class of each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies with the step-by-step procedure are 65.8% and 64.2% for the 1189 and 25PDB data sets, respectively. With the widely used one-against-others procedure, we compare our method with five other existing methods. In particular, the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, only 16 parameters are used in our method, which is fewer than in other methods. This suggests that the current method may play a complementary role to the existing methods and is promising for the prediction of protein structural classes.
A review of machine learning methods to predict the solubility of overexpressed recombinant proteins in Escherichia coli.

Science.gov (United States)

Habibi, Narjeskhatoon; Mohd Hashim, Siti Z; Norouzi, Alireza; Samian, Mohammed Razip

2014-05-08

Over the last 20 years in biotechnology, the production of recombinant proteins has been a crucial bioprocess in both the biopharmaceutical and research arenas in terms of human health, scientific impact and economic volume. Although logical strategies of genetic engineering have been established, protein overexpression is still an art. In particular, heterologous expression is often hindered by low levels of production and frequent failure for obscure reasons. The problem is accentuated because there is no generic solution available to enhance heterologous overexpression. For a given protein, the extent of its solubility can indicate the quality of its function. Over 30% of synthesized proteins are not soluble. Under given experimental circumstances, including temperature, expression host, etc., protein solubility is a feature ultimately defined by its sequence. To date, numerous methods based on machine learning have been proposed to predict the solubility of a protein merely from its amino acid sequence. In spite of the 20 years of research on the matter, no comprehensive review is available on the published methods. This paper presents an extensive review of the existing models to predict protein solubility in the Escherichia coli recombinant protein overexpression system. The models are investigated and compared regarding the datasets used, features, feature selection methods, machine learning techniques and accuracy of prediction. A discussion of the models is provided at the end. This study aims to investigate extensively the machine learning based methods to predict recombinant protein solubility, so as to offer a general as well as a detailed understanding for researchers in the field. Some of the models present acceptable prediction performances and convenient user interfaces. These models can be considered valuable tools to predict recombinant protein overexpression results before performing real laboratory experiments, thus saving labour, time and cost.
Application of long-range order to predict unfolding rates of two-state proteins.

Science.gov (United States)

Harihar, B; Selvaraj, S

2011-03-01

Prediction of the experimental unfolding rates of two-state proteins, and models describing these rates, remain quite limited because of the complexity of the unfolding mechanism and the lack of experimental unfolding data compared with folding data. In this work, 25 two-state proteins characterized by Maxwell et al. (Protein Sci 2005;14:602–616) using a consensus set of experimental conditions were taken, and the parameter long-range order (LRO) derived from their three-dimensional structures was related to their experimental unfolding rates ln(k(u)). From the total data set of 30 proteins used by Maxwell et al. (Protein Sci 2005;14:602–616), five slow-unfolding proteins with very low unfolding rates were considered to be outliers and were not included in our data set. Except for the all-beta structural class, LRO of both the all-alpha and mixed-class proteins showed a strong inverse correlation of r = -0.99 and -0.88, respectively, with experimental ln(k(u)). LRO shows a correlation of -0.62 with experimental ln(k(u)) for all-beta proteins. For predicting the unfolding rates, a simple statistical method has been used and linear regression equations were developed for individual structural classes of proteins using LRO, and the results obtained showed a better agreement with experimental results. Copyright © 2010 Wiley-Liss, Inc.
PON-Sol: prediction of effects of amino acid substitutions on protein solubility.

Science.gov (United States)

Yang, Yang; Niroula, Abhishek; Shen, Bairong; Vihinen, Mauno

2016-07-01

Solubility is one of the fundamental protein properties. It is of great interest because of its relevance to protein expression. Reduced solubility and protein aggregation are also associated with many diseases. We collected from literature the largest experimentally verified solubility affecting amino acid substitution (AAS) dataset and used it to train a predictor called PON-Sol. The predictor can distinguish both solubility decreasing and increasing variants from those not affecting solubility. PON-Sol has normalized correct prediction ratio of 0.491 on cross-validation and 0.432 for independent test set. The performance of the method was compared both to solubility and aggregation predictors and found to be superior. PON-Sol can be used for the prediction of effects of disease-related substitutions, effects on heterologous recombinant protein expression and enhanced crystallizability. One application is to investigate effects of all possible AASs in a protein to aid protein engineering. PON-Sol is freely available at http://structure.bmc.lu.se/PON-Sol. The training and test data are available at http://structure.bmc.lu.se/VariBench/ponsol.php. Contact: mauno.vihinen@med.lu.se. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins

Directory of Open Access Journals (Sweden)

Raghava Gajendra PS

2008-11-01

Background: The expansion of raw protein sequence databases in the post-genomic era and the availability of freshly annotated sequences for major localizations motivated us to introduce a new, improved version of our previously developed eukaryotic subcellular localization prediction method, "ESLpred". Since the subcellular localization of a protein offers essential clues about its function, the availability of a localization predictor would aid and expedite protein deciphering studies. However, the robustness of a predictor is highly dependent on the quality of the dataset and the extracted protein attributes; hence, it becomes imperative to improve the performance of the presently available method using the latest dataset and crucial input features. Results: Here, we describe the improvement in prediction performance obtained for our most popular ESLpred method using new crucial features as input to a Support Vector Machine (SVM). In addition, a recently available, highly non-redundant dataset encompassing three kingdom-specific protein sequence sets (1198 fungal, 2597 animal and 491 plant sequences) was also included in the present study. First, using the evolutionary information in the form of profile composition along with whole and N-terminal sequence composition as an input feature vector of 440 dimensions, overall accuracies of 72.7, 75.8 and 74.5% were achieved respectively after five-fold cross-validation. A further enhancement in performance was observed when similarity search based results were coupled with whole and N-terminal sequence composition along with profile composition, yielding overall accuracies of 75.9, 80.8 and 76.6% respectively; the best accuracies reported to date on the same datasets. Conclusion: These results provide confidence about the reliability and accurate prediction of SVM modules generated in the present study using sequence and profile compositions along with similarity search
Starch digestibility and apparent metabolizable energy of western Canadian wheat market classes in broiler chickens.

Science.gov (United States)

Karunaratne, N D; Abbott, D A; Hucl, P J; Chibbar, R N; Pozniak, C J; Classen, H L

2018-05-16

Wheat is the primary grain fed to poultry in western Canada, but its nutritional quality, including the nature of its starch digestibility, may be affected by wheat market class. The objectives of this study were to determine the rate and extent of starch digestibility of wheat market classes in broiler chickens, and to determine the relationship between starch digestibility and wheat apparent metabolizable energy (AME). In vitro starch digestion was assessed using gastric and small intestinal phases mimicking the chicken digestive tract, while in vivo evaluation used 468 male broiler chickens randomly assigned to dietary treatments from 0 to 21 d of age. The study evaluated 2 wheat cultivars from each of 6 western Canadian wheat classes: Canadian Prairie Spring (CPS), Canadian Western Amber Durum (CWAD), CW General Purpose (CWGP), CW Hard White Spring (CWHWS), CW Red Spring (CWRS), and CW Soft White Spring (CWSWS). All samples were analyzed for relevant grain characteristics. Data were analyzed as a randomized complete block design and cultivars were nested within market class. Pearson correlation was used to determine relationships between measured characteristics. Significance level was P ≤ 0.05. The starch digestibility range and wheat class rankings were: proximal jejunum - 23.7 to 50.6% (CWHWSc, CPSbc, CWSWSbc, CWRSab, CWGPa, CWADa); distal jejunum - 63.5 to 76.4% (CWHWSc, CPSbc, CWSWSbc, CWRSab, CWGPa, CWADa); proximal ileum - 88.7 to 96.9% (CWSWSc, CPSbc, CWHWSbc, CWRSb, CWGPb, CWADa); distal ileum - 94.4 to 98.5% (CWSWSb, CWHWSb, CPSb, CWRSab, CWGPab, CWADa); excreta - 98.4 to 99.3% (CPSb, CWRSb, CWHWSb, CWSWSab, CWGPab, CWADa). Wheat class affected wheat AMEn with levels ranging from 3,203 to 3,411 kcal/kg at 90% DM (CWRSc, CWSWSc, CPSb, CWGPb, CWADa, CWHWSa). Significant and moderately strong positive correlations were observed between in vitro and in vivo starch digestibility, but no correlations were found between AME and starch digestibility. In
Prediction of glutathionylation sites in proteins using minimal sequence information and their experimental validation.

Science.gov (United States)

Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K

2016-09-01

S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor intensive and time consuming, a bioinformatics-based approach is a viable alternative. Available methods require relatively long sequence information, which may prevent prediction if sequence information is incomplete. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon the differential association of amino acids with glutathionylated and non-glutathionylated cysteines from a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using the position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs having high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
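The two performance figures quoted in this abstract, accuracy and the Matthews correlation coefficient, are both computed from a binary confusion matrix. The sketch below uses the standard definitions; the confusion-matrix counts are hypothetical and chosen only for illustration (they are not the study's data), and the function names are assumptions.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical balanced test set: 100 modified and 100 unmodified cysteines.
example_accuracy = accuracy(tp=58, tn=58, fp=42, fn=42)
example_mcc = mcc(tp=58, tn=58, fp=42, fn=42)
```

With these illustrative counts the accuracy is 0.58 and the MCC is 0.16, which shows how an accuracy somewhat above chance on a balanced set corresponds to a small positive MCC.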
A physiologically based pharmacokinetic model to predict the pharmacokinetics of highly protein-bound drugs and the impact of errors in plasma protein binding.

Science.gov (United States)

Ye, Min; Nagar, Swati; Korzekwa, Ken

2016-04-01

Predicting the pharmacokinetics of highly protein-bound drugs is difficult. Also, since historical plasma protein binding data were often collected using unbuffered plasma, the resulting inaccurate binding data could contribute to incorrect predictions. This study uses a generic physiologically based pharmacokinetic (PBPK) model to predict human plasma concentration-time profiles for 22 highly protein-bound drugs. Tissue distribution was estimated from in vitro drug lipophilicity data, plasma protein binding and the blood:plasma ratio. Clearance was predicted with a well-stirred liver model. Underestimated hepatic clearance for acidic and neutral compounds was corrected by an empirical scaling factor. Predicted values (pharmacokinetic parameters, plasma concentration-time profile) were compared with observed data to evaluate the model accuracy. Of the 22 drugs, less than a 2-fold error was obtained for the terminal elimination half-life (t1/2, 100% of drugs), peak plasma concentration (Cmax, 100%), area under the plasma concentration-time curve (AUC0-t, 95.4%), clearance (CLh, 95.4%), mean residence time (MRT, 95.4%) and steady state volume (Vss, 90.9%). The impact of fup errors on CLh and Vss prediction was evaluated. Errors in fup resulted in proportional errors in clearance prediction for low-clearance compounds, and in Vss prediction for high-volume neutral drugs. For high-volume basic drugs, errors in fup did not propagate to errors in Vss prediction. This is due to the cancellation of errors in the calculations for tissue partitioning of basic drugs. Overall, plasma profiles were well simulated with the present PBPK model. Copyright © 2016 John Wiley & Sons, Ltd.
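The statement that fup errors propagate proportionally to clearance for low-clearance compounds follows from the textbook form of the well-stirred liver model, CLh = Qh · fup · CLint / (Qh + fup · CLint). The sketch below is a simplified illustration, not the study's PBPK model: it ignores the blood:plasma ratio and the empirical scaling factor mentioned above, and the hepatic blood flow and intrinsic clearance values are assumed, illustrative numbers.

```python
def hepatic_clearance(q_h, fup, cl_int):
    """Well-stirred liver model (textbook form, plasma-based, simplified)."""
    return q_h * fup * cl_int / (q_h + fup * cl_int)

def fold_error(predicted, observed):
    """Symmetric fold error; a value below 2 means 'within 2-fold'."""
    return max(predicted / observed, observed / predicted)

Q_H = 90.0  # assumed hepatic blood flow in L/h (illustrative value)

# Low-clearance compound (fup * CLint << Q_H): doubling fup nearly doubles CLh.
low_ratio = hepatic_clearance(Q_H, 0.02, 100.0) / hepatic_clearance(Q_H, 0.01, 100.0)

# High-clearance compound (fup * CLint >> Q_H): CLh approaches Q_H,
# so the same fup error barely changes the prediction.
high_ratio = hepatic_clearance(Q_H, 0.02, 1.0e6) / hepatic_clearance(Q_H, 0.01, 1.0e6)

print(f"low-clearance ratio:  {low_ratio:.3f}")
print(f"high-clearance ratio: {high_ratio:.3f}")
```

When fup · CLint is much smaller than Qh the denominator is dominated by Qh, so CLh is approximately fup · CLint and scales linearly with fup; in the opposite limit CLh is capped by blood flow.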
Effect of dietary protein levels on growth performance, mortality rate and clinical blood parameters in mink (Mustela vison)

DEFF Research Database (Denmark)

Damgaard, B.M.; Clausen, T.N.; Dietz, Hans Henrik

1998-01-01

Effects of dietary protein levels ranging from 35% to 15% of metabolizable energy (ME) and dietary fat levels ranging in a reciprocal fashion from 47% to 67% of ME, and a constant dietary carbohydrate level of 18% of ME were investigated in male mink kits in the growing-furring period. Growth...... performance, mortality rate, hepatic fatty infiltration, weights of body and liver, relative weight of liver, haematocrit values, plasma activities of alanine-aminotransferase (ALAT), aspartate-aminotransferase (ASAT) and creatine-kinase (CK), and plasma concentrations of chemical parameters were studied...
SPEER-SERVER: a web server for prediction of protein specificity determining sites.

Science.gov (United States)

Chakraborty, Abhijit; Mandloi, Sapan; Lanczycki, Christopher J; Panchenko, Anna R; Chakrabarti, Saikat

2012-07-01

Sites that show specific conservation patterns within subsets of proteins in a protein family are likely to be involved in the development of functional specificity. These sites, generally termed specificity determining sites (SDS), might play a crucial role in binding to a specific substrate or proteins. Identification of SDS through experimental techniques is a slow, difficult and tedious job. Hence, it is very important to develop efficient computational methods that can more expediently identify SDS. Herein, we present Specificity prediction using amino acids' Properties, Entropy and Evolution Rate (SPEER)-SERVER, a web server that predicts SDS by analyzing quantitative measures of the conservation patterns of protein sites based on their physico-chemical properties and the heterogeneity of evolutionary changes between and within the protein subfamilies. This web server provides an improved representation of results, adds useful input and output options and integrates a wide range of analysis and data visualization tools when compared with the original standalone version of the SPEER algorithm. Extensive benchmarking finds that SPEER-SERVER exhibits sensitivity and precision performance that, on average, meets or exceeds that of other currently available methods. SPEER-SERVER is available at http://www.hpppi.iicb.res.in/ss/.

Support vector machines for prediction and analysis of beta and gamma-turns in proteins.

Science.gov (United States)

Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

2005-04-01

Tight turns have long been recognized as one of the three important features of proteins, together with alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% of tight turns are beta-turns and most of the rest are gamma-turns. Analysis and prediction of beta-turns and gamma-turns is very useful for the design of new molecules such as drugs, pesticides, and antigens. In this paper we investigated two aspects of applying the support vector machine (SVM), a promising machine learning method for bioinformatics, to the prediction and analysis of beta-turns and gamma-turns. First, we developed two SVM-based methods, called BTSVM and GTSVM, which predict beta-turns and gamma-turns in a protein from its sequence. When compared with other methods, BTSVM has a superior performance and GTSVM is competitive. Second, we used SVMs with a linear kernel to estimate the support of amino acids for the formation of beta-turns and gamma-turns depending on their position in a protein. Our analysis results are more comprehensive and easier to use than the previous results in designing turns in proteins.
Protein Tertiary Structure Prediction Based on Main Chain Angle Using a Hybrid Bees Colony Optimization Algorithm

Science.gov (United States)

Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen

Encoding the amino acid sequences of proteins so that they can be classified into their respective families and subfamilies is an important research area. However, for a given protein, knowing its exact action, whether hormonal, enzymatic, transmembranal or as a nuclear receptor, does not depend solely on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict a protein's tertiary structure. Several methods are used to develop and evaluate the system to produce better accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food foraging behaviour of honey bees, is used in the searching phase. In this study, the experiment is conducted on short-sequence proteins that have been used in previous research with well-known tools. The proposed approach shows a promising result.
TMFoldWeb: a web server for predicting transmembrane protein fold class.

Science.gov (United States)

Kozma, Dániel; Tusnády, Gábor E

2015-09-17

Here we present TMFoldWeb, the web server implementation of TMFoldRec, a transmembrane protein fold recognition algorithm. TMFoldRec uses statistical potentials and utilizes topology filtering and a gapless threading algorithm. It ranks template structures and selects the most likely candidates and estimates the reliability of the obtained lowest energy model. The statistical potential was developed in a maximum likelihood framework on a representative set of the PDBTM database. According to the benchmark test the performance of TMFoldRec is about 77 % in correctly predicting fold class for a given transmembrane protein sequence. An intuitive web interface has been developed for the recently published TMFoldRec algorithm. The query sequence goes through a pipeline of topology prediction and a systematic sequence to structure alignment (threading). Resulting templates are ordered by energy and reliability values and are colored according to their significance level. Besides the graphical interface, a programmatic access is available as well, via a direct interface for developers or for submitting genome-wide data sets. The TMFoldWeb web server is unique and currently the only web server that is able to predict the fold class of transmembrane proteins while assigning reliability scores for the prediction. This method is prepared for genome-wide analysis with its easy-to-use interface, informative result page and programmatic access. Considering the info-communication evolution in the last few years, the developed web server, as well as the molecule viewer, is responsive and fully compatible with the prevalent tablets and mobile devices.
TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts

Energy Technology Data Exchange (ETDEWEB)

Shen Yang; Delaglio, Frank [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States); Cornilescu, Gabriel [National Magnetic Resonance Facility (United States); Bax, Ad [National Institutes of Health, Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases (United States)], E-mail: bax@nih.gov

2009-08-15

NMR chemical shifts in proteins depend strongly on local structure. The program TALOS establishes an empirical relation between ¹³C, ¹⁵N and ¹H chemical shifts and backbone torsion angles φ and ψ (Cornilescu et al. J Biomol NMR 13 289-302, 1999). Extension of the original 20-protein database to 200 proteins increased the fraction of residues for which backbone angles could be predicted from 65 to 74%, while reducing the error rate from 3 to 2.5%. Addition of a two-layer neural network filter to the database fragment selection process forms the basis for a new program, TALOS+, which further enhances the prediction rate to 88.5%, without increasing the error rate. Excluding the 2.5% of residues for which TALOS+ makes predictions that strongly differ from those observed in the crystalline state, the accuracy of predicted φ and ψ angles equals ±13°. Large discrepancies between predictions and crystal structures are primarily limited to loop regions, and for the few cases where multiple X-ray structures are available such residues are often found in different states in the different structures. The TALOS+ output includes predictions for individual residues with missing chemical shifts, and the neural network component of the program also predicts secondary structure with good accuracy.
An efficient heuristic method for active feature acquisition and its application to protein-protein interaction prediction

Directory of Open Access Journals (Sweden)

Thahir Mohamed

2012-11-01

Full Text Available Abstract Background Machine learning approaches for classification learn the pattern of the feature space of different classes, or learn a boundary that separates the feature space into different classes. The features of the data instances are usually available, and it is only the class-labels of the instances that are unavailable. For example, to classify text documents into different topic categories, the words in the documents are features and they are readily available, whereas the topic is what is predicted. However, in some domains obtaining features may be resource-intensive, so not all features may be available. An example is protein-protein interaction (PPI) prediction, where not only are the labels ('interacting' or 'non-interacting') unavailable, but so are some of the features. It may be possible to obtain at least some of the missing features by carrying out a few experiments as permitted by the available resources. If only a few experiments can be carried out to acquire missing features, which proteins should be studied and which features of those proteins should be determined? From the perspective of machine learning for PPI prediction, it would be desirable to acquire those features that, when used in training the classifier, improve its accuracy the most. That is, the utility of the feature-acquisition is measured in terms of how much acquired features contribute to improving the accuracy of the classifier. Active feature acquisition (AFA) is a strategy to preselect such instance-feature combinations (i.e. protein and experiment combinations) for maximum utility. The goal of AFA is the creation of an optimal training set that would result in the best classifier, and not in determining the best classification model itself. Results We present a heuristic method for active feature acquisition to calculate the utility of acquiring a missing feature. This heuristic takes into account the change in
Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network

Directory of Open Access Journals (Sweden)

Buzhong Zhang

2018-05-01

Full Text Available Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibility, which is based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator was proposed when bidirectional information from hidden nodes was merged for outputs. Three types of merging operators were used in our improved model, with a long short-term memory network performing as a hidden computing node. The trained database was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features including position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding were used to represent a residue. Using this method, predictive values of continuous relative solvent-accessible area were obtained, and then, these values were transformed into binary states with predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson’s correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset.
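The two quality measures reported in this abstract (mean absolute error and Pearson's correlation coefficient) and the conversion of continuous values into binary states can be sketched as follows. The 0.25 threshold and the state names "buried" and "exposed" are assumptions for illustration; the abstract only says that predefined thresholds were used.

```python
import math
from typing import List, Sequence


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def pearson(pred: Sequence[float], true: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(true)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in true))
    return cov / (sd_p * sd_t)


def to_binary_states(rsa: Sequence[float], threshold: float = 0.25) -> List[str]:
    """Map relative solvent accessibility to two states.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return ["exposed" if value >= threshold else "buried" for value in rsa]


if __name__ == "__main__":
    predicted = [0.1, 0.4, 0.8]
    observed = [0.2, 0.3, 0.9]
    print(mean_absolute_error(predicted, observed))
    print(pearson(predicted, observed))
    print(to_binary_states(predicted))
```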
Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network.

Science.gov (United States)

Zhang, Buzhong; Li, Linqing; Lü, Qiang

2018-05-25

Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibility, which is based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator was proposed when bidirectional information from hidden nodes was merged for outputs. Three types of merging operators were used in our improved model, with a long short-term memory network performing as a hidden computing node. The trained database was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features including position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding were used to represent a residue. Using this method, predictive values of continuous relative solvent-accessible area were obtained, and then, these values were transformed into binary states with predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson's correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset.
A Bipartite Network-based Method for Prediction of Long Non-coding RNA–protein Interactions

Directory of Open Access Journals (Sweden)

Mengqu Ge

2016-02-01

Full Text Available As one large class of non-coding RNAs (ncRNAs), long ncRNAs (lncRNAs) have gained considerable attention in recent years. Mutations and dysfunction of lncRNAs have been implicated in human disorders. Many lncRNAs exert their effects through interactions with the corresponding RNA-binding proteins. Several computational approaches have been developed, but only a few are able to predict these interactions from a network-based point of view. Here, we introduce a computational method named lncRNA–protein bipartite network inference (LPBNI). LPBNI aims to identify potential lncRNA–interacting proteins by making full use of the known lncRNA–protein interactions. A leave-one-out cross validation (LOOCV) test shows that LPBNI significantly outperforms other network-based methods, including random walk (RWR) and protein-based collaborative filtering (ProCF). Furthermore, a case study was performed to demonstrate the performance of LPBNI using real data in predicting potential lncRNA–interacting proteins.
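The abstract does not state the inference rule that LPBNI applies to the bipartite network. As an assumption, the sketch below uses a generic two-step resource-allocation scheme that is common in network-based inference: resource placed on the proteins known to bind a query lncRNA flows to all lncRNAs sharing those proteins, then back to proteins, and unlinked proteins are ranked by the resource they receive. The adjacency matrix and function names are hypothetical.

```python
from typing import List, Tuple


def allocate(adj: List[List[int]], query: int) -> List[float]:
    """Two-step resource allocation on a lncRNA x protein 0/1 matrix.

    adj[l][p] == 1 means lncRNA l is known to interact with protein p.
    Returns the final resource on each protein for the query lncRNA.
    """
    n_lnc, n_prot = len(adj), len(adj[0])
    prot_deg = [sum(adj[l][p] for l in range(n_lnc)) for p in range(n_prot)]
    lnc_deg = [sum(row) for row in adj]

    # Step 1: each protein linked to the query splits one unit of resource
    # equally among the lncRNAs it interacts with.
    lnc_res = [
        sum(adj[l][p] * adj[query][p] / prot_deg[p]
            for p in range(n_prot) if prot_deg[p])
        for l in range(n_lnc)
    ]
    # Step 2: each lncRNA splits what it received equally among its proteins.
    return [
        sum(adj[l][p] * lnc_res[l] / lnc_deg[l]
            for l in range(n_lnc) if lnc_deg[l])
        for p in range(n_prot)
    ]


def rank_candidates(adj: List[List[int]], query: int) -> List[Tuple[float, int]]:
    """Proteins not yet linked to the query, highest final resource first."""
    scores = allocate(adj, query)
    return sorted(((s, p) for p, s in enumerate(scores) if not adj[query][p]),
                  reverse=True)


if __name__ == "__main__":
    toy = [[1, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
    print(rank_candidates(toy, query=0))
```

In a leave-one-out test of the kind the abstract mentions, one known interaction would be set to 0 in the matrix and the rank of the removed protein among the candidates recorded.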
Urea recycling contributes to nitrogen retention in calves fed milk replacer and low-protein solid feed

DEFF Research Database (Denmark)

Berends, Harma; van den Borne, Joost J G C; Røjen, Betina A.

2014-01-01

Urea recycling, with urea originating from catabolism of amino acids and hepatic detoxification of ammonia, is particularly relevant for ruminant animals, in which microbial protein contributes substantially to the metabolizable protein supply. However, the quantitative contribution of urea recycling to protein anabolism in calves during the transition from preruminants (milk-fed calves) to ruminants [solid feed (SF)-fed calves] is unknown. The aim of this study was to quantify urea recycling in milk-fed calves when provided with low-protein SF. Forty-eight calves [164 ± 1.6 kg body weight (BW)] were assigned to 1 of 4 SF levels [0, 9, 18, and 27 g of dry matter (DM) SF · kg BW^-0.75 · d^-1] provided in addition to an identical amount of milk replacer. Urea recycling was quantified after a 24-h intravenous infusion of [15N2]urea by analyzing urea isotopomers in 68-h fecal and urinary...
Simplified Method for Predicting a Functional Class of Proteins in Transcription Factor Complexes

KAUST Repository

Piatek, Marek J.; Schramm, Michael C.; Burra, Dharani Dhar; BinShbreen, Abdulaziz; Jankovic, Boris R.; Chowdhary, Rajesh; Archer, John A.C.; Bajic, Vladimir B.

2013-01-01

initiation. Such information is not fully available, since not all proteins that act as TFs or TcoFs are yet annotated as such, due to generally partial functional annotation of proteins. In this study we have developed a method to predict, using only
DeepCNF-D: Predicting Protein Order/Disorder Regions by Weighted Deep Convolutional Neural Fields

Directory of Open Access Journals (Sweden)

Sheng Wang

2015-07-01

Full Text Available Intrinsically disordered proteins or protein regions are involved in key biological processes including regulation of transcription, signal transduction, and alternative splicing. Accurately predicting order/disorder regions ab initio from the protein sequence is a prerequisite step for further analysis of functions and mechanisms for these disordered regions. This work presents a learning method, weighted DeepCNF (Deep Convolutional Neural Fields), to improve the accuracy of order/disorder prediction by exploiting the long-range sequential information and the interdependency between adjacent order/disorder labels and by assigning different weights for each label during training and prediction to solve the label imbalance issue. Evaluated by the CASP9 and CASP10 targets, our method obtains 0.855 and 0.898 AUC values, which are higher than the state-of-the-art single ab initio predictors.
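Two ideas from this abstract lend themselves to a short illustration: weighting labels to counter imbalance, and scoring a predictor by AUC. The sketch below assumes inverse-frequency weights, which is one common choice and not necessarily the weighting used in weighted DeepCNF; both function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def inverse_frequency_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Weight each label by n / (k * count), so rarer labels weigh more.

    This weighting rule is an assumption for illustration.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive residue gets the higher score (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # 1 = disordered residue, 0 = ordered residue (toy data).
    print(inverse_frequency_weights([0, 0, 0, 1]))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

AUC is a natural metric here because, unlike plain accuracy, it is not inflated when most residues belong to the ordered class.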
Silencing of Soybean Raffinose Synthase Gene Reduced Raffinose Family Oligosaccharides and Increased True Metabolizable Energy of Poultry Feed

Directory of Open Access Journals (Sweden)

Michelle F. Valentine

2017-05-01

Full Text Available Soybean [Glycine max (L.) Merr.] is the number one oil and protein crop in the United States, but the seed contains several anti-nutritional factors that are toxic to both humans and livestock. RNA interference technology has become an increasingly popular technique in gene silencing because it allows for both temporal and spatial targeting of specific genes. The objective of this research is to use RNA-mediated gene silencing to down-regulate the soybean gene raffinose synthase 2 (RS2), to reduce total raffinose content in mature seed. Raffinose is a trisaccharide that is indigestible to humans and monogastric animals, and as monogastric animals are the largest consumers of soy products, reducing raffinose would improve the nutritional quality of soybean. An RNAi construct targeting RS2 was designed, cloned, and transformed to the soybean genome via Agrobacterium-mediated transformation. Resulting plants were analyzed for the presence and number of copies of the transgene by PCR and Southern blot. The efficiency of mRNA silencing was confirmed by real-time quantitative PCR. Total raffinose content was determined by HPLC analysis. Transgenic plant lines were recovered that exhibited dramatically reduced levels of raffinose in mature seed, and these lines were further analyzed for other phenotypes such as development and yield. Additionally, a precision-fed rooster assay was conducted to measure the true metabolizable energy (TME) in full-fat soybean meal made from the wild-type or transgenic low-raffinose soybean lines. Transgenic low-raffinose soy had a measured TME of 2,703 kcal/kg, an increase as compared with 2,411 kcal/kg for wild-type. As low digestible energy is a major limiting factor in the percent of soybean meal that can be used in poultry diets, these results may substantiate the use of higher concentrations of low-raffinose, full-fat soy in formulated livestock diets.
Silencing of Soybean Raffinose Synthase Gene Reduced Raffinose Family Oligosaccharides and Increased True Metabolizable Energy of Poultry Feed

Science.gov (United States)

Valentine, Michelle F.; De Tar, Joann R.; Mookkan, Muruganantham; Firman, Jeffre D.; Zhang, Zhanyuan J.

2017-01-01

Soybean [Glycine max (L.) Merr.] is the number one oil and protein crop in the United States, but the seed contains several anti-nutritional factors that are toxic to both humans and livestock. RNA interference technology has become an increasingly popular technique in gene silencing because it allows for both temporal and spatial targeting of specific genes. The objective of this research is to use RNA-mediated gene silencing to down-regulate the soybean gene raffinose synthase 2 (RS2), to reduce total raffinose content in mature seed. Raffinose is a trisaccharide that is indigestible to humans and monogastric animals, and as monogastric animals are the largest consumers of soy products, reducing raffinose would improve the nutritional quality of soybean. An RNAi construct targeting RS2 was designed, cloned, and transformed to the soybean genome via Agrobacterium-mediated transformation. Resulting plants were analyzed for the presence and number of copies of the transgene by PCR and Southern blot. The efficiency of mRNA silencing was confirmed by real-time quantitative PCR. Total raffinose content was determined by HPLC analysis. Transgenic plant lines were recovered that exhibited dramatically reduced levels of raffinose in mature seed, and these lines were further analyzed for other phenotypes such as development and yield. Additionally, a precision-fed rooster assay was conducted to measure the true metabolizable energy (TME) in full-fat soybean meal made from the wild-type or transgenic low-raffinose soybean lines. Transgenic low-raffinose soy had a measured TME of 2,703 kcal/kg, an increase as compared with 2,411 kcal/kg for wild-type. As low digestible energy is a major limiting factor in the percent of soybean meal that can be used in poultry diets, these results may substantiate the use of higher concentrations of low-raffinose, full-fat soy in formulated livestock diets. PMID:28559898
Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids.

Science.gov (United States)

Raicar, Gaurav; Saini, Harsh; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

2016-08-07

Predicting the three-dimensional (3-D) structure of a protein is an important task in the field of bioinformatics and biological sciences. However, directly predicting the 3-D structure from the primary structure is hard to achieve. Therefore, predicting the fold or structural class of a protein sequence is generally used as an intermediate step in determining the protein's 3-D structure. For protein fold recognition (PFR) and structural class prediction (SCP), two steps are required - feature extraction step and classification step. Feature extraction techniques generally utilize syntactical-based information, evolutionary-based information and physicochemical-based information to extract features. In this study, we explore the importance of utilizing the physicochemical properties of amino acids for improving PFR and SCP accuracies. For this, we propose a Forward Consecutive Search (FCS) scheme which aims to strategically select physicochemical attributes that will supplement the existing feature extraction techniques for PFR and SCP. An exhaustive search is conducted on all the existing 544 physicochemical attributes using the proposed FCS scheme and a subset of physicochemical attributes is identified. Features extracted from these selected attributes are then combined with existing syntactical-based and evolutionary-based features, to show an improvement in the recognition and prediction performance on benchmark datasets.
Prediction of backbone dihedral angles and protein secondary structure using support vector machines

Directory of Open Access Journals (Sweden)

Hirst Jonathan D

2009-12-01

Full Text Available Abstract Background The prediction of the secondary structure of a protein is a critical step in the prediction of its tertiary structure and, potentially, its function. Moreover, the backbone dihedral angles, highly correlated with secondary structures, provide crucial information about the local three-dimensional structure. Results We predict independently both the secondary structure and the backbone dihedral angles and combine the results in a loop to enhance each prediction reciprocally. Support vector machines, a state-of-the-art supervised classification technique, achieve secondary structure predictive accuracy of 80% on a non-redundant set of 513 proteins, significantly higher than other methods on the same dataset. The dihedral angle space is divided into a number of regions using two unsupervised clustering techniques in order to predict the region in which a new residue belongs. The performance of our method is comparable to, and in some cases more accurate than, other multi-class dihedral prediction methods. Conclusions We have created an accurate predictor of backbone dihedral angles and secondary structure. Our method, called DISSPred, is available online at http://comp.chem.nottingham.ac.uk/disspred/.
Protein construct storage: Bayesian variable selection and prediction with mixtures.

Science.gov (United States)

Clyde, M A; Parmigiani, G

1998-07-01

Determining optimal conditions for protein storage while maintaining a high level of protein activity is an important question in pharmaceutical research. A designed experiment based on a space-filling design was conducted to understand the effects of factors affecting protein storage and to establish optimal storage conditions. Different model-selection strategies to identify important factors may lead to very different answers about optimal conditions. Uncertainty about which factors are important, or model uncertainty, can be a critical issue in decision-making. We use Bayesian variable selection methods for linear models to identify important variables in the protein storage data, while accounting for model uncertainty. We also use the Bayesian framework to build predictions based on a large family of models, rather than an individual model, and to evaluate the probability that certain candidate storage conditions are optimal.
PROCARB: A Database of Known and Modelled Carbohydrate-Binding Protein Structures with Sequence-Based Prediction Tools

Directory of Open Access Journals (Sweden)

Adeel Malik

2010-01-01

Full Text Available Understanding of the three-dimensional structures of proteins that interact with carbohydrates covalently (glycoproteins) as well as noncovalently (protein-carbohydrate complexes) is essential to many biological processes and plays a significant role in normal and disease-associated functions. It is important to have a central repository of knowledge available about these protein-carbohydrate complexes as well as preprocessed data of predicted structures. This can be significantly enhanced by de novo tools which can predict carbohydrate-binding sites for proteins in the absence of structure of experimentally known binding site. PROCARB is an open-access database comprising three independently working components, namely, (i) Core PROCARB module, consisting of three-dimensional structures of protein-carbohydrate complexes taken from Protein Data Bank (PDB), (ii) Homology Models module, consisting of manually developed three-dimensional models of N-linked and O-linked glycoproteins of unknown three-dimensional structure, and (iii) CBS-Pred prediction module, consisting of web servers to predict carbohydrate-binding sites using single sequence or server-generated PSSM. Several precomputed structural and functional properties of complexes are also included in the database for quick analysis. In particular, information about function, secondary structure, solvent accessibility, hydrogen bonds and literature reference, and so forth, is included. In addition, each protein in the database is mapped to Uniprot, Pfam, PDB, and so forth.
Critical assessment of methods of protein structure prediction (CASP) - round x

KAUST Repository

Moult, John; Fidelis, Krzysztof; Kryshtafovych, Andriy; Schwede, Torsten; Tramontano, Anna

2013-01-01

This article is an introduction to the special issue of the journal PROTEINS, dedicated to the tenth Critical Assessment of Structure Prediction (CASP) experiment to assess the state of the art in protein structure modeling. The article describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. The 10 CASP experiments span almost 20 years of progress in the field of protein structure modeling, and there have been enormous advances in methods and model accuracy in that period. Notable in this round is the first sustained improvement of models with refinement methods, using molecular dynamics. For the first time, we tested the ability of modeling methods to make use of sparse experimental three-dimensional contact information, such as may be obtained from new experimental techniques, with encouraging results. On the other hand, new contact prediction methods, though holding considerable promise, have yet to make an impact in CASP testing. The nature of CASP targets has been changing in recent CASPs, reflecting shifts in experimental structural biology, with more irregular structures, more multi-domain and multi-subunit structures, and less standard versions of known folds. When allowance is made for these factors, we continue to see steady progress in the overall accuracy of models, particularly resulting from improvement of non-template regions.
Critical assessment of methods of protein structure prediction (CASP) - round x

KAUST Repository

Moult, John

2013-12-17

This article is an introduction to the special issue of the journal PROTEINS, dedicated to the tenth Critical Assessment of Structure Prediction (CASP) experiment to assess the state of the art in protein structure modeling. The article describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. The 10 CASP experiments span almost 20 years of progress in the field of protein structure modeling, and there have been enormous advances in methods and model accuracy in that period. Notable in this round is the first sustained improvement of models with refinement methods, using molecular dynamics. For the first time, we tested the ability of modeling methods to make use of sparse experimental three-dimensional contact information, such as may be obtained from new experimental techniques, with encouraging results. On the other hand, new contact prediction methods, though holding considerable promise, have yet to make an impact in CASP testing. The nature of CASP targets has been changing in recent CASPs, reflecting shifts in experimental structural biology, with more irregular structures, more multi-domain and multi-subunit structures, and less standard versions of known folds. When allowance is made for these factors, we continue to see steady progress in the overall accuracy of models, particularly resulting from improvement of non-template regions.
DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

Science.gov (United States)

Ma, Xin; Guo, Jing; Sun, Xiao

2016-01-01

DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
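The four figures quoted in this abstract (accuracy, sensitivity, specificity and the Matthews correlation coefficient) all follow from a binary confusion matrix. A minimal sketch of the standard definitions is given below; the counts in the example are invented and are not the DNABP benchmark data.

```python
import math
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of true binders recovered
    specificity = tn / (tn + fp)   # fraction of non-binders rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy counts: 100 DNA-binding and 100 non-binding proteins.
    for name, value in classification_metrics(tp=80, fp=10, tn=90, fn=20).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the binding and non-binding classes differ in size.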

Intrinsic disorder in Viral Proteins Genome-Linked: experimental and predictive analyses

Directory of Open Access Journals (Sweden)

Van Dorsselaer Alain

2009-02-01

Full Text Available Abstract Background VPgs are viral proteins linked to the 5' end of some viral genomes. Interactions between several VPgs and eukaryotic translation initiation factors eIF4Es are critical for plant infection. However, VPgs are not restricted to phytoviruses, being also involved in genome replication and protein translation of several animal viruses. To date, structural data are still limited to small picornaviral VPgs. Recently three phytoviral VPgs were shown to be natively unfolded proteins. Results In this paper, we report the bacterial expression, purification and biochemical characterization of two phytoviral VPgs, namely the VPgs of Rice yellow mottle virus (RYMV, genus Sobemovirus) and Lettuce mosaic virus (LMV, genus Potyvirus). Using far-UV circular dichroism and size exclusion chromatography, we show that RYMV and LMV VPgs are predominantly or partly unstructured in solution, respectively. Using several disorder predictors, we show that both proteins are predicted to possess disordered regions. We next extend these results to 14 VPgs representative of the viral diversity. Disordered regions were predicted in all VPg sequences whatever the genus and the family. Conclusion Based on these results, we propose that intrinsic disorder is a common feature of VPgs. The functional role of intrinsic disorder is discussed in light of the biological roles of VPgs.
Simplified Method for Predicting a Functional Class of Proteins in Transcription Factor Complexes

KAUST Repository

Piatek, Marek J.

2013-07-12

Background: Initiation of transcription is essential for most of the cellular responses to environmental conditions and for cell and tissue specificity. This process is regulated through numerous proteins, their ligands and mutual interactions, as well as interactions with DNA. The key such regulatory proteins are transcription factors (TFs) and transcription co-factors (TcoFs). TcoFs are important since they modulate the transcription initiation process through interaction with TFs. In eukaryotes, transcription requires that TFs form different protein complexes with various nuclear proteins. To better understand transcription regulation, it is important to know the functional class of proteins interacting with TFs during transcription initiation. Such information is not fully available, since not all proteins that act as TFs or TcoFs are yet annotated as such, due to generally partial functional annotation of proteins. In this study we have developed a method to predict, using only sequence composition of the interacting proteins, the functional class of human TF binding partners to be (i) TF, (ii) TcoF, or (iii) other nuclear protein. This allows for complementing the annotation of the currently known pool of nuclear proteins. Since only the knowledge of protein sequences is required in addition to protein interaction, the method should be easily applicable to many species. Results: Based on experimentally validated interactions between human TFs with different TFs, TcoFs and other nuclear proteins, our two classification systems (implemented as a web-based application) achieve high accuracies in distinguishing TFs and TcoFs from other nuclear proteins, and TFs from TcoFs respectively. Conclusion: As demonstrated, given the fact that two proteins are capable of forming direct physical interactions and using only information about their sequence composition, we have developed a completely new method for predicting a functional class of TF interacting protein partners
Expected packing density allows prediction of both amyloidogenic and disordered regions in protein chains

Energy Technology Data Exchange (ETDEWEB)

Galzitskaya, Oxana V; Garbuzynskiy, Sergiy O; Lobanov, Michail Yu [Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region (Russian Federation)

2007-07-18

The determination of factors that influence conformational changes in proteins is very important for the identification of potentially amyloidogenic and disordered regions in polypeptide chains. In our work we introduce a new parameter, mean packing density, to detect both amyloidogenic and disordered regions in a protein sequence. It has been shown that regions with strong expected packing density are responsible for amyloid formation. Our predictions are consistent with known disease-related amyloidogenic regions for 9 of 12 amyloid-forming proteins and peptides in which the positions of amyloidogenic regions have been revealed experimentally. Our findings support the concept that the mechanism of formation of amyloid fibrils is similar for different peptides and proteins. Moreover, we have demonstrated that regions with weak expected packing density are responsible for the appearance of disordered regions. Our method has been tested on datasets of globular proteins and long disordered protein segments, and it shows improved performance over other widely used methods. Thus, we demonstrate that the expected packing density is a useful value for predicting both disordered and amyloidogenic regions of a protein based on sequence alone. Our results are important for understanding the structural characteristics of protein folding and misfolding.
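The idea in this abstract, averaging a per-residue expected packing density along the chain and flagging high-scoring stretches as amyloidogenic and low-scoring stretches as disordered, can be sketched as a sliding-window profile. The window size, both thresholds and the input values below are hypothetical; the published scale and cut-offs are not given in the abstract.

```python
from typing import List, Sequence


def window_means(values: Sequence[float], window: int) -> List[float]:
    """Centered sliding-window mean; windows are truncated at the chain ends."""
    half = window // 2
    profile = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def label_regions(profile: Sequence[float], high: float, low: float) -> List[str]:
    """Label each position from its smoothed expected packing density.

    `high` and `low` are illustrative thresholds, not published values.
    """
    return ["amyloidogenic" if v >= high
            else "disordered" if v <= low
            else "neutral"
            for v in profile]


if __name__ == "__main__":
    # Invented per-residue expected packing densities for a 6-residue chain.
    densities = [10, 10, 10, 30, 30, 30]
    smoothed = window_means(densities, window=3)
    print(smoothed)
    print(label_regions(smoothed, high=25, low=12))
```

Smoothing over a window reflects that both properties belong to regions of the chain, so a single unusual residue should not trigger a label on its own.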
Comparative genomics and disorder prediction identify biologically relevant SH3 protein interactions.

Directory of Open Access Journals (Sweden)

Pedro Beltrao

2005-08-01

Protein interaction networks are an important part of the post-genomic effort to integrate a part-list view of the cell into system-level understanding. Using a set of 11 yeast genomes we show that combining comparative genomics and secondary structure information greatly increases consensus-based prediction of SH3 targets. Benchmarking of our method against positive and negative standards gave 83% accuracy with 26% coverage. The concept of an optimal divergence time for effective comparative genomics studies was analyzed, demonstrating that genomes of species that diverged very recently from Saccharomyces cerevisiae (S. mikatae, S. bayanus, and S. paradoxus), or a long time ago (Neurospora crassa and Schizosaccharomyces pombe), contain less information for accurate prediction of SH3 targets than species within the optimal divergence time proposed. We also show here that intrinsically disordered SH3 domain targets are more probable sites of interaction than equivalent sites within ordered regions. Our findings highlight several novel S. cerevisiae SH3 protein interactions, the value of selection of optimal divergence times in comparative genomics studies, and the importance of intrinsic disorder for protein interactions. Based on our results we propose novel roles for the S. cerevisiae proteins Abp1p in endocytosis and Hse1p in endosome protein sorting.
Modeling heterogeneous (co)variances from adjacent-SNP groups improves genomic prediction for milk protein composition traits

DEFF Research Database (Denmark)

Gebreyesus, Grum; Lund, Mogens Sandø; Buitenhuis, Albert Johannes

2017-01-01

Accurate genomic prediction requires a large reference population, which is problematic for traits that are expensive to measure. Traits related to milk protein composition are not routinely recorded due to costly procedures and are considered to be controlled by a few quantitative trait loci...... of large effect. The amount of variation explained may vary between regions leading to heterogeneous (co)variance patterns across the genome. Genomic prediction models that can efficiently take such heterogeneity of (co)variances into account can result in improved prediction reliability. In this study, we...... developed and implemented novel univariate and bivariate Bayesian prediction models, based on estimates of heterogeneous (co)variances for genome segments (BayesAS). Available data consisted of milk protein composition traits measured on cows and de-regressed proofs of total protein yield derived for bulls...
Prediction of mutational tolerance in HIV-1 protease and reverse transcriptase using flexible backbone protein design.

Directory of Open Access Journals (Sweden)

Elisabeth Humphris-Narayanan

Predicting which mutations proteins tolerate while maintaining their structure and function has important applications for modeling fundamental properties of proteins and their evolution; it also drives progress in protein design. Here we develop a computational model to predict the tolerated sequence space of HIV-1 protease reachable by single mutations. We assess the model by comparison to the observed variability in more than 50,000 HIV-1 protease sequences, one of the most comprehensive datasets on tolerated sequence space. We then extend the model to a second protein, reverse transcriptase. The model integrates multiple structural and functional constraints acting on a protein and uses ensembles of protein conformations. We find the model correctly captures a considerable fraction of protease and reverse-transcriptase mutational tolerance and shows comparable accuracy using either experimentally determined or computationally generated structural ensembles. Predictions of tolerated sequence space afforded by the model provide insights into stability-function tradeoffs in the emergence of resistance mutations and into strengths and limitations of the computational model.
PredMP: A Web Resource for Computationally Predicted Membrane Proteins via Deep Learning

KAUST Repository

Wang, Sheng

2018-02-06

Experimental determination of membrane protein (MP) structures is challenging as they are often too large for nuclear magnetic resonance (NMR) experiments and difficult to crystallize. Currently there are only about 510 non-redundant MPs with solved structures in the Protein Data Bank (PDB). To elucidate the MP structures computationally, we developed a novel web resource, denoted as PredMP (http://52.87.130.56:3001/#/proteinindex), that delivers one-dimensional (1D) annotation of the membrane topology and secondary structure, two-dimensional (2D) prediction of the contact/distance map, together with three-dimensional (3D) modeling of the MP structure in the lipid bilayer, for each MP target from a given model organism. The precision of the computationally constructed MP structures is leveraged by state-of-the-art deep learning methods as well as cutting-edge modeling strategies. In particular, (i) we annotate 1D properties via DeepCNF (Deep Convolutional Neural Fields), which models not only the complex sequence-structure relationship but also the interdependency between adjacent property labels; (ii) we predict the 2D contact/distance map through Deep Transfer Learning, which learns the patterns as well as the complex relationship between contacts/distances and protein features from non-membrane proteins; and (iii) we model the 3D structure by feeding its predicted contacts and secondary structure to the Crystallography & NMR System (CNS) suite combined with a membrane burial potential that is residue-specific and depth-dependent. PredMP currently contains more than 2,200 multi-pass transmembrane proteins (length < 700 residues) from human. These transmembrane proteins are classified according to the IUPHAR/BPS Guide, which provides a hierarchical organization of receptors, channels, transporters, enzymes and other drug targets according to their molecular relationships and physiological functions. Among these MPs, we estimated that our approach could predict correct folds for 1
Proteochemometric model for predicting the inhibition of penicillin-binding proteins

Science.gov (United States)

Nabu, Sunanta; Nantasenamat, Chanin; Owasirikul, Wiwat; Lawung, Ratana; Isarankura-Na-Ayudhya, Chartchalerm; Lapins, Maris; Wikberg, Jarl E. S.; Prachayasittikul, Virapong

2015-02-01

Neisseria gonorrhoeae infection threatens to become an untreatable sexually transmitted disease in the near future owing to the increasing emergence of N. gonorrhoeae strains with reduced susceptibility and resistance to the extended-spectrum cephalosporins (ESCs), i.e. ceftriaxone and cefixime, which are the last remaining option for first-line treatment of gonorrhea. Alteration of the penA gene, encoding penicillin-binding protein 2 (PBP2), is the main mechanism conferring penicillin resistance including reduced susceptibility and resistance to ESCs. To predict and investigate putative amino acid mutations causing β-lactam resistance particularly for ESCs, we applied proteochemometric modeling to generalize N. gonorrhoeae susceptibility data for predicting the interaction of PBP2 with therapeutic β-lactam antibiotics. This was afforded by correlating publicly available data on antimicrobial susceptibility of wild-type and mutant N. gonorrhoeae strains for penicillin-G, cefixime and ceftriaxone with 50 PBP2 protein sequence data using partial least-squares projections to latent structures. The generated model revealed excellent predictability (R² = 0.91, Q² = 0.77, Q²Ext = 0.78). Moreover, our model identified amino acid mutations in PBP2 with the highest impact on antimicrobial susceptibility and provided information on physicochemical properties of amino acid mutations affecting antimicrobial susceptibility. Our model thus provided insight into the physicochemical basis for resistance development in PBP2 suggesting its use for predicting and monitoring novel PBP2 mutations that may emerge in the future.
Presep: predicting the propensity of a protein being secreted into the supernatant when expressed in Pichia pastoris.

Directory of Open Access Journals (Sweden)

Jian Tian

Pichia pastoris is commonly used for the production of recombinant proteins due to its preferential secretion of recombinant proteins, resulting in lower production costs and increased yields of target proteins. However, not all recombinant proteins can be successfully secreted in P. pastoris. A computational method that predicts the likelihood of a protein being secreted into the supernatant would be of considerable value; however, to the best of our knowledge, no such tool has yet been developed. We present a machine-learning approach called Presep to assess the likelihood of a recombinant protein being secreted by P. pastoris based on its pseudo amino acid composition (PseAA). Using a 20-fold cross validation, Presep demonstrated a high degree of accuracy, with Matthews correlation coefficient (MCC) and overall accuracy (Q2) scores of 0.78 and 95%, respectively. Computational results were validated experimentally, with six β-galactosidase genes expressed in P. pastoris strain GS115 to verify Presep model predictions. A strong correlation (R² = 0.967) was observed between the Presep-predicted secretion propensity and the experimental secretion percentage. Together, these results demonstrate the ability of the Presep model to predict the secretion propensity of P. pastoris for a given protein. This model may serve as a valuable tool for determining the utility of P. pastoris as a host organism prior to initiating biological experiments. The Presep prediction tool can be freely downloaded at http://www.mobioinfor.cn/Presep.
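For readers unfamiliar with the two scores quoted in the Presep abstract, the following minimal sketch shows how the Matthews correlation coefficient and overall accuracy are conventionally computed from a binary confusion matrix. The counts in the usage example are made up for illustration and are not taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def overall_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct (often written Q2)."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical counts: 40 secreted proteins predicted correctly, and so on.
    print(mcc(40, 50, 5, 5), overall_accuracy(40, 50, 5, 5))
```

MCC stays informative when the two classes are unbalanced, which is why it is often reported alongside plain accuracy.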
Predicting Structure and Function for Novel Proteins of an Extremophilic Iron Oxidizing Bacterium

Science.gov (United States)

Wheeler, K.; Zemla, A.; Banfield, J.; Thelen, M.

2007-12-01

Proteins isolated from uncultivated microbial populations represent the functional components of microbial processes and contribute directly to community fitness under natural conditions. Investigations into proteins in the environment are hindered by the lack of genome data, or where available, the high proportion of proteins of unknown function. We have identified thousands of proteins from biofilms in the extremely acidic drainage outflow of an iron mine ecosystem (1). With an extensive genomic and proteomic foundation, we have focused directly on the problem of several hundred proteins of unknown function within this well-defined model system. Here we describe the geobiological insights gained by using a high throughput computational approach for predicting structure and function of 421 novel proteins from the biofilm community. We used a homology based modeling system to compare these proteins to those of known structure (AS2TS) (2). This approach has resulted in the assignment of structures to 360 proteins (85%) and provided functional information for up to 75% of the modeled proteins. Detailed examination of the modeling results enables confident, high-throughput prediction of the roles of many of the novel proteins within the microbial community. For instance, one prediction places a protein in the phosphoenolpyruvate/pyruvate domain superfamily as a carboxylase that fills in a gap in an otherwise complete carbon cycle. Particularly important for a community in such a metal rich environment is the evolution of over 25% of the novel proteins that contain a metal cofactor; of these, one third are likely Fe containing proteins. Two of the most abundant proteins in biofilm samples are unusual c-type cytochromes. Both of these proteins catalyze iron oxidation, a key metabolic reaction supporting the energy requirements of this community. Structural models of these cytochromes verify our experimental results on heme binding and electron transfer reactivity, and
Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling: A CASP-CAPRI experiment

KAUST Repository

Lensink, Marc F.; Velankar, Sameer; Kryshtafovych, Andriy; Huang, Shen-You; Schneidman-Duhovny, Dina; Sali, Andrej; Segura, Joan; Fernandez-Fuentes, Narcis; Viswanath, Shruthi; Elber, Ron; Grudinin, Sergei; Popov, Petr; Neveu, Emilie; Lee, Hasup; Baek, Minkyung; Park, Sangwoo; Heo, Lim; Rie Lee, Gyu; Seok, Chaok; Qin, Sanbo; Zhou, Huan-Xiang; Ritchie, David W.; Maigret, Bernard; Devignes, Marie-Dominique; Ghoorah, Anisah; Torchala, Mieczyslaw; Chaleil, Raphaël A.G.; Bates, Paul A.; Ben-Zeev, Efrat; Eisenstein, Miriam; Negi, Surendra S.; Weng, Zhiping; Vreven, Thom; Pierce, Brian G.; Borrman, Tyler M.; Yu, Jinchao; Ochsenbein, Françoise; Guerois, Raphaël; Vangone, Anna; Rodrigues, João P.G.L.M.; van Zundert, Gydo; Nellen, Mehdi; Xue, Li; Karaca, Ezgi; Melquiond, Adrien S.J.; Visscher, Koen; Kastritis, Panagiotis L.; Bonvin, Alexandre M.J.J.; Xu, Xianjin; Qiu, Liming; Yan, Chengfei; Li, Jilong; Ma, Zhiwei; Cheng, Jianlin; Zou, Xiaoqin; Shen, Yang; Peterson, Lenna X.; Kim, Hyung-Rae; Roy, Amit; Han, Xusi; Esquivel-Rodriguez, Juan; Kihara, Daisuke; Yu, Xiaofeng; Bruce, Neil J.; Fuller, Jonathan C.; Wade, Rebecca C.; Anishchenko, Ivan; Kundrotas, Petras J.; Vakser, Ilya A.; Imai, Kenichiro; Yamada, Kazunori; Oda, Toshiyuki; Nakamura, Tsukasa; Tomii, Kentaro; Pallara, Chiara; Romero-Durana, Miguel; Jiménez-García, Brian; Moal, Iain H.; Fernández-Recio, Juan; Joung, Jong Young; Kim, Jong Yun; Joo, Keehyoung; Lee, Jooyoung; Kozakov, Dima; Vajda, Sandor; Mottarella, Scott; Hall, David R.; Beglov, Dmitri; Mamonov, Artem; Xia, Bing; Bohnuud, Tanggis; Del Carpio, Carlos A.; Ichiishi, Eichiro; Marze, Nicholas; Kuroda, Daisuke; Roy Burman, Shourya S.; Gray, Jeffrey J.; Chermak, Edrisse; Cavallo, Luigi; Oliva, Romina; Tovchigrechko, Andrey; Wodak, Shoshana J.

2016-01-01

We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy. © 2016 Wiley Periodicals, Inc.
COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming.

Science.gov (United States)

Zhang, Huiling; Huang, Qingsheng; Bei, Zhendong; Wei, Yanjie; Floudas, Christodoulos A

2016-03-01

In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM, which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP, which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on the contact definition of 14 Å between Cα-Cα atoms. First, using a rigorous leave-one-protein-out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and an MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM proteins increase. COMSAT is freely accessible at http://hpcc.siat.ac.cn/COMSAT/. © 2016 Wiley Periodicals, Inc.
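The contact definition and two of the evaluation measures quoted in the COMSAT abstract can be illustrated with a short sketch. It assumes that a contact means a Cα-Cα distance of at most 14 Å, that "at least six amino acids apart" means a sequence separation of six or more, and that accuracy and coverage are used in the usual contact-prediction sense (correct predictions over predicted contacts, and over true contacts). The function names and toy coordinates are hypothetical.

```python
import math
from typing import List, Set, Tuple

Coord = Tuple[float, float, float]
Pair = Tuple[int, int]


def true_contacts(ca_coords: List[Coord], cutoff: float = 14.0, min_sep: int = 6) -> Set[Pair]:
    """Residue pairs (i, j), j - i >= min_sep, whose C-alpha atoms lie within cutoff."""
    contacts: Set[Pair] = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def accuracy_and_coverage(predicted: Set[Pair], true: Set[Pair]) -> Tuple[float, float]:
    """Accuracy = hits / predicted contacts; coverage = hits / true contacts."""
    hits = len(predicted & true)
    accuracy = hits / len(predicted) if predicted else 0.0
    coverage = hits / len(true) if true else 0.0
    return accuracy, coverage


if __name__ == "__main__":
    # Toy chain: nine C-alpha atoms on a straight line, 2.2 Angstrom apart.
    coords = [(2.2 * i, 0.0, 0.0) for i in range(9)]
    reference = true_contacts(coords)
    print(sorted(reference))
    print(accuracy_and_coverage({(0, 6), (0, 8)}, reference))
```

A predictor that outputs only a few high-confidence pairs will tend to show high accuracy with low coverage, which matches the pattern of figures reported in the abstract.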
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction

Directory of Open Access Journals (Sweden)

Seong-Gon Kim

2011-06-01

Several techniques, such as neural networks, genetic algorithms, decision trees and other statistical or heuristic methods, have been used in the past to approach the complex non-linear task of predicting the alpha-helices, beta-sheets and turns of a protein's secondary structure. This project introduces a new machine learning method that uses offline-trained multilayer perceptrons (MLPs) as the likelihood models within a Bayesian inference framework to predict the secondary structure of proteins. Varying window sizes are used to extract neighboring amino acid information, which is passed back and forth between the neural net models and the Bayesian inference process until the posterior secondary structure probability converges.
Effects of protein restriction in utero on the metabolism of mink dams (Neovison vison) and on mink kit survival as well as on postnatal growth

DEFF Research Database (Denmark)

Vesterdorf, Kristine Høvelt; Harrison, Adrian Paul; Matthiesen, Connie Marianne Frank

2012-01-01

be determined. Mink dams were fed an adequate protein (AP; crude protein:fat:carbohydrate ratio of 31:55:14% of metabolizable energy, ME) or a low protein diet (LP; 19:49:32% of ME) during the last 21.2 ± 3.3 days of gestation, followed by an adequate diet during lactation. Respiration and balance...... experiments were performed during late gestation and twice during lactation. The dietary treatment only affected energy metabolism traits significantly during the treatment period in late gestation, such that LP dams oxidized less protein (12% vs 23% of heat production, HE, P = 0.001) but more carbohydrate...... (37% vs 26% of HE, P vs 0.4 g.kg-0.75.day-1, P vs 1.4, P vs 11.6 g...
Yeast prions and human prion-like proteins: sequence features and prediction methods.

Science.gov (United States)

Cascarina, Sean M; Ross, Eric D

2014-06-01

Prions are self-propagating infectious protein isoforms. A growing number of prions have been identified in yeast, each resulting from the conversion of soluble proteins into an insoluble amyloid form. These yeast prions have served as a powerful model system for studying the causes and consequences of prion aggregation. Remarkably, a number of human proteins containing prion-like domains, defined as domains with compositional similarity to yeast prion domains, have recently been linked to various human degenerative diseases, including amyotrophic lateral sclerosis. This suggests that the lessons learned from yeast prions may help in understanding these human diseases. In this review, we examine what has been learned about the amino acid sequence basis for prion aggregation in yeast, and how this information has been used to develop methods to predict aggregation propensity. We then discuss how this information is being applied to understand human disease, and the challenges involved in applying yeast prediction methods to higher organisms.
The Ising model for prediction of disordered residues from protein sequence alone

International Nuclear Information System (INIS)

Lobanov, Michail Yu; Galzitskaya, Oxana V

2011-01-01

Intrinsically disordered regions serve as molecular recognition elements, which play an important role in the control of many cellular processes and signaling pathways. It is useful to be able to predict positions of disordered residues and disordered regions in protein chains using protein sequence alone. A new method (IsUnstruct) based on the Ising model for prediction of disordered residues from protein sequence alone has been developed. According to this model, each residue can be in one of two states: ordered or disordered. The model is an approximation of the Ising model in which the interaction term between neighbors has been replaced by a penalty for changing between states (the energy of border). IsUnstruct has been compared with other available methods and found to perform well. The method correctly finds 77% of disordered residues as well as 87% of ordered residues in the CASP8 database, and 72% of disordered residues as well as 85% of ordered residues in the DisProt database.
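The two-state picture in the abstract above (ordered or disordered residues, with a penalty for each change of state) lends itself to a small dynamic-programming illustration. The sketch below finds the lowest-energy assignment for a chain; it is not the IsUnstruct algorithm itself, which may, for example, compute state probabilities instead. The per-residue energies and the border penalty in the example are invented values.

```python
from typing import List


def min_energy_states(disorder_energy: List[float], border_penalty: float) -> List[int]:
    """Return 0 (ordered) or 1 (disordered) per residue, minimising total energy.

    Total energy = sum of disorder_energy[i] over disordered residues
                 + border_penalty for every change of state between neighbours.
    The ordered state is taken as the zero of energy.
    """
    n = len(disorder_energy)
    if n == 0:
        return []
    # best[s]: minimal energy of residues 0..i with residue i in state s.
    best = [0.0, disorder_energy[0]]
    back = []  # back[i - 1][s]: state of residue i - 1 on the best path to s at i
    for i in range(1, n):
        site = [0.0, disorder_energy[i]]
        new_best, choice = [], []
        for s in (0, 1):
            stay = best[s]
            switch = best[1 - s] + border_penalty
            if stay <= switch:
                new_best.append(stay + site[s])
                choice.append(s)
            else:
                new_best.append(switch + site[s])
                choice.append(1 - s)
        best = new_best
        back.append(choice)
    # Trace the optimal path back from the better final state.
    state = 0 if best[0] <= best[1] else 1
    states = [state]
    for choice in reversed(back):
        state = choice[state]
        states.append(state)
    states.reverse()
    return states


if __name__ == "__main__":
    print(min_energy_states([1, 1, -2, -2, -2, 1, 1], border_penalty=1.5))
```

The border penalty acts as a smoothing term: an isolated residue with a slightly favourable disorder energy is not flagged unless the gain outweighs the cost of two state changes.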
Predicting the activity coefficients of free-solvent for concentrated globular protein solutions using independently determined physical parameters.

Directory of Open Access Journals (Sweden)

Devin W McBride

The activity coefficient is largely considered an empirical parameter that was traditionally introduced to correct the non-ideality observed in thermodynamic systems such as osmotic pressure. Here, the activity coefficient of free-solvent is related to physically realistic parameters and a mathematical expression is developed to directly predict the activity coefficients of free-solvent for aqueous protein solutions up to near-saturation concentrations. The model is based on the free-solvent model, which has previously been shown to provide excellent prediction of the osmotic pressure of concentrated and crowded globular proteins in aqueous solutions up to near-saturation concentrations. Thus, this model uses only the independently determined, physically realizable quantities: mole fraction, solvent accessible surface area, and ion binding, in its prediction. Predictions are presented for the activity coefficients of free-solvent for near-saturated protein solutions containing either bovine serum albumin or hemoglobin. As a verification step, the predictability of the model for the activity coefficient of sucrose solutions was evaluated. The predicted activity coefficients of free-solvent are compared to the calculated activity coefficients of free-solvent based on osmotic pressure data. It is observed that the predicted activity coefficients are increasingly dependent on the solute-solvent parameters as the protein concentration increases to near-saturation concentrations.
A discriminatory function for prediction of protein-DNA interactions based on alpha shape modeling.

Science.gov (United States)

Zhou, Weiqiang; Yan, Hong

2010-10-15

Protein-DNA interaction has significant importance in many biological processes. However, the underlying principle of the molecular recognition process is still largely unknown. As more high-resolution 3D structures of protein-DNA complexes are becoming available, the surface characteristics of the complex become an important research topic. In our work, we apply an alpha shape model to represent the surface structure of the protein-DNA complex and develop an interface-atom curvature-dependent conditional probability discriminatory function for the prediction of protein-DNA interaction. The interface-atom curvature-dependent formalism captures atomic interaction details better than the atomic distance-based method. The proposed method provides good performance in discriminating the native structures from the docking decoy sets, and outperforms the distance-dependent formalism in terms of the z-score. Computer experiment results show that the curvature-dependent formalism with the optimal parameters can achieve a native z-score of -8.17 in discriminating the native structure from the highest surface-complementarity scored decoy set and a native z-score of -7.38 in discriminating the native structure from the lowest RMSD decoy set. The interface-atom curvature-dependent formalism can also be used to predict the apo version of DNA-binding proteins. These results suggest that the interface-atom curvature-dependent formalism has a good prediction capability for protein-DNA interactions. The code and data sets are available for download at http://www.hy8.com/bioinformatics.htm. Contact: kenandzhou@hotmail.com.
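The native z-scores quoted in the abstract above measure how far the score of the native structure lies from the distribution of decoy scores. A minimal sketch of that calculation follows. It assumes the usual definition (native score minus decoy mean, divided by the decoy standard deviation) and uses the population standard deviation; the paper may use a different convention, and the scores in the example are invented.

```python
from statistics import mean, pstdev
from typing import Sequence


def native_z_score(native_score: float, decoy_scores: Sequence[float]) -> float:
    """Number of decoy standard deviations separating the native score from the decoy mean."""
    spread = pstdev(decoy_scores)
    if spread == 0:
        raise ValueError("decoy scores have zero spread; z-score is undefined")
    return (native_score - mean(decoy_scores)) / spread


if __name__ == "__main__":
    # Hypothetical energies: lower is better, so a good function gives a strongly negative z.
    print(native_z_score(-10.0, [-2.0, 0.0, 2.0]))
```

With an energy-like score where lower is better, a more negative z-score indicates cleaner separation of the native structure from the decoys.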
