Precision is calculated as the percentage of correct predictions over the dataset size.

== Implementation ==

The SODA web server is implemented using the REST (Representational State Transfer) architecture, allowing access from a web-based interface as well as programmatically through external APIs or third-party web services exploiting the Node.js features.

Protein solubility plays a critical part in protein homeostasis (1,2). It remains a major concern in the complete structural and functional characterization of many proteins and isolated domains (3-6). Insoluble regions in proteins tend to aggregate (2), leading to a number of diseases such as Alzheimer’s (7) and amyloidoses (8). Aggregation, as the flip side of low protein solubility, also represents a biotechnological problem. Soluble expression remains a significant bottleneck in protein production (9), and low solubility can make drugs ineffective (10) or even toxic (11). Targeted mutagenesis, generally without affecting protein structure or function, has been shown in several cases to be a valuable tool to improve protein solubility (4). Especially in the absence of structural knowledge, the identification of residues to mutagenize benefits from dedicated prediction methods. In addition, predictors can contribute to the identification of pathogenic mutations in solubility-related diseases (12,13).

A particularly challenging class of proteins are antibodies, which are widely used for pharmaceutical applications (14). Some regions in these molecules can be poorly soluble, and the reason is encoded in their function, as these regions are designed to capture proteins with high affinity. The binding affinity of a protein, and more generally its tendency to aggregate, have been inversely correlated to its solubility (15).
The two concepts are defined by similar properties of the amino acid sequence. To optimize antibody solubility without affecting binding propensity, a number of experimental approaches have been developed. For example, with phage display and heat denaturation (16), a great variety of variants can be produced and tested. Computational methods to pre-emptively screen antibody variants and enable protein design would considerably reduce the cost and time of this process.

Several computational methods have already been developed to estimate protein solubility (17-22). Most of them aim to quantify the solubility of a wild-type protein for heterologous over-expression, while only a few are specifically designed to evaluate the effects of variants on the solubility of the molecule (18,21,22). The identification and tuning of sequence determinants of protein aggregation has been used as a valuable tool to regulate protein solubility (23). Among the determinants of protein aggregation, intrinsic disorder has also been shown to play a major part (24). The highly dynamic disordered regions of a protein can increase its propensity to aggregate under different conditions. Both aggregation and intrinsic disorder propensity are influenced by the physico-chemical properties of each amino acid in the sequence, such as hydrophobicity, secondary structure propensity and charge (25).

Here, we describe SODA, a new method to predict the effects of sequence variations on protein solubility. SODA exploits the concepts described above (aggregation and disorder propensity, hydrophobic profile, predicted secondary structure components) to characterize a wild-type sequence by its intrinsic solubility profile. It was benchmarked on two datasets and compared to other published predictors. SODA is designed to allow prediction for all possible sequence variations, including insertions and deletions.
In addition, the web server has two different operating modes, allowing the user either to target specific mutations or to evaluate the effect of all possible substitutions on the input sequence. The case of an antibody, evaluating the effects of mutations on its surface, is used to discuss the novel full protein mode.

== METHODS ==

SODA predicts solubility changes introduced by a mutation by comparing the profiles of the wild-type (WT) and mutated sequences. The PASTA (26) aggregation propensity and ESpritz (27) intrinsic disorder scores are combined with a Kyte-Doolittle hydrophobicity profile (28) and secondary structure propensities for α-helix and β-strand estimated with FESS (29). SODA is able to evaluate different types of variation, including point mutations, deletions and insertions. The predictor is based on sequence features and allows the large-scale screening of protein mutations. When available, a protein structure can be used to improve the prediction by masking buried residues from the solubility prediction.

== Algorithm ==

SODA prediction is based on five individual component scores (calculated with default parameters): PASTA aggregation energy at the 90% specificity cut-off (26), ESpritz disorder propensity in X-ray prediction mode (27), the negative Kyte-Doolittle hydrophobicity profile (28) and the two secondary structure propensities for α-helix and β-strand calculated with FESS (29). Each score difference S is summed and normalized for the full sequence.
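
The comparison of wild-type and mutated profiles can be illustrated with a minimal sketch. It implements only the negative Kyte-Doolittle hydrophobicity component; the PASTA, ESpritz and FESS scores come from external tools and are not reproduced here. All function names are hypothetical, and the unweighted sum, the division by sequence length and the sign convention (mutant minus wild type) are assumptions made for illustration, not the published SODA implementation.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def neg_hydrophobicity(seq):
    """Negative Kyte-Doolittle profile, summed and divided by sequence length."""
    return -sum(KD[aa] for aa in seq) / len(seq)


def score_difference(wt, mut, component):
    """Difference of one component score between mutant and wild type."""
    return component(mut) - component(wt)


def combined_score(wt, mut, components):
    """Unweighted sum of the score differences (assumed combination rule)."""
    return sum(score_difference(wt, mut, c) for c in components)


def scan_substitutions(wt, components):
    """Score every single-residue substitution of the input sequence."""
    results = {}
    for pos, old in enumerate(wt):
        for new in KD:
            if new != old:
                mut = wt[:pos] + new + wt[pos + 1:]
                # Key: (1-based position, wild-type residue, new residue).
                results[(pos + 1, old, new)] = combined_score(wt, mut, components)
    return results


wt = "MKVLAAGIV"                     # toy sequence, not taken from the paper
components = [neg_hydrophobicity]    # other components would be added here
point = combined_score(wt, "MKVLADGIV", components)    # A6D substitution
deletion = combined_score(wt, "MKVLAGIV", components)  # deletion of A6
scan = scan_substitutions(wt, components)
print(round(point, 3), round(deletion, 3), len(scan))
```

Because each component is a function of the whole sequence, the same code handles substitutions, insertions and deletions without aligning positions; a per-residue profile comparison would need an explicit alignment step.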