Bioinformatics Analysis of Serologic Proteins of Prostate Cancer Patients Separated by SDS-PAGE

One of the main goals of bioinformatics is to understand and analyze the 3D structure of proteins and its relationship to the amino acid sequence. Protein structure can be predicted from the amino acid sequence, which matters because proteins are central to natural science research and are linked, directly or indirectly, to evolution, drug development, mutation and the occurrence of different diseases. Biologists use bioinformatics tools to study diseases through the structure and function of proteins, because experimental tools alone cannot completely explain proteins, their structure and their role in disease. Prostate cancer is one of the leading causes of cancer death in men worldwide; it is least common in Asia and more common in Western countries. This study was conducted to perform a bioinformatics analysis of prostate cancer proteins.
ARTICLE INFORMATION
Received: 12.08.2018
Revised: 01.11.2018
Accepted: 26.12.2018
DOI: 10.31580/pjmls.v1i1.940


INTRODUCTION
Bioinformatics is a multidisciplinary field of science that combines computer science, mathematics, information technology, statistics and biology to study and understand biological data, and it deals with the computational management of molecular biological information. Because bioinformatics provides a large amount of data related to molecular biology, many current bioinformatics projects deal with the functional and structural features of proteins and genes (1). The two main drawbacks of experimental tools are cost and time, so biologists use bioinformatics tools as an alternative for studying protein function and structure (2). Prostate cancer (PCa) is one of the most common cancers in men and the second most common cause of male mortality and morbidity. Age, heredity and race are known risk factors for prostate cancer. Furthermore, some exogenous factors, such as a high-fat diet and a reduced intake of vitamin E, selenium, isoflavones or lignans, may stimulate progression from the dormant type of prostate cancer to the clinical type (3,4). Developments in proteomics and bioinformatics technology have advanced the study of prostate cancer, and these technologies are considered powerful tools for studying its metastasis and pathogenesis. In particular, some tumor-associated proteins have been recognized in the tissues and body fluids of patients with several cancers (5).
Software designed to identify and analyze proteins plays an important role in investigating proteins from mass spectrometry and two-dimensional gels. To identify a protein, the information obtained about it is matched against a protein database to determine whether the protein is new or has been identified earlier. To analyze a protein, the information present in protein databases can be used to calculate its properties, which can be useful for its experimental study. ExPASy (Expert Protein Analysis System) provides software for protein analysis and identification that predicts a protein's isoelectric point, molecular weight, physicochemical properties, transmembrane domains, structure and function (6).
The ExPASy server is the proteomics server of the Swiss Institute of Bioinformatics (SIB). It evaluates protein sequences, functions and structures in association with the European Bioinformatics Institute. The server consists of a group of tools that identify and analyze different types of proteins and characterize them using primary structure, mass spectrometric data, pattern and profile searches, secondary structure prediction, tertiary structure prediction and analysis, visualization, molecular modeling and quaternary structure analysis (7).
One of the protein analysis tools on the ExPASy server is ProtParam, which computes several physicochemical properties of a protein from its sequence. The parameters computed by ProtParam include amino acid composition, theoretical pI, molecular weight, extinction coefficient, instability index, grand average of hydropathicity (GRAVY), estimated half-life, aliphatic index and atomic composition (6).
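To make three of these parameters concrete, the following is a minimal Python sketch of amino acid composition, GRAVY and the aliphatic index. It is not the ProtParam implementation. It assumes the Kyte and Doolittle (1982) hydropathy values for GRAVY and the commonly cited aliphatic index formula X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), with X in mole percent. The function names and the example peptide are hypothetical.

```python
from collections import Counter

# Kyte & Doolittle (1982) hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the mole percent of each residue present in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    x = composition(seq)
    return (x.get("A", 0.0)
            + 2.9 * x.get("V", 0.0)
            + 3.9 * (x.get("I", 0.0) + x.get("L", 0.0)))


if __name__ == "__main__":
    peptide = "MKTLLVAAGILV"  # hypothetical peptide, for illustration only
    print("GRAVY:", round(gravy(peptide), 3))
    print("Aliphatic index:", round(aliphatic_index(peptide), 2))
```

A positive GRAVY value indicates an overall hydrophobic sequence and a negative value an overall hydrophilic one.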
Another tool is ProtScale, which computes and displays the profile produced by an amino acid scale on a selected protein. An amino acid scale assigns a numerical value to each amino acid. The most frequently used scales are hydrophobicity/hydrophilicity scales and secondary structure conformational parameter scales, although other scales based on the physical and chemical properties of amino acids also exist (6). PSIPRED is a method for predicting the secondary structure of a protein from its amino acid sequence (8).

Sample Collection
All sampling was performed at Civil Hospital and BMC Hospital, Quetta. A total of 50 blood samples were taken from PCa patients and 10 were taken from healthy subjects as controls. Samples were collected in sterile vacutainer tubes, and sera were obtained by centrifugation at 12000 rpm for 5 minutes. Samples were further processed in the CASVAB (University of Balochistan) Biotechnology Lab.

Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis (SDS-PAGE)
SDS-PAGE was performed to separate the proteins on a gel matrix. Electrophoresis is a method of separating a complex mixture of proteins. Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis (SDS-PAGE) is a technique in which charged molecules are moved through a gel matrix by an electric current. The method is used to determine the subunit composition of a protein, to check the homogeneity of a protein sample and, in other applications, to purify proteins. The migration rate of a protein during SDS-PAGE is determined by its charge, shape and size and by the pore size of the gel matrix (9,13,14).
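Because migration in SDS-PAGE depends largely on size, the molecular weight of a band is commonly estimated from a calibration line of log10(molecular weight) against relative mobility (Rf) of marker proteins. The sketch below illustrates that general approach in Python. It is an assumption about how band sizes can be estimated, not a description of the procedure used in this study, and the marker values in the example are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(markers, rf):
    """Estimate molecular weight (kDa) of a band from its relative mobility.

    markers: list of (relative mobility, molecular weight in kDa) pairs
             for the ladder run on the same gel.
    rf:      relative mobility of the unknown band (0 to 1).
    """
    xs = [m[0] for m in markers]
    ys = [math.log10(m[1]) for m in markers]
    slope, intercept = fit_line(xs, ys)
    return 10 ** (slope * rf + intercept)


if __name__ == "__main__":
    # Hypothetical ladder, for illustration only (not data from this study).
    ladder = [(0.15, 100.0), (0.35, 55.0), (0.55, 35.0), (0.80, 15.0)]
    print(round(estimate_mw(ladder, 0.45), 1), "kDa")
```

The estimate is only reliable within the range covered by the markers, since the log-linear relationship does not hold across the whole gel.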

Bioinformatics tools used
Bioinformatics and proteomics tools were applied to examine different properties of the proteins. The ProtParam tool was used to predict the physicochemical properties, the PSIPRED tool was used to determine the secondary structure, and the ProtScale tool was used to find the transmembrane domains of the proteins.

RESULTS AND DISCUSSION
Protein profiling of the serum samples from prostate cancer patients was performed using SDS-PAGE. In the 50 serum samples, five proteins (which may correspond to PSA, TRAP1, ANO7, CDC37 and clusterin) were differentially expressed compared with a pool of ten control sera. These proteins were then subjected to bioinformatics analysis using different tools to examine their transmembrane domains, secondary structure and physicochemical properties.

Prediction of transmembrane domains in detected proteins using the ProtScale bioinformatics tool
Knowing whether the detected proteins have transmembrane domains tells us a lot about their possible biological functions, their physicochemical properties and their behavior during purification. According to the observations of Wang et al., our detected proteins with molecular weights of 32 kDa, 75 kDa, 99.8 kDa, 44 kDa and 75 kDa might correspond to triosephosphate isomerase 1, peroxiredoxin 4, prohibitin, 60S acidic ribosomal protein P0 and cytoplasmic actin. We performed ProtScale analysis to predict their transmembrane domains. The accession numbers of the proteins (confirmed against the UniProt database) and the 'range' used to view the hydrophobicity profile are given in table no lI.
The computation was performed on the complete amino acid sequence of each detected protein. Putative transmembrane (TM) domains were obtained using the Hphob. / Kyte & Doolittle (1982) scale. In the resulting plots, the hydrophobicity score is shown on the y-axis and the amino acid position on the x-axis. Window positions 1 to 19 were weighted using the linear weight variation model (a window size of 19 is optimal for transmembrane domains).
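The sliding-window calculation described above can be sketched in Python as follows. This is a minimal illustration, not the ProtScale implementation. It assumes the Kyte and Doolittle (1982) values, a window of 19 residues, a linear change in weight from the window center to the edges (controlled by a hypothetical `edge` parameter, where 1.0 gives a plain average) and normalization by the sum of the weights. The default cutoff of 1.5 is the value cited in this section for putative transmembrane regions.

```python
# Kyte & Doolittle (1982) hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_weights(size=19, edge=1.0):
    """Linear weights: 1.0 at the window center, `edge` at both ends."""
    if size < 1 or size % 2 == 0:
        raise ValueError("window size must be a positive odd number")
    half = size // 2
    if half == 0:
        return [1.0]
    return [edge + (1.0 - edge) * (1.0 - abs(i - half) / half)
            for i in range(size)]


def hydropathy_profile(seq, size=19, edge=1.0):
    """Return (center position, score) pairs; positions are 1-based."""
    seq = seq.upper()
    weights = window_weights(size, edge)
    total = sum(weights)
    half = size // 2
    profile = []
    for start in range(len(seq) - size + 1):
        window = seq[start:start + size]
        score = sum(w * KD[aa] for w, aa in zip(weights, window)) / total
        profile.append((start + half + 1, score))
    return profile


def segments_above(profile, cutoff=1.5):
    """Group consecutive positions scoring above the cutoff into runs."""
    runs = []
    start = prev = None
    for pos, score in profile:
        if score > cutoff:
            if start is None:
                start = pos
            prev = pos
        elif start is not None:
            runs.append((start, prev))
            start = None
    if start is not None:
        runs.append((start, prev))
    return runs
```

Calling `segments_above(hydropathy_profile(sequence))` on a protein sequence returns the ranges of window centers whose averaged hydropathy exceeds the cutoff, which are candidate transmembrane regions rather than confirmed ones.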
The results revealed that ANO7 had the most predicted transmembrane domains, TRAP1 a moderate number, PSA and CDC37 few, and clusterin none. A region is considered a putative transmembrane domain when its calculated value is above 1.5 (10). According to the physicochemical properties, the theoretical molecular weights of prostate-specific antigen, tumor necrosis factor type 1 receptor-associated protein, anoctamin-7, CDC37 and clusterin are 26.09, 73.5, 105.5, 16.4 and 24.1 kDa respectively, whereas the observed experimental molecular weights are 32, 75, 99.8, 44 and 75 kDa. The theoretical pI values of prostate-specific antigen, tumor necrosis factor type 1 receptor-associated protein, anoctamin-7, CDC37 and clusterin were 7.26, 6.13, 8.11, 5.24 and 6.27 respectively. The physicochemical properties are shown in Table II.
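The gap between the theoretical and observed molecular weights reported above can be tabulated with a short Python sketch. The values are those given in the text. Pairing the observed weights with the proteins in the same order as the theoretical weights is an assumption, and the function name is hypothetical.

```python
# (protein, theoretical MW in kDa, observed MW in kDa), values from the text.
# Assumption: observed weights are listed in the same order as the proteins.
PROTEINS = [
    ("PSA", 26.09, 32.0),
    ("TRAP1", 73.5, 75.0),
    ("ANO7", 105.5, 99.8),
    ("CDC37", 16.4, 44.0),
    ("Clusterin", 24.1, 75.0),
]


def mass_differences(rows):
    """Return (protein, observed minus theoretical in kDa, percent of theoretical)."""
    results = []
    for name, theoretical, observed in rows:
        diff = observed - theoretical
        percent = 100.0 * diff / theoretical
        results.append((name, round(diff, 2), round(percent, 1)))
    return results


if __name__ == "__main__":
    for name, diff, percent in mass_differences(PROTEINS):
        print(f"{name:10s} {diff:+7.2f} kDa ({percent:+.1f}%)")
```

Under that pairing, TRAP1 and ANO7 lie within a few kDa of their theoretical values, while CDC37 and clusterin show much larger differences.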
Transmembrane domains consist of nonpolar amino acid residues that may cross the lipid bilayer several times, often as α-helices. Membrane-associated proteins and transmembrane domains occupy regions of the cell membrane where the two-dimensional protein concentration is very high, which has a large impact on their function and clustering (11). The PSIPRED prediction server provides three methods: PSIPRED predicts protein secondary structure, GenTHREADER recognizes the structure and fold of a protein, and MEMSAT 2 predicts the topology of transmembrane proteins (12). In the secondary structure analysis, the properties observed were helix, coil, disorder, membrane interaction and membrane helix, together with polar, non-polar and hydrophobic regions.

CONCLUSION
The aim of the present study was to predict the physicochemical properties, transmembrane domains and secondary structure of proteins from PCa patients using different bioinformatics tools. Biotechnologists currently use bioinformatics tools to predict the 3D structure of proteins and to find its relationship to the amino acid sequence. In future, more advanced bioinformatics tools should be introduced to improve such analysis.