
et al. BMC Bioinformatics 2011, 12:36

http://www.biomedcentral.com/1471-2105/12/36

RESEARCH ARTICLE

Open Access

ASPDock: protein-protein docking algorithm using atomic solvation parameters model

Lin Li, Dachuan Guo, Yangyu Huang, Shiyong Liu, Yi Xiao

Abstract

Background: The Atomic Solvation Parameters (ASP) model has been proven to be a very successful method of calculating the binding free energy of protein complexes. This suggests that incorporating it into docking algorithms should improve the accuracy of prediction. In this paper we propose an FFT-based algorithm to calculate ASP scores of protein complexes and develop an ASP-based protein-protein docking method (ASPDock).

Results: ASPDock is first tested on the 21 complexes whose binding free energies have been determined experimentally. The results show that the calculated ASP scores have a stronger correlation (r ≈ 0.69) with the binding free energies than the pure shape complementarity scores (r ≈ 0.48). ASPDock is further tested on a large dataset, the benchmark 3.0, which contains 124 complexes, and also shows better performance than the pure shape complementarity method in docking prediction. Comparisons with other state-of-the-art docking algorithms showed that the ASP score indeed gives a higher success rate than the pure shape complementarity score of FTDock but a lower success rate than ZDock3.0. We also developed a softly restricting method to add the information of predicted binding sites into our docking algorithm. The ASP-based docking method performed well in CAPRI rounds 18 and 19.

Conclusions: ASP may be more accurate and physical than pure shape complementarity in describing the features of protein docking.

Background

Most proteins interact with other proteins to perform their biological functions in the form of protein complexes. During the past several decades, many docking programs have been developed to predict protein-protein complexes. Among them, the docking algorithms based on the Fast Fourier Transform (FFT) are widely used and have been very successful [1] because they can search the 6D space very quickly. These programs include MolFit [2], 3D-Dock [3-5], GRAMM [6], ZDock [7,8], DOT [9], BiGGER [10] and HEX [11]. The basis of the original FFT-based docking method is shape complementarity between receptor and ligand. It is usually used as the first step of the docking procedure, and other methods are then used to refine or re-rank the docked structures [3,12,13]. Besides the FFT-based algorithms, there are other well-known docking algorithms that also consider the flexibility of proteins during the docking procedure, such as RosettaDock [14], ICM-DISCO [15], AutoDock [16], and HADDOCK [17]. Since the original FFT docking algorithm used only the shape complementarity feature to solve the bound docking problem [1], different scoring functions based on other physical features have been integrated into the original FFT-based docking method to improve its predictive ability. For example, 3D-Dock [18] added electrostatic energy to the FFT-based docking method. ZDock [7] used atomic contact energy to calculate solvation energy. The hydrophobic docking method [19] combined hydrophobic complementarity with shape complementarity [20]. GRAMM used a long-distance potential [21] to calculate the atom-atom van der Waals energy, which has proved effective in detecting binding funnels.

* Correspondence: liushiyong@gmail.com; yxiao@mail.hust.edu.cn
Biomolecular Physics and Modelling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, PR China

A reliable scoring function is crucial for raising the success rate of protein-protein docking prediction. Cheng and co-workers [22] analyzed the performance of different energy components in protein-protein interactions. They showed that the sum of the solvation and electrostatic energies contributes more than 70% of the total binding free energy, while the van der Waals energy contributes less than 10%. Fernandez-Recio's work [23] also showed that solvation energy, rather than electrostatic, van der Waals or hydrogen-bond energy, is the most important component of the total binding energy. Zhou et al. [24] found that the correlation coefficient between solvation energy and experimental binding energy is 0.83 with a root mean square deviation (RMSD) of 2.3 kcal/mol and, most importantly, that the slope is close to 1 (slope = 0.93).

Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The ASP (Atomic Solvation Parameters) model is one of the best methods to calculate solvation energy. Because it is fast and efficient, the ASP model [25-27] has been very successful in free energy calculation [28,29], structure prediction [30,31], and scoring functions [22,32]. This suggests that integrating ASP into the sampling stage of docking algorithms may enhance the success rates of docking. Up to now, several groups have constructed different ASP sets [25-27]. Among them, Zhou's set [24] is the most suitable one for calculating the solvation energy of protein complexes. This ASP set was extracted from 1023 mutation experiments and yielded an accurate prediction of the binding free energy of complexes. In this paper, the ASP set from Zhou's work is used to develop an ASP-based protein-protein docking algorithm (ASPDock).
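The idea behind an ASP model can be illustrated with a minimal sketch, assuming the usual form in which the solvation energy is a sum over atoms of an atom-type parameter times that atom's solvent accessible surface area. The atom-type labels and parameter values below are placeholders chosen for illustration; they are not the values of Zhou's set [24], and the function names are hypothetical.

```python
# Hypothetical atomic solvation parameters (energy per square angstrom)
# for six made-up atom types. These are NOT the values of Zhou's set [24].
ASP = {"C": 0.02, "N": -0.01, "O": -0.01, "S": 0.02, "N+": -0.05, "O-": -0.03}


def solvation_energy(atoms):
    """Sum ASP * ASA over atoms given as (atom_type, asa) pairs."""
    return sum(ASP[atom_type] * asa for atom_type, asa in atoms)


def binding_solvation_energy(complex_atoms, receptor_atoms, ligand_atoms):
    """Solvation energy of the complex minus that of the separated partners."""
    return (solvation_energy(complex_atoms)
            - solvation_energy(receptor_atoms)
            - solvation_energy(ligand_atoms))


# Toy example: one carbon atom on each partner buries 10 square angstroms
# of accessible surface when the complex forms; other atoms are unchanged.
receptor = [("C", 30.0), ("O", 20.0)]
ligand = [("C", 25.0), ("N", 15.0)]
bound = [("C", 20.0), ("O", 20.0), ("C", 15.0), ("N", 15.0)]
delta = binding_solvation_energy(bound, receptor, ligand)
```

In this toy case only the buried carbon surface contributes, so the result depends solely on the carbon parameter and the 20 square angstroms of buried area.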

During a prediction procedure, correct auxiliary information (e.g., predicted binding sites) can usually increase the success rate significantly [33-36], but incorrect auxiliary information may mislead predictors and lead to worse predictions. However, it is hard to tell whether the information is correct before the complex structure is solved experimentally. In this work, we present a softly restricting method of using biological information, in which we constrain the receptor and ligand partially within the predicted binding sites. Using our ASPDock algorithm with the softly restricting method, we participated in two rounds of Critical Assessment of PRediction of Interactions (CAPRI) [37]. There are 3 targets (T40, T41, and T42) in rounds 18 and 19. We got high-quality hits for T40 and T41, with best LRMSDs of 2.35 Å and 1.41 Å, respectively.

Results and Discussion

Free Energy Calculation

The ASP set used here is from Zhou and co-workers [24] and contains only six atom types. It has proved successful in predicting the binding free energy of complexes. The ASP model assumes that the solvation energy of an atom or an atom group is proportional to its solvent accessible surface area (ASA). Accurate calculation of the ASA, which depends on the conformation of the proteins or complexes, is time consuming. To match the speed of the FFT-based docking method, we propose an approximate FFT method to calculate the ASA and hence the ASP values (see the section "Methods").

We first test our method on the 21 protein complexes [24] whose binding free energies have been measured experimentally. For each complex, we perform a bound docking and select the structure closest to the native state. The LRMSD between this best structure and the native structure is usually less than 0.5 Å, so we take the ASP score of the best structure as that of the native structure. Using a similar method, we calculate the shape complementarity score for each of the 21 complexes. If all ASP values are set equal to one, what our method calculates is the shape complementarity score.

We compared the ASP and shape complementarity scores with the experimental binding free energy for each of the 21 complexes. The correlation coefficient between the ASP scores and the experimental binding energies of the 21 complexes is 0.6868, and that between the shape complementarity scores and the experimental binding energies is 0.4843 (Figure 1). In Zhou's work [24], the correlation coefficient between the ASP scores and the experimental binding energies of the 21 complexes is 0.83, since they used a more accurate method to calculate the ASA than we do. This shows that our approximate method accounts for most of the binding free energy and is better than the pure shape complementarity method. The latter is easily understood because shape complementarity is a reduced ASP model in which all atoms are treated as the same.

Benchmark Calculation

Our algorithm is further tested on the benchmark 3.0 [38] using both the ASP and shape complementarity scores. The benchmark comprises 124 protein-protein complexes: 24 antibody-antigen complexes, 35 enzyme-inhibitor complexes and 65 other complexes. In the docking sampling stage, we use a step of 10 degrees for the rotational scanning. Success in the top N predictions is defined as finding at least one acceptable hit among the top N predictions. Acceptable hits are predicted complexes with ≤ 10 Å LRMSD with respect to the native complex structure. LRMSD is the RMSD between the predicted and native ligand molecules after superposing the predicted and native receptor molecules. No predicted or experimental information is used in the docking process. The results show that the ASP method enhances the success rate significantly (Figure 2) in comparison with the shape complementarity score.
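The success criterion defined above can be sketched in a few lines. This is only an illustration: the function names and the toy data are hypothetical, and the ligand coordinates are assumed to have been transformed already by superposing the predicted receptor onto the native receptor.

```python
import math


def ligand_rmsd(pred_coords, native_coords):
    """RMSD between matched ligand atoms (x, y, z), receptors already superposed."""
    total = sum((p - q) ** 2
                for a, b in zip(pred_coords, native_coords)
                for p, q in zip(a, b))
    return math.sqrt(total / len(native_coords))


def is_success(ranked_lrmsds, top_n, cutoff=10.0):
    """True if any of the first top_n ranked predictions is an acceptable hit."""
    return any(lrmsd <= cutoff for lrmsd in ranked_lrmsds[:top_n])


def success_rate(lrmsds_by_complex, top_n, cutoff=10.0):
    """Fraction of complexes with at least one acceptable hit in the top N."""
    if not lrmsds_by_complex:
        return 0.0
    hits = sum(is_success(ranked, top_n, cutoff)
               for ranked in lrmsds_by_complex.values())
    return hits / len(lrmsds_by_complex)


# Toy data: LRMSD (in angstroms) of each prediction, listed in rank order.
example = {
    "complex_a": [25.3, 8.1, 40.0],
    "complex_b": [30.2, 22.7, 9.9],
    "complex_c": [15.0, 18.4, 33.3],
}
rates = [success_rate(example, n) for n in (1, 2, 3)]
```

Evaluating the rate for increasing N gives the kind of success-rate curve plotted against the number of predictions.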

As in other docking methods, the prediction of enzyme-inhibitor complexes has a higher success rate than that of antibody-antigen complexes and other complexes. That is mainly because enzyme-inhibitor complexes usually have better interface features than other types of complexes [39]. The success rate for antibody-antigen complexes is not as high as in some other methods [5,7,40]. However, the complementarity determining regions (CDR) of antibodies can be predicted from sequence [41]. If we utilize this CDR information, the success rate for antibody-antigen complexes should be enhanced dramatically. In general, the ASP method can enhance the success rate significantly.

Figure 1 Correlation between experimental free binding energy and docking score. (A) Correlation between experimental free binding energy and shape complementarity score. (B) Correlation between experimental free binding energy and ASP score. The shape complementarity score and the ASP score are both calculated with the FFT method. Grid step is 1 Å here.

We also compared our results with the popular docking algorithms FTDock [5] and ZDock [7,8] using the Benchmark 3.0 (Figure 2B and also Additional file 1). The former comparison gauges the performance of ASPDock relative to a pure shape complementarity method, and the latter gauges the performance of a single ASP score relative to the best docking method, which integrates many important factors of protein interactions. The results show that the ASP score indeed gives a higher success rate than the pure shape complementarity score of FTDock but a lower success rate than ZDock. The former shows that "ASP complementarity" is more reasonable for describing the interface character of protein-protein interaction than pure shape complementarity. The latter is expected because ASPDock only seeks a more physical model than pure shape complementarity for protein docking and needs to integrate more important factors of protein interactions to reach a higher success rate of prediction.

Figure 2 Success rate on benchmark 3.0. (A) Success rates of the ASPDock method on antibody-antigen complexes, enzyme-inhibitor complexes, other complexes and total complexes, respectively. (B) Success rates of the ASPDock, GeoDock (shape complementarity docking using the ASPDock algorithm), FTDock and ZDock3.0 methods on the benchmark 3.0.

CAPRI Rounds 18-19

Using our ASPDock and softly restricting method, we participated in CAPRI rounds 18 and 19, which contain three targets: T40 in round 18, and T41 and T42 in round 19. We got one high-quality prediction for each of T40 and T41 (Figure 3), but no correct prediction for T42. During the docking procedure, we searched the structural space in 5 steps as follows: (1) Searching the literature for binding site information of the receptor and ligand; (2) Scanning the six-dimensional space using the ASPDock method with the amplified ASP values; (3) Picking out the top 2000 structures, clustering them and choosing the structure ranking first in each of the top 20 clusters. In this step, the structures are ranked directly according to their ASP values; (4) Refining the 20 structures using RosettaDock [14] and obtaining a set of new structures; (5) Re-ranking the structures using scoring functions, clustering them, and then choosing the structure ranking first in each of the top 10 clusters. The scoring functions we used are RosettaDock [14] and DECK (Shiyong Liu and Ilya Vakser, submitted). The RosettaDock and DECK scores are weighted 1:1.
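The re-ranking in step (5) can be sketched as follows. Several details are assumptions made only for illustration and are not stated in the text: the two scores are taken to be on comparable scales with lower values being better, cluster labels are taken as given, and clusters are ranked by the combined score of their best member. The function and field names are hypothetical.

```python
def rerank_and_pick(decoys, n_clusters=10, w_a=1.0, w_b=1.0):
    """Pick the best-scoring structure from each of the top clusters.

    decoys: list of dicts with keys 'id', 'cluster', 'score_a', 'score_b'.
    The combined score is w_a * score_a + w_b * score_b (lower is better,
    which is an assumption of this sketch).
    """
    best = {}  # cluster label -> (combined score, decoy id)
    for decoy in decoys:
        combined = w_a * decoy["score_a"] + w_b * decoy["score_b"]
        current = best.get(decoy["cluster"])
        if current is None or combined < current[0]:
            best[decoy["cluster"]] = (combined, decoy["id"])
    # Rank clusters by their best member and keep the first n_clusters.
    ranked = sorted(best.values())
    return [decoy_id for _, decoy_id in ranked[:n_clusters]]


# Toy input: four refined structures in three clusters.
decoys = [
    {"id": "m1", "cluster": 0, "score_a": -10.0, "score_b": -2.0},
    {"id": "m2", "cluster": 0, "score_a": -8.0, "score_b": -6.0},
    {"id": "m3", "cluster": 1, "score_a": -9.0, "score_b": -1.0},
    {"id": "m4", "cluster": 2, "score_a": -20.0, "score_b": 3.0},
]
picked = rerank_and_pick(decoys, n_clusters=2)
```

With equal weights, a structure that is only moderately good under both scores can outrank one that is excellent under one score and poor under the other.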

The target T40 (Figure 4) is a complex between bovine trypsin (1BTY) and the double-headed arrowhead protease inhibitor API-A (bound). Some important information shared by Dr. Weng from Boston University shows that the two active sites of the inhibitor are Leu87 and Lys145 (Figure 4A). We incorporated this information into the ASP docking of T40 by using a softly restricting docking method with the amplification factor set to 3. For comparison, we also did a totally free docking, without any binding site information, by the shape complementarity method (Figure 4B) and by the ASPDock method (Figure 4C). Although free docking can find some structures binding at residues Leu87 and Lys145, softly restricting ASP docking greatly enhances the sampling around them (Figure 4D). There are one high-quality hit and six medium hits among our ten submitted structures. The best LRMSD between our hits and the experimentally determined structure is 2.35 Å.
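One way to picture the softly restricting idea is sketched below, under the assumption that amplification means multiplying the ASP values of atoms in the predicted binding-site residues by the amplification factor while leaving all other atoms unchanged. The exact scheme is described in the Methods section and may differ; the function name, residue labels and ASP values here are hypothetical.

```python
def amplify_asp(atoms, binding_residues, factor=3.0):
    """Scale the ASP value of atoms in predicted binding-site residues.

    atoms: list of (residue_id, asp_value) pairs.
    Atoms outside the predicted sites keep their value, so poses that
    bind elsewhere are down-weighted rather than excluded.
    """
    return [asp * factor if residue in binding_residues else asp
            for residue, asp in atoms]


# Toy input with placeholder ASP values (not taken from any published set).
atoms = [("Leu87", 0.02), ("Gly10", 0.02), ("Lys145", 0.01)]
soft = amplify_asp(atoms, {"Leu87", "Lys145"}, factor=3.0)
free = amplify_asp(atoms, set(), factor=3.0)
```

With an empty set of predicted residues the values are unchanged, which corresponds to free docking. Because no pose is excluded outright, incorrect site information biases the search instead of ruling out the correct structure.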

T41 is the DNase domain of colicin E9 (G95C

mutant) in complex with the Im2 immunity protein

(C23A/E31C mutant). The unbound coordinates pro-

vided are: E9 DNase domain (1FSJ) and Im2 from the

NMR ensemble (2NO8). We got one high quality hit

and eight acceptable hits in our ten submitted struc-

tures. The best LRMSD is 1.41 Å (Figure 3).

T42 is a symmetric homodimer designed on the basis of Lynn Regan's idealized TPR (1NA3). Residues 1-4 and 108-125 are disordered. We didn't get any acceptable hits for this target (in fact there were only a few hits in all predictions from the groups that participated in this CAPRI round).

Figure 3 Native and predicted structures of T40 and T41 in CAPRI. Our submitted structures are represented by a mass center model. Blue balls represent incorrect structures, yellow balls represent acceptable hits, magenta balls represent medium hits and red balls represent high quality hits.

Figure 4 Results of CAPRI T40 predicted by different methods. Top 3000 structures obtained by ASPDock with and without the information of predicted binding sites. The green protein is the double-headed arrowhead protease inhibitor (API-A). Orange residues are the key residues, Leu87 and Lys145. Small balls are mass centres of ligands. There are 3000 ligand mass centres in each panel, representing the top 3000 ligand structures. The ASP scores are ranked from red to blue. (A) Native structure of T40. (B) Top 3000 ligands generated by the shape complementarity method. (C) Top 3000 ligands generated by the ASPDock method. (D) Top 3000 ligands generated by the softly restricting ASPDock method.

Conclusions

We proposed an easy way to incorporate the ASP model into the FFT protein-protein docking method, which calculates the solvation energy approximately but quickly. The ASPDock method reduces to the FFT docking method of pure shape complementarity when the ASP values of all atoms are set to 1. The scores of ASPDock reflect solvation energy, which has proved to be the most significant of all the energy terms in the binding free energy. In contrast, pure shape complementarity has no clear physical meaning. Our results indicate that the ASPDock method can enhance the prediction accuracy significantly in comparison with the pure shape complementarity method.

A softly restricting method was also proposed to

incorporate the predicted binding sites into the

ASPDock method. This method is more reasonable than

the strictly restricting method, which will definitely miss

the correct complex structure when the information is

incorrect.
