Bangladesh J Pharmacol. 2014; 9: 428-36

Available Online: 9 September 2014; DOI: 10.3329/bjp.v9i3.19716


Evolutionary analysis of galectins and identification of potential galectin-1 inhibitors: A computational approach

Vipperla Bhavaniprasad, Krishnan Anbarasu and Sivaraman Jayanthi

School of Biosciences and Technology, VIT University, Vellore, Tamilnadu 632 014, India.

Abstract

Galectins are a family of structurally related carbohydrate-binding proteins and some galectins play a major role in initiation, progression and dissemination of different types of tumors. Multiple sequence alignment was performed for 15 types of galectins and phylogenetic tree was constructed for studying evolutionary relationship. Among galectins, galectin-1 contributes to various events associated with cancer biology including tumor transformation, cell cycle regulation and apoptosis. Hence a rational computational approach was followed for the design of new class of glycol-mimetic inhibitors with high affinity and stability. Ten N-39-triazole analogs have been used for molecular docking with galectin-1. Based on docking studies, hexaconazole is identified as a potential inhibitor of galectin-1 for the inhibition of the tumor activity. The binding mechanism of hexaconazole to galectin-1 in the dynamics system was studied by 10 ns Molecular Dynamics simulation. Thus, our study favors more insight on hexaconazole as a promising inhibitor for galectin-1.

Keywords: Galectin-1; Molecular docking


Introduction

Galectin, a type of lectin, adheres the β-galactoside. The galectins comprise of a β-galactoside adhering lectins bearing a homologous carbohydrate recognition domains (CRDs) (Ajit et al., 2009). The CRDs of galectins consists relatively 130 amino acids, despite only a few residues within the CRDs precisely influencing the glycan ligands. So far 15 galectins have now been identified in mammals and among them 12 galectins are found in humans. The galectins are divided into three major groups. The prototypical galectins comprise of galectins-1, 2, 7, 10, 13, 14, which have only one CRD (Barondes et al., 1994). The tandem repeat galectins comprising the galectin- 4, 8, 9, 12, include two CRDs and a single polypeptide linked by tiny peptide domain.

Galectin-1, a homodimer composed of subunits and each subunit folds as one compact globular domain (Leffler, 2001). Galectin-1 can function in both carbohydrate-dependent and independent manners either positive or negative depending on responder cell types or sub cellular localization (Rabinovich et al., 2002). The endogenous protein may function as a growth-promoting factor; exogenously added galectin-1 specifically suppresses tumor cell proliferation (Mercier et al., 2008). Galectin-1 induces cell growth inhibition, inhibits T-cell activation and promotes apoptosis of activated T cells (Levroney et al., 2005). Naturally occurring carbohydrate ligands for galectins have low affinities, too polar and possess low physiological stabilities (Giguere et al., 2006). Some N-39-triazole analogs provided high affinity enhancement and can be considered as possible inhibitors for galectin-1 (Hope et al., 2008).


Materials and Methods

Evolutionary analysis of galectins: The protein sequences of all the 15 types of galectins available were retrieved from NCBI protein database. These sequences were aligned using multiple sequence alignment method using the ClustalW online tool. From the alignment scores obtained from ClustalW, a phylogenetic tee has been constructed using a software tool, TreeView. The evolutionary relationship that exists between the different types of galectins, their existence and their diverse functions has been analyzed from the constructed phylogenetic tree.

Galectin-1 protein preparation: The crystal structure of galectin-1 was retrieved from Protein Data Bank (PDB) with ID: 1GZW (Lopez-Lucendo et al., 2004). The binding sites available in PDB for galectin-1 were used for the molecular docking analysis. Triazole refers to either one of a pair of isomeric chemical compounds having a molecular formula C2H3N3, having a five membered ring of two carbon atoms and three nitrogen atoms. The class of triazole consists of ten analogs and their structures were obtained from PubChem database. The ligand preparation was performed using PRODRG web server (Schuttelkopf et al., 2004).

Molecular docking: Autodock 4.2 is used for molecular docking. Autodock used the binding free energy evaluation to find the best binding mode. The docking energy was obtained from the van der Walls energy and hydrogen bonding energy together, while the binding energy is built up from van der Walls energy and desolvation energy (van Grotthuss et al., 2003). The binding strength and the location of ligand in most of the cases can be decided by the electrostatic interaction between ligands and receptor. The hydrophobic interaction obtained from the docking can affect the agonistic activity to a larger extent (Morris et al., 1998).

Molecular dynamics simulations: MD simulations were carried out using GROMACS 4.5 (Berendsen et al., 1995). The minimized structure of galectin-1 as well as the docked complex was employed in the MD simulation process. GROMOS 43a1 forcefield was used for complex MD simulations (van Gunsteren et al., 1996). The force field parameters of ligand were obtained from PRODRG web server. The aim of MD simulation was to get more precise receptor-inhibitor models in a state close to the natural conditions and to explore the binding modes of the ligands further. Although molecular docking offers reasonable binding structures for investigated ligands, the MD simulation can account for even the smallest variances. Eight Cl- counter ions were added by replacing water molecules to ensure the overall charge neutrality of the simulated system. Energy minimization process, position restraint procedure was performed in association with constant Number of particles, Volume and Temperature ensemble (NVT) and constant Number of particles, Pressure and temperature ensemble (NPT). An NVT ensemble was employed at constant temperature of 300 K with time duration of 100 ps. After stabilization of temperature an NPT ensemble was performed. In this phase a constant pressure of 1 bar was employed with a coupling constant of 5 ps with time duration of 100 ps. The coupling scheme of Berendsen was employed in both of NVT and NPT ensembles. And a final produc-tion mdrun was carried for 10 ns for the trajectory analysis.

Principal component analysis (PCA): In order to identify the dominant motions of galectin-1, we have used principal component analysis method which takes the trajectory of a MD simulation and extracts the dominant modes involved in the motion of the molecule (Amadei et al., 1993). A covariance matrix is built using a simple linear transformation in Cartesian coordinate space and the diagonalization of the covariance matrix is carried out which generates a set of eigenvectors which gives a vectorial description of each component of the motion by indicating the direction of motion. Each eigenvector has a corresponding eigenvalue which explains the energetic contribution of each component to the motion (Mesentean et al., 2006). The protein regions which are responsible for the most relevant collective motions can be identified using this analysis. PCA is performed on galectin-1 using the inbuilt gromacs utilities g_covar and g_anaeig.

Analysis: All the visualizations were carried out using Pymol (Delano, 2009), VMD tools (Humphrey et al., 1996) and graphs were plotted using XMgrace Program (Turner, 2005). The trajectories were analyzed using the inbuilt tool in the GROMACS distribution.


Results and Discussion

The 15 types of galectins were categorized based on the source of the organism, distribution of the protein in various organs, molecular mass and specific functions (Table I). The multiple sequence alignment has been performed using ClustalW for the 42 sequences of different types of galectins from various species. A phylogenetic tree has been constructed using the TreeView based on the alignment obtained from ClustalW.

From the phylogenetic tree (Figure 1), galectin clades had been analyzed. The Phylogenetic tree consists of three clades. The representing galectin-3 sequence from Homo sapiens is used as the reference sequence which falls under clade–II. The clade-II consists of Galectin-1 class of proteins from various species which is observed to be diverged slightly based on the mutation. The clade-II is further classified into two sub classes of galectin family which is further classified into various branches comprising of galectins- 1, 3, 7 and 12. The reference sequence found to be less diverged from clade-I which consists of galectins- 2, 10, 13 and 15 and it is more distantly diverged from clade-III which consists of galectins- 4, 5, 6, 8, 9, 11 and 14. The overall tree showed the current trend of molecular evolution of galectin family with minimal change among galectins under clade II and more changes in clade-I and clade-III of galectin Phylogenetic tree. All the galectins from Clade-II exhibits a functional similarity also, all of these galectins- 1, 3, 7, 12 play a vital role in the apoptosis activity (Ajit et al., 2009). The important observation was galectins- 1, 3, 7, 12 which were closely related share a functional similarity, with all of them being vital in the apoptosis activity.

The 3D structure of Homo sapiens galectin-1 was downloaded from PDB database with the code 1GZW. The active sites already available for drug target galectin-1 in PDB were used for docking studies. The binding sites were critical in the binding of ligands with the drug target galectin-1. A molecular docking study was performed for a dataset of 10 triazole analogues with the drug target galectin-1. Based on the lowest binding energy obtained for each ligand, the top inhibitors were ranked. The binding free energy of ten screened inhibitors scored by Autodock ranges between -3 to -8 Kcal/mol. The results obtained for all ten compounds are similar having a consistent hydrophobic cavity in all cases.

Hexaconazole accounted for seven H-bonds all with side-chain atoms of galectin-1, with Asn46 contributing for three and Arg48, Ser29, His44, His52 forming one each (Figure 2 and Figure 3). Fluconazole formed four H-bonds with all of them formed with the side chain atoms of galectin-1, with Asp54 accounting for two and Arg48, His52 forming one each respectively. Itraconazole formed five H-bonds with Asn46 contributing to two and Arg48, Arg73 and Glu71 contributing for one each and all of them formed with the side chain atoms of the protein. Epoxiconazole contributed for four H-bonds one each with the side chain atoms of Arg48, Asn46, His44 and Glu71. Posaconazole formed three H-bonds with Arg48, Arg73 and Glu71 contributing for one each and all of them formed with the side chain atoms of the protein.

In case of voriconazole, two H-bonds were formed one each with the sidechain of Arg48 and Asp125. Propiconazole formed three H-bonds one each with Glu71, Arg48 and Asn46 having all of them formed with the side chain atoms of the protein. Terconazole is involved in the formation of two H-bonds with Arg48 and His44 contributing for one each respectively with their side chain atoms. Isavuconazole formed a lone H-bond with the side chain atom of His44. Albaconazole also formed a single H-bond with the side chain atom of Arg48. The Docking analysis of the ten complexes shows that majority of the ligands binds in a similar fashion with small variations. Nine out of the ten compounds showed interactions with the electrically charged sidechain atoms of Arg48 which is a key residue in the binding pocket. In cases of hexaconazole, itraconazole, epoxiconazole, propiconazole complexes it was observed that the residue Asn46 is interacting with the azole ring of the ligands (Table II).

The inhibitor complex of galectin-1 and hexaconazole obtained from docking analysis was used for molecular dynamics simulations for further analysis of binding mode and the effects of the binding of ligand on the conformation of protein. The trajectories of the complex were obtained for a period of 10 ns. The RMSD of the protein showed an initial rise for a period of 3 ns, commonly attributed for the breakdown from the crystallographic structure and after that it remained stable for the rest of the simulation period (Figure 4) (Levitt et al., 1988). The total energy of the complex has been stable throughout the period of 10 ns (Figure 4B).

The dynamical mechanical properties of the simulated system can be analyzed using PCA. A covariance matrix for the complete trajectory of simulation during a period of 10 ns has been built. The analysis of the eigen vectors from the PCA showed that first seven eigen vectors were responsible for more than 85% of the total motion in the model. The dynamics of a protein can be best achieved by its behavior in phase space. The projection of the trajectories at 300 K onto the first two principal components (PC1 and PC2), illustrates the motion of the protein in phase space (Kundu et al., 2009). Analysis of these projections shows the clusters of stable states for the protein. The tenth largest eigen value is found to be less than one-tenth of the first largest eigen value. The first principal component dominates the motion of the protein (Figure 5). A Gibbs free energy landscape is constructed over the first and second largest principal components which gives a clear description of the folding dynamics of the protein (Figure 6) (Papaleo et al., 2009; Maisuradze et al., 2012).


References

Amadei A, Linssen ABM, Berendsen HJC. Essential dynamics of proteins. Proteins. 1993; 17: 412-25.

Barondes SH, Cooper DNW, Gitt MA, Leffle H. Galectins: Structure and function of a large family of animal lectins. J Biol Chem. 1994; 269: 20807-10.

Berendsen HJC, Van Der Spoel D, Van Drunen R. GROMACS: a message-passing parallel molecular dynamics implementation. Comp Phys Comm. 1995; 91: 43-56.

DeLano WL. Pymol. Schrodinger LLC, 2009, Portland, OR.

Giguere D, Patnam R, Bellefleur MA, St-Pierre C, Sato S, Roy R. Carbohydrate triazoles and isoxazoles as inhibitors of galectins-1 and -3. Chem Commun. 2006; 22: 2379-81.

Hope WW, Billaud EM, Lestner J, Denning DW. Therapeutic drug monitoring for triazoles. Curr Opin Infect Dis. 2008; 21: 580-86.

Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graphics. 1996; 14: 33-38.

Kundu S, Roy D. Comparative structural studies of psychrophilic and mesophilic protein homologues by molecular dynamics simulation. J Mol Graph Model. 2009; 27: 871-80.

Leffler H. Galectins structure and function: A synopsis. Results Probl Cell Differ. 2001, 33: 57-83.

Levitt M, Sharon R. Accurate simulation of protein dynamics in solution. Proc Natl Acad Sci. 1998; 85: 7557-61.

Levroney EL, Aguilar HC, Fulcher JA, Kohatsu L, Pace KE, Pang M, Gurney KB, Baum LG, Lee B. Novel innate immune functions for galectin-1: Envelope glycoproteins and augments dendritic galectin-1 inhibits cell fusion by nipah virus cell secretion of proinflammatory cytokines. J Immunol. 2005; 175: 413-20.

Lopez-Lucendo MF, Solis D, Andre S, Hirabayashi J, Kasai K, Kaltner H, Gabius HJ, Romero A. Growth-regulatory human galectin-1: Crystallographic characterization of the structural changes induced by single-site mutations and their impact on the thermodynamics of ligand binding. J Mol Biol. 2004; 343: 957-70.

Maisuradze GG, Zhou R, Liwo A, Xiao Y, Scheraga HA. Effects of mutation, truncation, and temperature on the folding kinetics of a ww domain. J. Mol. Biol. 2012; 420: 350-65.

Mercier S, St-Pierre C, Pelletier I, Ouellet M, Tremblay MJ, Sato S. Galectin-1 promotes HIV-1 infectivity in macrophages through stabilization of viral adsorption. Virology 2008; 371: 121-29.

Mesentean S, Fischer S, Smith JC. Analyzing large-scale structural change in proteins: Comparison of principal component projection and Sammon mapping. Proteins 2006; 64: 210-18.

Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, Olson AJ. Automated docking using a lamarckian genetic algorithm and empirical binding free energy function. J Comput Chem. 1998; 19: 1639-62.

Papaleo E, Mereghetti P, Fantucci P, Grandori R, De Gioia L. Free-energy landscape, principal component analysis, and structural clustering to identify representative conformations from molecular dynamics simulations: The myoglobin case. J Mol Graph Model. 2009; 27: 889-99.

Rabinovich GA, Rubinstein N, Toscano MA. Role of galectins in inflammatory and immunomodulatory processes. Biochim Biophys Acta. 2002; 1572: 274-84.

Schuttelkopf AW, van Aalten DMF. PRODRG: A tool for high-throughput crystallography of protein-ligand complexes. Acta Crystallogr. 2004; D60: 1355-63.

Turner PJ. XMGRACE, Version 5.1.19. Center for Coastal and Land-Margin Research, Oregon Graduate Institute of Science and Technology, Beaverton, OR. 2005.

Van Gunsteren WF, Billeter SR, Eising AA, Hunenberger PH, Kruger P, Mark AE, Scott WRP, Tironi IG. Biomolecular simulation: The GROMOS96 manual and user guide. Vdf Hochschulverlag AG an der ETH Zurich, Zurich, Switzerland, 1996; 1-1042.

Varki A, Cummings RD, Esko JD, Freeze HH, Stanley P, Bertozzi CR, Hart GW, Etzler ME. Essentials of glycobiology. 2nd ed. Cold Spring Harbor NY, U.S.A, Cold Spring Harbor Laboratory Press, 2009.

Von Grotthuss M, Pas J, Rychlewski L. Ligand: Info, searching for similar small compounds using index profiles. Bioinformatics 2003; 19: 1041-42.