Bioinformatics & Stuff

April 24, 2014

Ab initio in Rosetta Suite

This is a summarized protocol for running the Ab initio program from the Rosetta Suite.

You need three files to run the Ab initio protocol:

- Your sequence to be modeled

- A 3-residue fragment library (the "03" fragment file)

- A 9-residue fragment library (the "09" fragment file)

"Fragment Libraries are the pieces of experimentally determined structures that Rosetta uses to guide the search of conformational space when predicting structures using the Ab initio protocol."

I recommend using the Robetta Server to build your fragment libraries. You can then download the fragment files, along with the secondary structure prediction in an ss2 file, which will help in building your model.

The next step is to run the Ab initio program. Call it by its full path, e.g. /usr/local/rosetta3.5/bin/AbinitioRelax.default.linuxgccrelease, with the following flags:

-in:file:fasta your_seq.fasta

-in:file:frag3 /PATH/TO/FRAG3_FILE

-in:file:frag9 /PATH/TO/FRAG9_FILE

-database /PATH/TO/rosetta_database

-abinitio:relax

-relax:fast

-psipred_ss2 t000_.psipred_ss2.txt

-out:pdb

-abinitio::increase_cycles 10

-abinitio::rg_reweight 0.5

-abinitio::rsd_wt_helix 0.5

-abinitio::rsd_wt_loop 0.5

-out:file:silent name_silent.out

-nstruct X

The value of the nstruct flag depends on the number of structures you want to generate.
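As a rough sketch, the flag list above can be assembled and launched from Python. The function names and the example paths are placeholders of mine; the flags and their values are exactly those listed above, and I have not checked them against any particular Rosetta release.

```python
import shlex
import subprocess


def abinitio_command(exe, fasta, frag3, frag9, database, ss2, silent_out, nstruct):
    """Return the AbinitioRelax argument list using the flags from this post."""
    return [
        exe,
        "-in:file:fasta", fasta,
        "-in:file:frag3", frag3,
        "-in:file:frag9", frag9,
        "-database", database,
        "-abinitio:relax",
        "-relax:fast",
        "-psipred_ss2", ss2,
        "-out:pdb",
        "-abinitio::increase_cycles", "10",
        "-abinitio::rg_reweight", "0.5",
        "-abinitio::rsd_wt_helix", "0.5",
        "-abinitio::rsd_wt_loop", "0.5",
        "-out:file:silent", silent_out,
        "-nstruct", str(nstruct),
    ]


def run_abinitio(cmd):
    """Launch the run; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths: replace them with your own files before running.
    cmd = abinitio_command(
        exe="/usr/local/rosetta3.5/bin/AbinitioRelax.default.linuxgccrelease",
        fasta="your_seq.fasta",
        frag3="/PATH/TO/FRAG3_FILE",
        frag9="/PATH/TO/FRAG9_FILE",
        database="/PATH/TO/rosetta_database",
        ss2="t000_.psipred_ss2.txt",
        silent_out="name_silent.out",
        nstruct=100,
    )
    print(shlex.join(cmd))  # inspect the command before calling run_abinitio(cmd)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.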

This Ab initio protocol works well for small monomeric proteins of about 100 residues or fewer. For larger proteins you will need to generate between 6,000 and 20,000 models (nstruct) to cover the larger conformational space, which requires more computing power.

Example result from an Ab initio run. The protein is the B20 antibody; the model of the variable region is shown in yellow.

The following references provide information relevant to the sampling problem:

Bradley P, Misura KM, Baker D (2005). Toward high-resolution de novo structure prediction for small proteins. Science 309, 1868-71.
Kim DE, Blum B, Bradley P, Baker D (2009). Sampling bottlenecks in de novo protein structure prediction. J Mol Biol 393, 249-60.

More information is available in the Rosetta Commons documentation.

November 10, 2013

Install Dowser Program

"The Dowser program surveys a protein molecule's structure to locate internal cavities and assess the hydrophilicity of these cavities in terms of the energy of interaction of a water molecule with the surrounding atoms."

To install Dowser on a Linux machine, you need the gfortran compiler. Dowser calls the g77 compiler, which is now obsolete. A workaround is to create a copy of gfortran named g77 in /usr/bin (or wherever gfortran is installed):

$ sudo cp /usr/bin/gfortran /usr/bin/g77

The next step is to download the Dowser program from:

http://danger.med.unc.edu/hermans/dowser/dowser.htm

Extract the file. A folder named dowser will appear. Then:

$ cd dowser/

$ sudo ./Install

The Dowser program will begin the install process. When it ends, you need to source a file called dowserinit. That file creates the variables DOWSER and DOW_MACH and adds the location of the Dowser program to your PATH. The file is written in csh:

setenv DOWSER /tmp_mnt/PROGRAMS/DOWSER

setenv DOW_MACH sgi-mips4

set path = ( $path $DOWSER/bin $DOWSER/bin/$DOW_MACH )

I tried to source that file from bash, but it failed because the setenv command could not be found. I tried marking the file as csh by putting #! /bin/csh -f at the top, but the commands still could not be found: source runs the file in the current bash shell, so the shebang line is ignored. So I converted the file to "bash style":

#!/bin/bash

export DOWSER=/mnt/hgfs/Python/exe_linux/dowser

export DOW_MACH=linux

export PATH=$PATH:$DOWSER/bin:$DOWSER/bin/$DOW_MACH
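The syntax change above is mechanical, so here is a minimal Python sketch of it. It only handles the two constructs that appear in dowserinit (setenv and set path), the function name csh_to_bash is my own, and it does not change the values: you still have to edit DOWSER and DOW_MACH to match your machine, as I did above.

```python
import re

# Matches csh lines of the form: set path = ( $path dir1 dir2 )
SET_PATH = re.compile(r"set\s+path\s*=\s*\(\s*(.*?)\s*\)$")


def csh_to_bash(lines):
    """Translate simple csh 'setenv' / 'set path' lines into bash exports."""
    out = ["#!/bin/bash"]
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and any old shebang
        parts = line.split()
        if parts[0] == "setenv" and len(parts) == 3:
            out.append(f"export {parts[1]}={parts[2]}")
            continue
        match = SET_PATH.match(line)
        if match:
            # csh keeps the search path in $path; bash uses colon-separated $PATH
            entries = ["$PATH" if e == "$path" else e for e in match.group(1).split()]
            out.append("export PATH=" + ":".join(entries))
            continue
        raise ValueError(f"cannot translate line: {line!r}")
    return out


if __name__ == "__main__":
    original = [
        "setenv DOWSER /tmp_mnt/PROGRAMS/DOWSER",
        "setenv DOW_MACH sgi-mips4",
        "set path = ( $path $DOWSER/bin $DOWSER/bin/$DOW_MACH )",
    ]
    print("\n".join(csh_to_bash(original)))
```

Any line the sketch does not recognize raises an error rather than being silently dropped, so nothing goes missing from the converted file unnoticed.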

The final step is to source the dowserinit file with:

$ source dowserinit

That's all. The Dowser program should now work. More information at:

http://danger.med.unc.edu/hermans/dowser/dowser.htm
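To confirm that sourcing dowserinit worked, you can check the environment from a Python session started in the same shell. This is only a sketch: the function name is hypothetical, and it checks nothing beyond the two variables and the two PATH entries that dowserinit is described as setting.

```python
import os


def dowser_env_problems(env=None):
    """Return a list of problems with the Dowser environment (empty if fine)."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in ("DOWSER", "DOW_MACH") if not env.get(name)]
    if problems:
        return problems
    path_entries = env.get("PATH", "").split(os.pathsep)
    wanted = (
        os.path.join(env["DOWSER"], "bin"),
        os.path.join(env["DOWSER"], "bin", env["DOW_MACH"]),
    )
    for directory in wanted:
        if directory not in path_entries:
            problems.append(f"{directory} is not on PATH")
    return problems


if __name__ == "__main__":
    issues = dowser_env_problems()
    print("Dowser environment looks fine" if not issues else "\n".join(issues))
```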

October 22, 2013

Rank an Output File of Ligands

Sometimes, after docking your protein of interest against a set of ligands, you need to rank the ligands by interaction energy. Dock6 has an input-file option for this called rank_ligands; setting it to "yes" gives you ranked ligands. But in some cases (as happened to me) you forget to set that option.

I wrote a Python script that ranks as many ligands as you want (e.g. the top 1000) and writes an output file containing only those top ligands. It is somewhat limited: it only accepts files in mol2 format and can only order molecules obtained from the ZINC database.

How the program works

The program asks you two questions. The first is the name of your file. Write it without the extension, because the program assumes the file is in mol2 format. The second is the number of ligands you want to rank. If you want the best 1000 ligands, just write 1000.
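The idea behind the ranking can be sketched as follows. This is not the RankDock.py code itself, only an illustration under assumptions of mine: each molecule in the docked mol2 file is preceded by comment lines starting with "##########", one with a "Name:" field and one with a score field. The score label ("Grid_Score" here) is a guess and differs between Dock versions, so it is a parameter.

```python
import re

NAME_LINE = re.compile(r"#+\s*Name:")


def split_records(text):
    """Split a multi-molecule file into records, each starting at a Name line."""
    records, current = [], []
    for line in text.splitlines():
        if NAME_LINE.match(line):
            if any(item.strip() for item in current):
                records.append(current)
            current = []
        current.append(line)
    if any(item.strip() for item in current):
        records.append(current)
    return records


def score_of(record, label="Grid_Score"):
    """Read the score from the header comment lines of one record."""
    pattern = re.compile(r"#+\s*" + re.escape(label) + r":\s*(-?\d+(?:\.\d+)?)")
    for line in record:
        match = pattern.match(line)
        if match:
            return float(match.group(1))
    raise ValueError("record has no '%s' header line" % label)


def top_ligands(text, n, label="Grid_Score"):
    """Return the n best records as text, most negative (best) score first."""
    ranked = sorted(split_records(text), key=lambda rec: score_of(rec, label))
    return ["\n".join(rec) for rec in ranked[:n]]


def rank_file(basename, n, label="Grid_Score"):
    """Read basename.mol2 and write the top n ligands to basename_top.mol2."""
    with open(basename + ".mol2") as handle:
        best = top_ligands(handle.read(), n, label)
    with open(basename + "_top.mol2", "w") as handle:
        handle.write("\n".join(best) + "\n")
```

For example, rank_file("docked_ligands", 1000) would read docked_ligands.mol2 and keep the 1000 best-scoring molecules; both the file name and the "_top" suffix are placeholders.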

Supported format for RankDock

RankDock.py

October 19, 2013

Simple Docking Calculations with Dock6

Molecular docking is a powerful tool that lets us analyze the orientation of a ligand or protein against a target, which is usually a protein.

Docking of a ligand into the cavity of a protein.

There are several programs for docking calculations. I have personally used Dock6. This program takes three initial files: the receptor in .mol2 format, the ligand or ligands in .mol2 format, and a receptor surface file in .dms format. The mol2 files must include hydrogens and charges. For this I used Chimera and converted my PDB files as shown here (http://dock.compbio.ucsf.edu/DOCK_6/tutorials/struct_prep/prepping_molecules.html). The dms file can be built with the WriteDMS program included in Chimera.
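A quick way to check that a mol2 file really carries hydrogens and charges is to look at its @<TRIPOS>ATOM section. The sketch below assumes the standard Tripos atom line layout (id, name, x, y, z, atom type, then optional substructure id, substructure name and charge); the function names are my own and are not part of Dock6 or Chimera.

```python
def read_atoms(mol2_text):
    """Yield (atom_type, charge) for every line of the @<TRIPOS>ATOM sections."""
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 6:
            # The charge is the ninth column and may be absent.
            charge = float(fields[8]) if len(fields) >= 9 else None
            yield fields[5], charge


def check_mol2(mol2_text):
    """Return (has_hydrogens, has_charges) for the given mol2 text."""
    atoms = list(read_atoms(mol2_text))
    has_hydrogens = any(atom_type.split(".")[0] == "H" for atom_type, _ in atoms)
    has_charges = any(charge is not None and charge != 0.0 for _, charge in atoms)
    return has_hydrogens, has_charges
```

A file whose charges are all zero is reported as having no charges, which is usually a sign that the charge assignment step was skipped.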

With these files, the next steps are to create spheres surrounding the receptor, select a group of spheres that represents the binding site, build a box around the binding site, and generate the grid. The tutorial mentioned above explains each step in detail. I wrote a program called Do_Dock6.py that performs all these steps in a simple, guided way. You only need the receptor and ligand files and the dms file, all generated in Chimera.
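As an illustration of the first step, sphere generation with sphgen is driven by a small input file named INSPH. The sketch below writes one with the layout and default values I remember from the Dock tutorial (surface file, R, X, 0.0, maximum radius, minimum radius, output file); treat that layout as an assumption and compare it with the tutorial before use. The function name is hypothetical and this is not taken from Do_Dock6.py.

```python
from pathlib import Path


def write_insph(directory, dms_name, sph_name, max_radius=4.0, min_radius=1.4):
    """Write an INSPH file for sphgen into directory and return its path."""
    lines = [
        dms_name,         # receptor surface file (.dms) used as input
        "R",              # spheres outside the surface (receptor)
        "X",              # use all surface points
        "0.0",            # minimum distance setting, tutorial default
        str(max_radius),  # maximum sphere radius in angstroms
        str(min_radius),  # minimum sphere radius in angstroms
        sph_name,         # output file for the clustered spheres
    ]
    path = Path(directory) / "INSPH"
    path.write_text("\n".join(lines) + "\n")
    return path
```

Called as write_insph("DOCK", "rec.dms", "rec.sph"), it would write DOCK/INSPH, provided the DOCK folder already exists; the file names are placeholders.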

How the program works

The program asks a series of questions. Answer each one according to the location of your files. I recommend making a folder (e.g. "DOCK/") to keep your mol2 and dms files. After you create your spheres (saved in a folder named "SHOWSPH"), the next question is whether you want to calculate the grid. Before doing this, open all your sphere files (named cluster_1, cluster_2, etc.) and your box files (saved in a folder named "REC_BOX") in a molecular viewer such as PyMOL. Find the cluster and the box that best fit your needs. Then run the program again and answer "N" to the question about sphere generation. When the question about grid calculation appears, answer "Y" and continue.

All box visualization.

Sphere and box visualization.

Do_dock6.py