PDBFixer can be used in three different ways: as a desktop application with a graphical user interface; as a command line application; or as a Python API. This allows you to use it in whatever way best matches your own needs for flexibility, ease of use, and scriptability. The following sections describe how to use it in each of these ways.
conda install -c conda-forge pdbfixer
Alternatively you can install it from source. Download the repository, navigate to the root directory, and typepython setup.py install
This will install the PDBFixer python package as well as the command line program pdbfixer.
Before running PDBFixer, you must first install OpenMM. Follow the installation instructions in the OpenMM manual. It is also recommended that you install CUDA or OpenCL, since the performance will usually be faster than when running on the CPU platform. PDBFixer requires that NumPy be installed. All of these dependencies are included automatically when installing with conda.
pdbfixer
on the command line. PDBFixer displays its user interface through a web browser, but it is still a single user desktop application. It should automatically launch a web browser and open a new window displaying the user interface. If for any reason this does not happen, you can launch a web browser yourself and point it to http://localhost:8000.
The user interface consists of a series of pages for selecting a PDB or PDBx/mmCIF file and choosing what changes to make to it. Depending on the details of a particular file, some of these pages may be skipped.
This page also gives options for removing heterogens. A heterogen is any residue other than a standard amino acid or nucleotide. The possible options are to keep all heterogens; keep only water while deleting all other heterogens; or delete all heterogens.
To get usage instructions for the command line interface, type
pdbfixer --help
This displays the following information:
Usage: pdbfixer
pdbfixer filename [options]
When run with no arguments, it launches the user interface. If any arguments are specified, it runs in command line mode.
Options:
-h, --help show this help message and exit
--pdbid=PDBID PDB id to retrieve from RCSB [default: None]
--url=URL URL to retrieve PDB from [default: None]
--output=FILENAME output pdb file [default: output.pdb]
--add-atoms=ATOMS which missing atoms to add: all, heavy, hydrogen, or
none [default: all]
--keep-heterogens=OPTION
which heterogens to keep: all, water, or none
[default: all]
--replace-nonstandard
replace nonstandard residues with standard equivalents
--add-residues add missing residues
--water-box=X Y Z add a water box. The value is the box dimensions in nm
[example: --water-box=2.5 2.4 3.0]
--ph=PH the pH to use for adding missing hydrogens [default:
7.0]
--positive-ion=ION positive ion to include in the water box: Cs+, K+,
Li+, Na+, or Rb+ [default: Na+]
--negative-ion=ION negative ion to include in the water box: Cl-, Br-,
F-, or I- [default: Cl-]
--ionic-strength=STRENGTH
molar concentration of ions to add to the water box
[default: 0.0]
--verbose Print verbose output
For example, consider the following command line:
pdbfixer --keep-heterogens=water --replace-nonstandard --water-box=4.0 4.0 3.0 myfile.pdb
This will load the file "myfile.pdb". It will add any missing atoms to existing residues, but not add any missing residues (because we did not specify --add-residues). Hydrogens will be added that are appropriate at pH 7.0 (the default). If the file contains any nonstandard amino acids or nucleotides, they will be replaced with the closest equivalent standard ones. Any water molecules present in the file will be kept, but all other heterogens will be deleted. Finally a water box of size 4 by 4 by 3 nanometers will be added surrounding the structure. If necessary, counterions will be added to neutralize it (Na+ or Cl-), but no other ions will be added (because we accepted the default ionic strength of 0.0). After making all these changes, the result will be written to a file called "output.pdb".
PDBFixer is based on OpenMM, and to use its Python API you should first be familiar with the OpenMM API. Consult the OpenMM documentation for details. In everything that follows, I will assume you are already familiar with OpenMM.
To use PDBFixer create a PDBFixer object, passing to its constructor the file to process. You then call a series of methods on it to perform various transformations. When all the transformations are done, you can get the new structure from its topology and positions fields. The overall outline of your code will look something like this:
fixer = PDBFixer(filename='myfile.pdb')
# ...
# Call various methods on the PDBFixer
# ...
PDBFile.writeFile(fixer.topology, fixer.positions, open('output.pdb', 'w'))
There are many different methods you might call in the middle, depending on what you want to do. These methods are described below. Almost all of them are optional, but they must be called in exactly the order listed. You may choose not to call some of the optional methods, but you may not call them out of order.
fixer.removeChains(indices)
where indices is a list of the indices of the chains to remove.
fixer.findMissingResidues()
This method identifies any residues in the SEQRES records for which no atom data is present. The residues are not actually added yet: it just stores the results into the missingResidues field. This is a dictionary. Each key is a tuple consisting of the index of a chain, and the residue index within that chain at which new residues should be inserted. The corresponding value is a list of the names of residues to insert there. After calling this method, you can modify the content of that dictionary to specify what residues should actually be inserted. For example, you can remove an entry to prevent that set of residues from being inserted, or change the identities of the residues to insert. If you do not want any residues at all to be added, you can just write
fixer.missingResidues = {}
The residues actually get added when you call addMissingAtoms(), described below.
fixer.findNonstandardResidues() fixer.replaceNonstandardResidues()findNonstandardResidues() stores the results into the nonstandardResidues field, which is a list. Each entry is a tuple whose first element is a Residue object and whose second element is the name of the suggested replacement residue. Before calling replaceNonstandardResidues() you can modify the contents of this list. For example, you can remove an entry to prevent a particular residue from being replaced, or you can change what it will be replaced with. You can even add new entries if you want to mutate other residues.
fixer.removeHeterogens(False)
The argument specifies whether to keep water molecules. False removes all heterogens including water. True keeps water molecules while removing all other heterogens.
fixer.findMissingAtoms() fixer.addMissingAtoms()findMissingAtoms() identifies all missing heavy atoms and stores them into two fields called missingAtoms and missingTerminals. Each of these is a dictionary whose keys are Residue objects and whose values are lists of atom names. missingAtoms contains standard atoms that should be present in any residue of that type, while missingTerminals contains missing terminal atoms that should be present at the start or end of a chain. You are free to remove atoms from these dictionaries before continuing, if you want to prevent certain atoms from being added.
addMissingAtoms() is the point at which all heavy atoms get added. This includes the ones identified by findMissingAtoms() as well as the missing residues identified by findMissingResidues(). Also, if you used replaceNonstandardResidues() to modify any residues, that will have removed any atoms that do not belong in the replacement residue, but it will not have added ones that are missing from the original residue. addMissingAtoms() is the point when those get added.
fixer.addMissingHydrogens(7.0)
The argument is the pH based on which to select protonation states.
addSolvent(self, boxSize, positiveIon='Na+', negativeIon='Cl-', ionicStrength=0*molar)
boxSize is a Vec3 object specifying the dimensions of the water box to add. The other options specify the types of ions to add and the desired ionic strength. Allowed values for positiveIon are Cs+, K+, Li+, Na+, and Rb+. Allowed values for negativeIon are Cl-, Br-, F-, and I-. For example, the following line builds a 5 nm cube of 0.1 molar potassium chloride:
fixer.addSolvent(Vec3(5, 5, 5)*nanometer, positiveIon='K+', ionicStrength=0.1*molar)
addMembrane(self, lipidType='POPC', membraneCenterZ=0*nanometer, minimumPadding=1*nanometer, positiveIon='Na+', negativeIon='Cl-', ionicStrength=0*molar)
lipidType should be either 'POPC' or 'POPE'. membraneCenterZ is the position along the Z axis at which the center of the membrane will be located. minimumPadding is the minimum acceptable distance between the protein and the edges of the periodic box. All other arguments are the same as for addSolvent(). For example, the following line adds a POPE membrane centered at Z=0 with at least 1 nm of padding on all sides:
fixer.addMembrane('POPE')
from pdbfixer import PDBFixer from openmm.app import PDBFile fixer = PDBFixer(filename='myfile.pdb') fixer.findMissingResidues() fixer.findNonstandardResidues() fixer.replaceNonstandardResidues() fixer.removeHeterogens(True) fixer.findMissingAtoms() fixer.addMissingAtoms() fixer.addMissingHydrogens(7.0) fixer.addSolvent(fixer.topology.getUnitCellDimensions()) PDBFile.writeFile(fixer.topology, fixer.positions, open('output.pdb', 'w'))Suppose you want to keep only the first chain in the file and remove all the others. To do this, call removeChains():
fixer = PDBFixer(filename='myfile.pdb') numChains = len(list(fixer.topology.chains())) fixer.removeChains(range(1, numChains)) fixer.findMissingResidues()Suppose you only want to add missing residues in the middle of a chain, not ones at the start or end of the chain. This can be done by modifying the missingResidues field immediately after calling findMissingResidues():
fixer.findMissingResidues() chains = list(fixer.topology.chains()) keys = fixer.missingResidues.keys() for key in keys: chain = chains[key[0]] if key[1] == 0 or key[1] == len(list(chain.residues())): del fixer.missingResidues[key] fixer.findNonstandardResidues()Suppose that instead of matching the size of the crystallographic unit cell, you want to add a cubic water box large enough to hold the entire system. That is, you want to compute the range of atom positions along each axis, then set the size of the water box to match the largest axis. This can be done as follows:
fixer.addMissingHydrogens(7.0) maxSize = max(max((pos[i] for pos in fixer.positions))-min((pos[i] for pos in fixer.positions)) for i in range(3)) boxSize = maxSize*Vec3(1, 1, 1) fixer.addSolvent(boxSize) PDBFile.writeFile(fixer.topology, fixer.positions, open('output.pdb', 'w'))Suppose you want to replace all nonstandard residues by alanine. findNonstandardResidues() selects a suggested replacement for each residue, but you can modify that before calling replaceNonstandardResidues():
fixer.findNonstandardResidues() fixer.nonstandardResidues = [(residue, 'ALA') for residue, replacement in fixer.nonstandardResidues] fixer.replaceNonstandardResidues()