DOCK 6.12 Users Manual

Principal contributors to the current code:

William Joseph Allen (TACC)
Trent Balius (FNLCR)
John Bickel (SUNY-Stony Brook)
Brock Boysan (SUNY-Stony Brook)
Scott R. Brozell (Rutgers University)
Chris Corbo (SUNY-Stony Brook)
Brian Fochtman (SUNY-Stony Brook)
Lingling Jiang (Columbia University)
P. Therese Lang (UCB)
Guilherme D. R. Matos (SUNY-Stony Brook)
T. Dwight McGee Jr. (SUNY-Stony Brook)
Demetri Moustakas (Harvard)
Sudipto Mukherjee (Temple)
Steven Pak (SUNY-Stony Brook)
Lauren Prentis (SUNY-Stony Brook)
Courtney Singleton (SUNY-Stony Brook)
Yuchen Zhou (SUNY-Stony Brook)

Robert Rizzo (SUNY-Stony Brook)
David Case (Rutgers University)
Brian Shoichet (UCSF)
Irwin Kuntz (UCSF)

For more information about previous contributors, please see the History section.

Last updated August 21, 2024

Table of Contents

1. Introduction

1.1. General Overview

1.2. What Can DOCK Do for You

1.3. Installation

1.4. What's New in DOCK 6

1.5. Overview of the DOCK Suite of Programs

2. DOCK

2.1. Overview

2.2. History

2.3. Command-line Arguments

2.4. The Parameter Parser

2.5. Sampling Methods

2.5.1. Rigid and Flexible Ligand Docking

2.5.1.1 Anchor and Grow
2.5.1.2 Identification of Rigid Segments

2.5.1.3 Manual Specifications of Non-Rotatable Bonds

2.5.1.4 Pruning the Conformation Search Tree

2.5.1.5 Time Requirements

2.5.1.6 Growth Tree and Statistics

2.5.1.7 Rigid Body and Flexible Ligand Docking Input Parameters

2.5.2. De novo Design

2.5.2.1. DOCK_DN Input Parameters

2.5.2.2. DOCK_D3N Input Parameters

2.5.3. DOCK_GA: Molecular Evolution using a Genetic Algorithm

2.5.3.1 DOCK_GA Input Parameters

2.5.3.2 DOCK_GA RDKit Input Parameters

2.5.4. Hierarchical DataBase (HDB) Search

2.5.4.1 HDB Input Parameters

2.5.5. Covalent Attach-and-Grow

2.5.5.1 Covalent Input Parameters

2.6. Fragment Library Generation

2.6.1 Input Parameters

2.7. Database Filter

2.7.1 Input Parameters

2.7.2 RDKit Input Parameters

2.8. Ligand RMSD

2.8.1 Input Parameters

2.9. Orienting the Ligand

2.9.1. Sphere Matching

2.9.2. Critical Points

2.9.3. Chemical Matching

2.9.4. Macromolecular Docking

2.9.5. Input Parameters

2.10. Internal Energy Calculation

2.10.5. Input Parameters

2.11. Scoring

2.11.1. Bump Filter

2.11.2. Contact Score

2.11.3. Grid-Based Score

2.11.4. DOCK 3.5 Score

2.11.5. Continuous Score

2.11.6. Zou GB/SA Score

2.11.7. Hawkins GB/SA Score

2.11.8. AMBER Score

2.11.8.1. AMBER Score Binding Energy

2.11.8.2. AMBER Score Receptor Flexibility

2.11.8.3. AMBER Score Inputs

2.11.8.4. AMBER Score Outputs

2.11.8.5. AMBER Score in Practice

2.11.8.6. AMBER Score Parameters

2.11.9. Footprint Score

2.11.10. MultiGrid FPS Score

2.11.11. Pharmacophore Matching Similarity Score

2.11.12. Internal Energy Score

2.11.13. SASA Score

2.11.14. GIST Score

2.11.15. Descriptor Score

2.11.15.1. Tanimoto Score

2.11.15.2. Hungarian Matching Similarity Score

2.11.15.3. Volume Overlap Score

2.12. Minimization

2.13. Miscellaneous Parameters

2.14. Parameter Files

2.14.1. Atom Definition Rules

2.14.2. vdw.defn

2.14.3. chem.defn

2.14.4. chem_match.tbl

2.14.5. ph4.defn

2.14.6. flex.defn

2.14.7. flex_drive.tbl

2.15. Parallel Processing

3. Accessories

3.1. Grid

3.1.1. Overview

3.1.2. Bump Checking

3.1.3. Contact Scoring

3.1.4. Energy Scoring

3.2. Docktools

3.2.1. Chemgrid

3.2.2. Ligand Desolvation

3.2.3. Occupancy Desolvation

3.2.4. Grid Conversion

3.3. Nchemgrids

3.4. Sphgen

3.4.1. Overview

3.4.2. Critical Points

3.4.3. Chemical Matching

3.4.4. Output

3.5. Showbox

3.6. Showsphere

3.7. Sphere Selector

3.8. Antechamber

3.9. tLEaP

3.10. Amber Score Preparation Scripts

4. Molecular File Formats

4.1. Tripos MOL2 Format

4.2. PDB Format

4.3. BILD Format

5. References

6. Acknowledgments

1. Introduction

1.1. General Overview

DOCK is a molecular modeling program used to identify potential binding geometries and interactions of a molecule with a target. Specifically, docking is the identification of the low-energy binding modes of a small molecule, or ligand, within the active site of a macromolecule, or receptor, whose structure is known. A compound that interacts strongly with, or binds, a receptor associated with a disease may inhibit its function and thus act as a drug. Solving the docking problem computationally requires an accurate representation of the molecular energetics as well as an efficient algorithm to search the potential binding modes.

Historically, the DOCK algorithm addressed rigid body docking using a geometric matching algorithm to superimpose the ligand onto a negative image of the binding pocket. Important features that improved the algorithm's ability to find the lowest-energy binding mode, including force-field based scoring, on-the-fly optimization, an improved matching algorithm for rigid body docking and an algorithm for flexible ligand docking, have been added over the years. For more information on past versions of DOCK, see the History section (2.2).

With the release of DOCK 6, we continue to improve the algorithm's ability to predict ligand binding poses by adding new features like force-field scoring, enhanced solvation models, reference-based scoring options, and de novo design. For more information about the current release of DOCK, see What's New in DOCK 6 (Section 1.4).

1.2. What Can DOCK Do for You

We and others have used DOCK for the following applications:

predict binding modes of small molecule-protein complexes

search databases of ligands for compounds that mimic the inhibitory binding interactions of an experimentally validated inhibitor

search databases of ligands for compounds that bind a particular site of a specific protein

search databases of ligands for compounds that bind nucleic acid targets

examine possible binding orientations of protein-protein and protein-DNA complexes

help guide synthetic efforts by examining small molecules that are computationally derived

many more...

1.3. Installation

DOCK is Unix-based scientific software and follows a common installation recipe: download, unpack, configure, build, and test. The simple configuration scheme of DOCK is based on plain text files. Building and testing employ the make command. DOCK installation is so simple and transparent that users have a reasonable chance of correcting problems themselves.

Start with a plain serial installation. Follow the detailed steps (0 through 5) enumerated below. The appropriate configuration option is likely gnu; see step 3. Subsequently, additional executables can be installed for parallel processing; see step 6. (Here is a quick start for an example gnu serial and parallel installation: cd install; ./configure gnu; make install; make dockclean; ./configure gnu.parallel; setenv MPI_HOME /bla; make dock; make test;)

If problems occur then read the diagnostics carefully and apply the scientific method. Most initial installation problems are due to unavailable or fouled tools, especially compilers; verify that your compilers work before installing DOCK. To observe what's under the hood, view the DOCK configuration file (install/config.h) that is created by configure, especially its troubleshooting section that describes corrective measures for common difficulties. Execute make -n for a dry run. Platform idiosyncrasies can and should be corrected by editing the install/config.h, as opposed to editing the original source file, e.g., install/gnu. Consult the FAQ. Search the DOCK-Fans mailing list archive.

General use of DOCK does not require setting environment variables. However, we recommend the name DOCK_HOME for referencing DOCK6. During installation, paths to some critical locations are hard coded. In version 6.12, some auxiliary scripts were added in directory template_pipeline that do employ DOCK_HOME.

NOTE FOR WINDOWS USERS: DOCK and its accessories must be run using a Unix-like environment such as Cygwin ( http://www.cygwin.com/ ). We recommend a full Unix installation. In particular, when you install your emulator, make sure to also install compilers, Unix shells, and perl ( Devel for Cygwin ). All steps below should be performed using Cygwin or another Unix emulator for Windows. See also the DOCK wiki entry for Cygwin.

(0) Check for Bugfixes online.

(1) Unpack the distribution using the following command:

[user@dock ~] tar -zxvf dock.6.12.tar.gz

(2) Enter the installation directory:

[user@dock ~] cd dock6/install

(3) Configure the Makefile for the appropriate operating system:

[user@dock ~] ./configure [configuration file]

AUTHOR: Scott Brozell

USAGE: configure [-help] [configuration file]

OPTIONS: Notable ones are listed below; for a complete list see the configure -help output.
-help #emit the usage statement
configuration file #input file containing operating system appropriate variables

Configuration Files       Target

gnu                       GNU compilers
gnu.acml                  recent GNU compilers and ACML
gnu.parallel              GNU compilers with parallel processing capability
gnu.parallel.rdkit        GNU compilers with parallel processing capability and RDKit capability
gnu.rdkit                 GNU compilers with RDKit capability
homebrew                  GNU compilers installed on macOS using the Homebrew package manager
homebrew.rdkit            GNU compilers installed on macOS using the Homebrew package manager with RDKit capability
ibmaix                    IBM AIX and native compilers
intel                     Intel compilers
intel.mkl                 Intel compilers and MKL
intel.parallel            Intel compilers with parallel processing capability
intel.intelmpi.parallel   Intel compilers with parallel processing capability (specific to Intel MPI)
pgi                       PGI compilers
sgi                       SGI native compilers

DESCRIPTION:
Create the DOCK configuration file, config.h, by copying an existing configuration file that is selected using the arguments. When invoked without arguments, print this usage statement and if the configuration file exists then print its creation stamp. Some configuration files require that environment variables be defined; these requirements are listed in the files and emitted by configure. Note that as of version 6.6 gfortran is the default Fortran compiler in the gnu config files (replacing g77). In the unlikely case that another Fortran compiler is desired, simply hand edit install/config.h to use the alternative.

(4) Build the DOCK executables via the following command:

[user@dock ~] make all # builds all the DOCK programs

Finer control over which executables are built is available, but is rarely necessary, via one of the following commands:

[user@dock ~] make dock # builds only the dock program
[user@dock ~] make utils # builds only the accessory programs

(5) Test the built executables via this command:

[user@dock ~] make test

The test directory contains the DOCK quality control (QC) suite. It produces pass/fail results via fast regression tests. The suite should complete in less than ten minutes; five minutes is typical. Failed tests should be examined to determine their significance. The make test command from the install directory is a shortcut for this sequence: cd test; make test; make check. The make check command executed from the test directory emits all the differences uncovered during testing. The make clean command executed from the test directory removes all files produced during testing; this command is automatically executed by the main make test command above; however, to run tests from a subdirectory of the test directory, one should explicitly execute make clean.
NOTE: Some failures are not significant. For example, differences in the tails of floating point numbers may not be significant. The sources of such differences are frequently platform dependencies from computer hardware, operating systems, and compilers that impact arithmetic precision and random number generators. In addition, the reference outputs as of version 6.4 are from a 64-bit platform and as of version 6.10 use gfortran gcc version 7.5.0, and this can cause spurious failures on 32-bit platforms or with other compilers; in particular, differing numbers of Orientations or Conformations and different Contact or Grid scores. We are working on increasing the QC suite's resilience to these issues. For now, apply common sense and good judgment to determine the significance of a possible failure. Note that a small number of failures is rarely an indication of real problems, but if almost every test fails then something is amiss.
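
As an illustration of the judgment described in the note above, the following minimal Python sketch flags only those score differences that exceed a tolerance. It is not part of the DOCK QC suite; the function names and the tolerance values are hypothetical and chosen only for the example.

```python
import math

def scores_agree(reference, observed, rel_tol=1e-4, abs_tol=1e-3):
    """Return True when two scores differ only in their trailing digits.

    rel_tol and abs_tol are illustrative assumptions, not DOCK settings.
    """
    return math.isclose(reference, observed, rel_tol=rel_tol, abs_tol=abs_tol)

def significant_differences(reference_scores, observed_scores):
    """List (index, reference, observed) for pairs that differ beyond tolerance."""
    return [
        (i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference_scores, observed_scores))
        if not scores_agree(ref, obs)
    ]

# Made-up scores: the first pair differs only in the sixth decimal place,
# the last pair differs substantially.
reference = [-45.123456, -30.000100, -12.5]
observed = [-45.123461, -30.000100, -11.9]
print(significant_differences(reference, observed))
```

Only the last pair is reported, which mirrors the advice to ignore differences in the tails of floating point numbers while still examining larger deviations.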

Some features of DOCK (DOCK3.5 Score aka ChemGrid Score) require an electrostatic potential map which is usually generated by DelPhi. Testing of these features requires that the environment variable DELPHI_PATH be defined to the full path of the DelPhi executable. DelPhi is not distributed with DOCK; see also Wikipedia. Qnifft may now be used to calculate the electrostatic potential map (as is done with DOCK 3.7 and 3.8), instead of DelPhi. Testing of these features with Qnifft requires that the environment variable QNIFFT_PATH be defined (and not DELPHI_PATH) to the full path of the Qnifft executable (this executable is available with DOCK 3.7).

DOCK with parallel processing capability will be automatically tested by the QC suite if dock6.mpi has been built. The same environment variable, MPI_HOME, needed for compilation should be identically defined for testing. Optionally, the environment variable DOCK_PROCESSES can be set to control how many MPI processes are tested. See step 6 below for details on building dock6.mpi.

(6) OPTIONAL: Additional dock executables.

(i) DOCK with parallel processing capability requires a Message Passing Interface (MPI) library. Because of the vagaries of MPI libraries, building parallel DOCK has more pitfalls than installing the serial version. The MPI library must be installed and running on the system if the parallel features of DOCK are to be used.
The DOCK installation mechanism supports all MPI implementations. The focus on MPICH2 and MPICH was eliminated in version 6.10. Once MPI is installed, define the environment variable MPI_HOME to the top level of the MPI directory. MPI_HOME will be referenced by all stages of the build procedure - from configuration through testing. See the Parallel DOCK section for execution information.

NOTE: The parallel configuration files should support any MPI installation even though they were initially tailored to MPICH. Linking problems, such as undefined references and cannot find libbla_bla, can occur due to idiosyncrasies in the MPI installation. One corrective approach is to use manual linking; edit your config.h to add to the LIBS definition the link flags (-L and -l) from the command: $MPI_HOME/mpicc -show; in general, the LIBS should contain those link flags in the same order.

(ii) As of version 6.11, DOCK can be compiled with RDKit. However, RDKit is not packaged with DOCK. It must be downloaded and built separately to eventually be compiled with DOCK. Importantly, for consistency with published work on DOCK6 Using RDKit, Matos et al. J. Chem. Inf. Model. 2023, we recommend installing RDKit Release 2019.09.1 which depends on Boost 1.71.0 and Eigen 3.3.9. If you already have this RDKit installed then you can build a DOCK executable with support for RDKit by first defining these environment variables (using Bourne shell syntax):

export BOOST="/path/to/boost/root/dir/"
export RDBASE="/path/to/rdkit/root/dir"
export LD_LIBRARY_PATH="${BOOST}/lib:${RDBASE}/lib:${LD_LIBRARY_PATH}"

and then running the following commands for a serial executable (or their equivalent for a parallel executable):

cd install; make distclean; ./configure gnu.rdkit; make dock; make test

If you don't have RDKit installed then there are two approaches. The simplest may be to use the automated recipe supplied by DOCK; see below. The other approach is to install RDKit manually. This path has several options; for instructions please refer to the RDKit website, and consult the "How is RDKit installed?" entry in the DOCK FAQ for a Python-less approach.

The DOCK automated recipe for installing RDKit requires that you have anaconda/miniconda installed. After the installation of anaconda/miniconda, you must export the root path of anaconda/miniconda as "CONDABASE" in your .bashrc or equivalent file, e.g.:

export CONDABASE="/path/to/miniconda3"

Then you can invoke DOCK's recipe for installing RDKit as shown below:
[user@dock ~] cd ./install
[user@dock ~] make rdkit # downloads RDKit and its dependencies, then compiles. Please follow the instructions shown on the screen.

If you need to clean out the RDKit compiled contents, please execute the command below. Once you execute that command then you will not be able to compile DOCK with RDKit unless you rebuild RDKit:

[user@dock ~] make rdkitclean # to clean out everything pertaining to RDKit.

(7) Build the DOCK executables via Docker

As of the 6.12 release, DOCK6 can be compiled in Docker images (Docker has no relation to DOCK6). To start, you must have the Docker engine installed on your local computer. The Docker engine is necessary to generate the images from Docker files and then to run the DOCK6 program and tools in a Docker container.

Instructions on how to Dockerize

cd into install/docker

Run ./dockerize Dockerfile.gnu_w_parallel to start dockerizing the dock6 build.

Then the dock6-suite.docker "binary" will be generated in the bin folder. Please be patient; this will take a while.

Example Usage

Non-interactive mode:

dock6-suite.docker -b "dock6 -i dn.in -o dn.out"

dock6-suite.docker -b "grid -i grid.in -o grid.out"

dock6-suite.docker -b "dock6 -i dn.in -o dn.out" -v /PATH/TO/FILE1 -v ../../PATH/TO/FILE2 -v /PATH/TO/DIR

dock6-suite.docker -b "mpirun -n 4 dock6.mpi -i dn.in -o dn.out"

Interactive mode:

dock6-suite.docker

dock6-suite.docker -v /PATH/TO/FILE1 -v ../../PATH/TO/FILE2 -v /PATH/TO/DIR

Usage flags

Usage: dock6-suite.docker [ -h help ] [ -b BINARY CALL WITH ARGUMENTS ] [ -v mounting files and folders ]
Interactive Mode Usage: dock6-suite.docker [ -h help ] [ -v mounting files and folders ]

Available binary calls
am1bcc
amberize_complex
amberize_ligand
amberize_receptor
antechamber
atomtype
bondtype
chemgrid
dock6
dock6.mpi
espgen
grid
grid-convert
grid-convrds
make_phimap
mopac
nchemgrid_GB
nchemgrid_SA
parmcal
parmchk
prepgen
resp
respgen
sevsolv
showbox
showsphere
solvgrid
solvmap
sphere_selector
sphgen
teLeap
tleap
mpirun

Caveats

When you use the -v flag, you have to repeat the flag for every file you want to mount. If you want to mount multiple files in a folder, you can target the folder path.

By default, the directory from which you call dock6-suite.docker will be mounted to the container.

For non-interactive mode, make sure that input parameters that reference input files contain just the name of the file. This is because the file is mounted to the path /app/workspace in the Docker container. Alternatively, you can prepend /app/workspace to the names of all files you mounted.

If you would like to use logical paths, you can use interactive mode to mount folders and files. Then you can write your own dock6 input file with logical paths.

In non-interactive mode, you MUST enclose all binary calls within "".

Currently, calling dock6-suite.docker -b "dock6 -i dock6.in" won't allow you to input parameters manually one by one. The dock6.in file must be filled out first. If you want this behavior, you must use interactive mode.

The wrapper functions that hide the Docker calls assume that your host environment is a Linux-based system, because they require #!/bin/bash in your environment.
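
The path caveat above can be sketched in a few lines of Python. This is only an illustration under the assumption that each mounted file or folder appears directly under /app/workspace by its base name; the helper name container_path is hypothetical and is not part of the DOCK wrapper scripts.

```python
import posixpath

# Mount point stated in the caveats above.
CONTAINER_WORKSPACE = "/app/workspace"

def container_path(host_path):
    """Map a host path given to -v to its assumed location in the container.

    Assumption: the mounted item appears under /app/workspace by base name.
    """
    name = posixpath.basename(host_path.rstrip("/"))
    return posixpath.join(CONTAINER_WORKSPACE, name)

for host in ["/PATH/TO/FILE1", "../../PATH/TO/FILE2", "/PATH/TO/DIR/"]:
    print(host, "->", container_path(host))
```

Under that assumption, an input parameter in a dock6 input file would refer either to the bare file name or to the path returned by this helper.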

1.4. What's New in DOCK 6

Version 6.0

The new features of DOCK 6 include: additional scoring options during minimization; DOCK 3.5 scoring, including DelPhi electrostatics, ligand conformational entropy corrections, ligand desolvation, and receptor desolvation; Hawkins-Cramer-Truhlar GB/SA solvation scoring with optional salt screening; PB/SA solvation scoring; AMBER scoring, including receptor flexibility; the full AMBER molecular mechanics scoring function with implicit solvent; and conjugate gradient minimization and molecular dynamics simulation capabilities.

Version 6.1

The newly added features for this incremental release of DOCK 6 include a new pruning algorithm during the anchor-and-grow algorithm, a distance-based movable region and a mildly performance optimized nothing movable region for AMBER score, cleaner output and more complete output files for AMBER score, the ability to perform ranking and/or clustering on ligands between primary and secondary scoring, and more dynamic output when secondary scoring is employed.

Version 6.2

The newly added features for this incremental release of DOCK 6 include greater control over the output of conformations, improved memory efficiency for grid reading, a distance dependent dielectric control for continuous score, and for AMBER score better error reporting and robustness of the preparation scripts, a metal ions library, a cofactor library, a hook for a user library, support for RNA receptors, a minimization convergence criterion control, and the ability to skip inadequately prepped ligands.

Version 6.3

The newly added features for this incremental release of DOCK 6 include more robust input file processing, support for OpenEye Toolkits version 1.7.0 for PB/SA score, and for AMBER score improved support for RNA receptors, the option to use the existing ligand charges during preparation, and better error reporting and robustness of the preparation scripts. In particular, for AMBER scoring of RNA receptors, the distance movable region can be applied with explicit waters and the preparation can neutralize to a total charge of zero and can solvate with water. See Graves et al., 2008.

Version 6.4

The newly added features for this incremental release of DOCK 6 include: resolving ligand internal clashes of flexible ligands (more than seven rotatable bonds) by inclusion of an internal energy function at all stages of growth; an ability to output growth trees as multi mol2 files; printing of growth statistics in the dock output file; restrained minimization with an RMSD tether; and a torsion pre-minimizer.

Version 6.5

The newly added features for this incremental release of DOCK 6 include the following. An anchor can now be chosen by specifying an atom in that fragment. In addition, the number of anchors used can be limited during multi-anchor docking. A new scoring function called footprint score (the old descriptor score) has been introduced, which includes a hydrogen bond term and footprint similarity scoring (see Balius et al.). PB/SA score has undergone some generalizations and efficiency improvements that make docking, as opposed to rescoring, more tractable for nontrivial systems. For AMBER score the cofactors library, leaprc.dock.cofactors, and the ions library, leaprc.dock.ions, have grown substantially.

Version 6.6

The newly added features for this incremental release of DOCK 6 include a new grid-based footprint scoring function, a SASA-based scoring function, calculation of RMSD using the Hungarian Algorithm, and inclusion of orienting statistics.

Version 6.7

This incremental release of DOCK 6 includes updates to the default values for several input parameters based mainly on a performance assessment by Allen et al. using large data sets and employing multiple metrics.

Version 6.8

The newly added features for this incremental release of DOCK 6 include a new pharmacophore-based similarity scoring function by Jiang et al., a Tanimoto scoring function, a Hungarian Matching Similarity scoring function, a volume-based similarity scoring function, and a hybrid "descriptor" score to combine component scoring functions in DOCK.

Version 6.9

New features include an enhanced chemical searching method termed de novo DOCK (DOCK_DN), which is a de novo design method that can be used to construct molecules from scratch or to modify existing molecular frameworks (see Allen et al.). In addition, a new fragment library generator function was added to the docking protocol, and is accessible through flexible ligand docking.

Version 6.10

New features include: an enhanced chemical searching method termed molecular evolution DOCK (DOCK_GA), which is an evolution-based method for ligand construction that employs principles of breeding and mutations (see Prentis et al.); a new fragment library generation function added to the docking protocol; a simplex minimization step ramping functionality for enhanced speed during docking; a new scoring function (internal energy score) that allows for generation and scoring of molecules without a protein; and a molecular weight smoothing function for de novo design that allows a softer curve of weight distributions in the final ensemble. Secondary score, introduced in 6.1, has been fully removed in this version.

Version 6.11

New features include the integration of the open-source toolkit RDKit with the DOCK6 codebase, allowing users to calculate important drug-based descriptors for molecules. On top of the ability to calculate these descriptors, DOCK6.11 includes an enhanced version of DOCK_DN, termed "descriptor-driven de novo design" (DOCK_D3N). This method allows users to specify descriptors (and their target ranges) to bias the on-the-fly molecular construction. This powerful and flexible routine tailors ligand growth towards desirable regions of chemical space. Further, RDKit is utilized in a wide array of DOCK features (DOCK_GA, Database Filtering, Rigid/Flexible Docking) to calculate descriptors for resulting molecules.

Version 6.12

New features include an implementation of the Hierarchical DataBase (HDB) search method to enable large-scale docking. HDB search allows us to dock db2 files as is done with DOCK3.7 and 3.8. DOCK3.5 Score (also called ChemGrid_Score) has been updated to be compatible with DOCK3.7 and 3.8 scoring; this includes using QNIFFT instead of DelPhi. A new scoring function called GIST_Score (with three scoring options) accounts for receptor desolvation. Both GIST and DOCK3.5 scoring functions are now available in descriptor_score, so they can be combined with other methods within descriptor score (Balius, J. Comput. Chem. 2024). A covalent docking algorithm called attach-and-grow has been added.

DOCK_DN has been rewritten to allow for a "parallel" pruning methodology between each layer, greatly increasing construction efficiency. A final Tanimoto comparison step has been added when molecules are written to file as well, effectively removing duplicate molecules from a given DN run. The output of DOCK_DN has been overhauled to include significantly more information about each step of growth, providing an easy-to-read description of each run. The memory footprint of DN runs has been greatly improved, with pruned molecules being written out to prune_dump files on-the-fly rather than at the end of each layer, or discarded once pruned. An option to Grid Score has been added to instead utilize Ligand Efficiency (Grid Score/# active heavy atoms) as the scoring function during any given run.
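
The Ligand Efficiency option described above divides the Grid Score by the number of active heavy atoms. A minimal Python sketch of that ratio follows; the function name and the example numbers are hypothetical, and the heavy atom count is simply passed in rather than derived the way DOCK derives it.

```python
def ligand_efficiency(grid_score, n_active_heavy_atoms):
    """Return Grid Score divided by the number of active heavy atoms."""
    if n_active_heavy_atoms <= 0:
        raise ValueError("at least one active heavy atom is required")
    return grid_score / n_active_heavy_atoms

# Made-up example: a Grid Score of -42.0 spread over 20 heavy atoms.
print(ligand_efficiency(-42.0, 20))
```

Normalizing by size in this way keeps larger molecules from being favored merely because they make more contacts.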

DOCK6 can now be containerized using Docker. This will help with installing programs that are sometimes difficult to compile. Wrapper functions were created to ease the process of generating Docker images. Further, the user can call the dock6 binary and tools in interactive and non-interactive mode. Note that this feature can only be used if your local environment has the bash scripting environment and the Docker engine. See the Installation section (1.3) for more information.

1.5. Overview of the DOCK Suite of Programs

1.5.1. Programs

The relationship between the main programs in the DOCK suite is depicted in Figure 1. These routines will be described below.

Figure 1. Main programs in the DOCK suite.

The program sphgen identifies the active site, and other sites of interest, and generates the sphere centers that fill the site. It has been described in the original paper (Kuntz et al. J. Mol. Biol. 1982). The program grid generates the scoring grids (Shoichet et al. J. Comp. Chem. 1992 and Meng et al. J. Comp. Chem. 1992). Within the DOCK suite of programs, the program DOCK matches spheres (generated by sphgen) with ligand atoms and uses scoring grids (from grid) to evaluate ligand orientations (Kuntz et al. J. Mol. Biol. 1982 and Shoichet et al. J. Comp. Chem. 1992). Program DOCK also minimizes energy based scores (Meng et al. Proteins 1993).

1.5.2. General Concepts

The DOCK suite of programs is designed to find favorable orientations of a ligand in a “receptor.” It can be subdivided into

(i) those programs related directly to docking of ligands and
(ii) accessory programs.

We limit the discussion in this section to only those programs and methods related to docking a ligand in a receptor. A typical receptor might be an enzyme with a well-defined active site, though any macromolecule may be used (e.g. a structural protein, a nucleic acid strand, a “true” receptor). We’ll use an enzyme as an example in the rest of this discussion.

The starting point of all docking calculations is generally the crystal or NMR structure of an enzyme from an enzyme-ligand complex. The ligand structure may be taken from the crystal structure of the enzyme-ligand complex or from a database of compounds, such as the ZINC database (Irwin et al. J. Chem. Inf. Model. 2005). The primary consideration in the design of our docking programs has been to develop methods which are both rapid and reasonably accurate. These programs can be separated functionally into roughly two parts, each somewhat independent of the other:

(i) Routines which determine the orientation of a ligand relative to the receptor and
(ii) Routines which evaluate (score) a ligand orientation.

There is a lot of flexibility. You can generate orientations outside of DOCK and score them with the DOCK evaluation functions. Alternatively, you can develop your own scoring routines to replace the functions supplied with DOCK.

The ligand orientation in a receptor site is broken down into a series of steps, in different programs. First, a potential site of interest on the receptor is identified. (Often, the active site is the site of interest and is known a priori.) Within this site, points are identified where ligand atoms may be located. A routine from the DOCK suite of programs identifies these points, called sphere centers, by generating a set of overlapping spheres which fill the site. Rather than using DOCK to generate these sphere centers, important positions within the active site may be identified by some other mechanism and used by DOCK as sphere centers. For example, the positions of atoms from the bound ligand may be used as these sphere centers. Or, a grid may be generated within the site and each grid point may be considered as a sphere center. Our sphere centers, however, attempt to capture shape characteristics of the active site (or site of interest) with a minimum number of points and without the bias of previously known ligand binding modes.

To orient a ligand within the active site, some of the sphere centers are “matched” with ligand atoms. That is, a sphere center is “paired” with a ligand atom. Many sets of these atom-sphere pairs are generated, each set containing only a small number of sphere-atom pairs. In order to limit the number of possible sets of atom-sphere pairs, a longest distance heuristic is used; (long) inter-sphere distances are roughly equal to the corresponding (long) inter-atomic ligand distances. A set of atom-sphere pairs is used to calculate an orientation of the ligand within the site of interest. The set of sphere-atom pairs which are used to generate an orientation is often referred to as a match. The translation vector and rotation matrix which minimize the rmsd of (transformed) ligand atoms and matching sphere centers of the sphere-atom set are calculated and used to orient the entire ligand within the active site.
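
The distance requirement on a match can be illustrated with a short Python sketch. It only checks that every inter-atom distance in a candidate match roughly equals the corresponding inter-sphere distance; it is not DOCK's matching algorithm, and the function name, the tolerance of 0.5, and the coordinates are assumptions made for the example.

```python
import math
from itertools import combinations

def is_compatible_match(pairs, atom_coords, sphere_coords, tolerance=0.5):
    """Check a candidate match given as a list of (atom_index, sphere_index).

    The match passes when each inter-atom distance agrees with the
    corresponding inter-sphere distance to within the (assumed) tolerance.
    """
    for (a1, s1), (a2, s2) in combinations(pairs, 2):
        d_atoms = math.dist(atom_coords[a1], atom_coords[a2])
        d_spheres = math.dist(sphere_coords[s1], sphere_coords[s2])
        if abs(d_atoms - d_spheres) > tolerance:
            return False
    return True

# Made-up coordinates: three ligand atoms and four sphere centers.
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
spheres = [(10.0, 10.0, 10.0), (10.0, 13.2, 10.0),
           (14.1, 10.0, 10.0), (10.0, 10.0, 18.0)]

good_match = [(0, 0), (1, 1), (2, 2)]  # distances agree within tolerance
bad_match = [(0, 0), (1, 1), (2, 3)]   # sphere 3 is too far from sphere 0

print(is_compatible_match(good_match, atoms, spheres))
print(is_compatible_match(bad_match, atoms, spheres))
```

A match that passes this kind of test would then be handed to the least-squares superposition step, which computes the rotation and translation that minimize the rmsd between the paired atoms and sphere centers.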

The orientation of the ligand is evaluated with a shape scoring function and/or a function approximating the ligand-enzyme binding energy. Most evaluations are done on (scoring) grids in order to minimize the overall computational time. At each grid point, the enzyme contributions to the score are stored. That is, receptor contributions to the score, potentially repetitive and time consuming, are calculated only once; the appropriate terms are then simply fetched from memory.

The ligand-enzyme binding energy is taken to be approximately the sum of the van der Waals attractive, van der Waals dispersive, and Coulombic electrostatic energies. Approximations are made to the usual molecular mechanics attractive and dispersive terms for use on a grid. To generate the energy score, the ligand atom terms are combined with the receptor terms from the nearest grid point, or combined with receptor terms from a “virtual” grid point with interpolated receptor values. The score is the sum of these combined terms over all ligand atoms. In this case, the energy score is determined by both ligand atom types and ligand atom positions on the energy grids.
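The two lookup schemes described above (nearest grid point versus an interpolated “virtual” grid point) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the grid is a nested Python list holding a single scalar receptor term on a regular lattice, the query point is assumed to lie strictly inside the grid, and the function names are hypothetical rather than DOCK routines.

```python
from math import floor

def nearest_value(grid, origin, spacing, point):
    """Receptor term stored at the grid point nearest to `point`."""
    idx = [round((p - o) / spacing) for p, o in zip(point, origin)]
    return grid[idx[0]][idx[1]][idx[2]]

def trilinear_value(grid, origin, spacing, point):
    """Receptor term at `point`, interpolated from the 8 surrounding grid points."""
    frac = [(p - o) / spacing for p, o in zip(point, origin)]
    i, j, k = (int(floor(v)) for v in frac)
    tx, ty, tz = frac[0] - i, frac[1] - j, frac[2] - k
    value = 0.0
    # Weight each corner of the enclosing grid cell by its opposite sub-volume.
    for di, wx in ((0, 1 - tx), (1, tx)):
        for dj, wy in ((0, 1 - ty), (1, ty)):
            for dk, wz in ((0, 1 - tz), (1, tz)):
                value += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return value

def score_ligand(grid, origin, spacing, atoms, lookup=trilinear_value):
    """Sum over ligand atoms of (ligand atom term x receptor grid term).

    `atoms` is a list of ((x, y, z), ligand_term) tuples.
    """
    return sum(term * lookup(grid, origin, spacing, xyz) for xyz, term in atoms)
```

Passing `lookup=nearest_value` to `score_ligand` switches to the cheaper nearest-point scheme; the interpolated version trades a little extra arithmetic for a score that varies smoothly with atom position.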

As a final step, in the energy scoring scheme, the orientation of the ligand may be varied slightly to minimize the energy score. That is, after the initial orientation and evaluation (scoring) of the ligand, a simplex minimization is used to locate the nearest local energy minimum. The sphere centers themselves are simply approximations to possible atom locations; the orientations generated by the sphere-atom pairing, although reasonable, may not be minimal in energy.

1.5.3. Specific Concepts

(A) Sphere Centers

Spheres are generated to fill the target site. The sphere centers are putative ligand atom positions. Their use is an attempt to limit the enormous number of possible orientations within the active site. Like ligand atoms, these spheres touch the surface of the molecule and do not intersect the molecule. The spheres are allowed to intersect other spheres; i.e., they have volumes which overlap. Each sphere is represented by the coordinates of its center and its radius. Only the coordinates of the sphere centers are used to orient ligands within the active site (see above). Sphere radii are used in clustering.

The number of orientations of the ligand in free space is vast. The number of orientations possible from all sets of sphere-atom pairings is smaller but still large and cannot be generated and evaluated (scored) in a reasonable length of time. Consequently, various filters are used to eliminate from consideration, before evaluation, sets of sphere-atom pairs that would generate poorly scoring orientations. That is, only a small subset of the possible ligand orientations is actually generated and scored. The distance tolerance is one filter. Sphere “coloring” and identification of “critical” spheres are other filters.

Sphere-sphere distances are compared to atom-atom distances. Sets of sphere-atom pairs are generated in the following manner: sphere i is paired with atom I if and only if for every sphere j in the set and for every atom J in the set,

$$ |d_{ij} - d_{IJ}| < \epsilon $$

where $d_{ij}$ is the distance between sphere i and sphere j, $d_{IJ}$ is the distance between atom I and atom J, and $\epsilon$ is a small user-defined tolerance.
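As an illustration of the distance tolerance filter, the sketch below checks whether a candidate set of sphere-atom pairs is internally consistent. It assumes the test is that every sphere-sphere distance differs from the corresponding atom-atom distance by less than epsilon; the function name and data layout are hypothetical and are not taken from the DOCK source.

```python
from itertools import combinations
from math import dist

def is_valid_match(pairs, spheres, atoms, epsilon):
    """Return True if all internal distances agree within `epsilon`.

    pairs   : list of (sphere_index, atom_index) tuples
    spheres : list of (x, y, z) sphere centers
    atoms   : list of (x, y, z) ligand atom coordinates
    """
    for (i, I), (j, J) in combinations(pairs, 2):
        d_ij = dist(spheres[i], spheres[j])   # sphere-sphere distance
        d_IJ = dist(atoms[I], atoms[J])       # atom-atom distance
        if abs(d_ij - d_IJ) >= epsilon:
            return False
    return True
```

A larger epsilon admits more matches (and therefore more orientations); a smaller one prunes the search harder.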

(B) Chemical Matching

DOCK spheres are generated without regard to the chemical properties of the nearby receptor atoms. Sphere “chemical matching” or “coloring” associates a chemical property with each sphere, and a sphere of one “color” can only be matched with a ligand atom of complementary color. These chemical properties may be things such as “hydrogen-bond donor,” “hydrogen-bond acceptor,” “hydrophobe,” “electro-positive,” “electro-negative,” “neutral,” etc. Neither the colors themselves, nor the complementarity of the colors, are determined by the DOCK suite of programs; DOCK simply uses these labels. With the inclusion of coloring, only ligand atoms with the appropriate chemical properties are matched to the complementary colored spheres. It is probably more likely, then, that the orientation generated will produce a favorable score. Conversely, by excluding colored spheres from pairing with certain ligand atoms, the number of (probably) unfavorable orientations which are generated and evaluated can be reduced. Note that requiring complementarity in matching does not mean that all ligand atoms will lie in chemically complementary regions of the enzyme. Rather, only those ligand atoms that are paired with a colored sphere in the sphere-atom match are guaranteed to be in the chemically complementary region of the enzyme (provided chirality of the spheres is the same as that of the matching ligand atoms).
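A minimal sketch of such a color filter is shown below. The complementarity table, the color labels and the function names are all hypothetical; in DOCK the labels and their complementarity are supplied by the user. Treating an uncolored sphere as unrestricted is an assumption made here for illustration.

```python
# Hypothetical, user-supplied complementarity table: sphere color -> allowed atom colors.
COMPLEMENT = {
    "donor": {"acceptor"},
    "acceptor": {"donor"},
    "hydrophobe": {"hydrophobe"},
}

def colors_compatible(sphere_color, atom_color, table=COMPLEMENT):
    """True if a sphere of `sphere_color` may be paired with an atom of `atom_color`."""
    if sphere_color is None:          # assumption: uncolored spheres match anything
        return True
    return atom_color in table.get(sphere_color, set())

def filter_pairs(pairs, sphere_colors, atom_colors, table=COMPLEMENT):
    """Keep only the (sphere, atom) pairs whose colors are complementary."""
    return [
        (s, a)
        for s, a in pairs
        if colors_compatible(sphere_colors.get(s), atom_colors.get(a), table)
    ]
```

Applying the filter before orientations are generated is what saves time: incompatible pairs never reach the scoring stage.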

(C) Critical Points

The "critical point" filter requires that certain spheres be part of the set of sphere-atom pairs used to orient the ligand (DesJarlais et al. J. Comput-Aided Molec. Design. 1994). Designating spheres as critical points forces the ligand to have at least one atom in that area of the enzyme, where that sphere is located. This filter may be useful, for example, when it is known that a ligand must occupy a particular area of an active site. This filter removes from consideration any orientation that does not guarantee at least one ligand atom in critical areas of the enzyme (provided chirality of the spheres is the same as that of the matching ligand atom).

(D) Bump Filter

After a ligand is oriented within the active site, the orientation is evaluated. In an attempt to reduce the total computational time, after the ligand is oriented in the site, it is possible to first check whether or not ligand atoms occupy space already occupied by the receptor. If too many of such “bumps” are found, then the ligand is likely to intersect the receptor even after minimization; consequently, the ligand orientation is discarded before evaluation.
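The idea behind the bump filter can be sketched as follows. This is a simplified illustration that measures atom-atom distances directly, with hypothetical names `min_separation` and `max_bumps`; it is not a description of how DOCK stores or evaluates bumps internally.

```python
from math import dist

def count_bumps(ligand_atoms, receptor_atoms, min_separation):
    """Number of ligand atoms closer than `min_separation` to any receptor atom."""
    return sum(
        1
        for atom in ligand_atoms
        if any(dist(atom, rec) < min_separation for rec in receptor_atoms)
    )

def passes_bump_filter(ligand_atoms, receptor_atoms, min_separation, max_bumps):
    """Discard the orientation (return False) when too many bumps are found."""
    return count_bumps(ligand_atoms, receptor_atoms, min_separation) <= max_bumps
```

Because the check needs only coordinates, it is far cheaper than a full score evaluation and can reject hopeless orientations early.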

1.5.4. Units

The units of the DOCK suite of programs are lengths in angstroms, masses in atomic mass units, charges in units of the electron charge, and energies in kcal/mol. For Amber score, charges are scaled by 18.2223 both internally and when they are read from a prmtop file.
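The scaling factor can be made concrete with a small sketch. The helper names are hypothetical. The one fact relied on beyond the text above is that the square of 18.2223 is approximately 332.05, the Coulomb constant in kcal·Å/(mol·e²), so the product of two scaled charges divided by a distance in angstroms is already an energy in kcal/mol.

```python
AMBER_CHARGE_SCALE = 18.2223  # scaling factor quoted in this manual

def to_scaled_charge(charge_e):
    """Convert a charge in electron-charge units to the scaled (prmtop-style) value."""
    return charge_e * AMBER_CHARGE_SCALE

def from_scaled_charge(charge_scaled):
    """Convert a scaled charge back to electron-charge units."""
    return charge_scaled / AMBER_CHARGE_SCALE

def coulomb_energy(q1_scaled, q2_scaled, distance_angstrom):
    """Vacuum Coulomb energy in kcal/mol from two scaled charges."""
    return q1_scaled * q2_scaled / distance_angstrom
```

For example, two unit charges 1 angstrom apart give `coulomb_energy(to_scaled_charge(1.0), to_scaled_charge(1.0), 1.0)`, which is the square of the scaling factor.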

RETURN TO TABLE OF CONTENTS

2. DOCK

2.1. Overview

This section is intended as a reference manual for the features of the DOCK Suite of Programs. It is intended to give an overview of the ideas which form the basis of the DOCK suite of programs and to detail the available user parameters. It is not intended to be a substitute for all the papers written on DOCK.

In general, this document is geared towards the experienced user and introduces new features and concepts in version 6. If you are new to DOCK, we strongly recommend you look at the tutorials on the DOCK web site at http://dock.compbio.ucsf.edu/DOCK_6/index.htm, which go into much greater practical detail.

RETURN TO TABLE OF CONTENTS

2.2. History

Version 1.0/1.1

Authors: Robert Sheridan, Renee DesJarlais, Irwin Kuntz

The program DOCK is an automatic procedure for docking a molecule into a receptor site. The receptor site is characterized by centers, which may come from sphgen or any other source. The molecule being docked is characterized by ligand centers, which may be its non-hydrogen atoms or volume-filling spheres calculated in sphgen. The ligand centers and receptor centers are matched based on comparison of ligand-center/ligand-center and receptor-center/receptor-center distances. Sets of ligand centers match sets of receptor centers if all the internal distances match, within a value of distance_tolerance. Ligand-receptor pairs are added to the set until at least nodes_minimum pairs have been found. At least three pairs must be found to uniquely determine a rotation/translation matrix that will orient the ligand in the receptor site. A least-squares fitting procedure is used (Ferro et al. Acta Cryst. A. 1977). Once an orientation has been found, it is evaluated by any of several scoring functions. DOCK may be used to explore the binding modes of an individual molecule, or be used to screen a database of molecules to identify potential ligands.

Version 2.0

Authors: Brian Shoichet, Dale Bodian, Irwin Kuntz

DOCK version 2.0 was written to give the user greater control over the thoroughness of the matching procedure, and thus over the number of orientations found and the CPU time required (Shoichet et al. J. Comp. Chem. 1992). In addition, certain algorithmic shortcomings of earlier versions were overcome. Versions 2.0 and higher are particularly useful for macromolecular docking (Shoichet et al. J. Mol. Biol. 1991) and applications which demand detailed exploration of ligand binding modes. In these cases, users are encouraged to run CLUSTER in conjunction with sphgen and DOCK.

To allow for greater control over searches of orientation space, the ligand and receptor centers are pre-organized according to their internal distances. Starting with any given center, all the other centers are presorted into “bins” based on their distance to the first center. All centers are tried in turn as “first” positions, and all the points in a bin which has been chosen for matching are tried sequentially. Ligand and receptor bins are chosen for matching when they have the same distance limits from their respective “first” points. The number of centers in each bin determines how many sets of points in the receptor and the ligand will ultimately be compared. In general, the wider the bins, the greater the number of orientations generated. Thus, the thoroughness of the search is under user control.

Version 3.0

Authors: Elaine Meng, Brian Shoichet, Irwin Kuntz

Version 3.0 retained the matching features of version 2.0, and introduced options for scoring (Meng et al. J. Comp. Chem., 1992). Besides the simple contact scores mentioned above, one can also obtain molecular mechanics interaction energies using grid files calculated by CHEMGRID (which is now superseded by GRID in version 4.0). More information about the ligand and receptor molecules is required to perform these higher-level kinds of scoring. Point charges on the receptor and ligand atoms are needed for electrostatic scoring, and atom-type information is needed for the van der Waals portion of the force field score. Input formats (some of them new in version 3.5) are discussed in various parts of the documentation; one example of a “complete format” (including point charges and atom type information) is SYBYL MOL2 format. Parameterization of the receptor is discussed in the documentation for CHEMGRID. In DOCK, ligand parameters are read in along with the coordinates; input formats are described below. Currently, the options are: contact scoring only, contact scoring plus Delphi electrostatic scoring, and contact scoring plus force field scoring. Atom-type information and point charges are not required for contact scoring only.

Version 3.5

Authors: Mike Connolly, Daniel Gschwend, Andy Good, Connie Oshiro, Irwin Kuntz

Version 3.5 added several features: score optimization, degeneracy checking, chemical matching and critical clustering.

Version 4.0

Authors: Todd Ewing, Irwin Kuntz

Version 4.0 was a major rewrite and update of DOCK (Ewing et al. 2001). A new matching engine was developed which is more robust, efficient, and easier to use (Ewing and Kuntz. J. Comput. Chem. 1997). Orientational sampling can now be controlled directly by specifying the number of desired orientations. Additional features include chemical scoring, chemical screening, and ligand flexibility.

Version 5.0-5.4

Authors: Demetri Moustakas, P. Therese Lang, Scott Pegg, Scott Brozell, Irwin Kuntz

Version 5 was rewritten in C++ in a modular format, which allows for easy implementation of new scoring functions, sampling methods and analysis tools (Moustakas et al., 2006). Additional new features include MPI parallelization, exhaustive orientation searching, improved conformation searching, GB/SA solvation scoring (Zou et al. J. Am. Chem. Soc., 1999), and post-screening pose clustering.

Version 6.0-6.11

Authors: P. Therese Lang, Demetri Moustakas, Scott Brozell, Noel Carrascal, Sudipto Mukherjee, Lauren Prentis, Courtney Singleton, Yuchen Zhou, Brian Fochtman, Trent Balius, T. Dwight McGee Jr., William Joseph Allen, John Bickel, Guilherme D. R. Matos, Steven Pak, Christopher Corbo, Brock Boysan, Patrick Holden, Scott Pegg, Kaushik Raha, Devleena Shivakumar, Robert Rizzo, David Case, Brian Shoichet, Irwin Kuntz

DOCK 6 is an extension of the DOCK 5 code base. It includes the implementation of Hawkins-Cramer-Truhlar GB/SA solvation scoring with salt screening and PB/SA solvation scoring through OpenEye's Zap Library. Additional flexibility has been added to scoring options during minimization. The new code also incorporates DOCK version 3.5.54 scoring features like Delphi electrostatics, ligand desolvation, and receptor desolvation. Finally, DOCK 6 introduces new code that allows access to the NAB library of functions such as receptor flexibility, the full AMBER molecular mechanics scoring function with implicit solvent, conjugate gradient minimization, and molecular dynamics simulation capabilities. The most recent version of DOCK 6 includes novel searching methods (DOCK_DN, DOCK_GA) and the ability to create fragment libraries. DOCK_D3N requires an enhanced build of DOCK that integrates RDKit. See Lang et al. RNA, 2009, Brozell et al., 2012, Allen et al., 2015, Allen et al., 2017, and Prentis et al., 2022.

RETURN TO TABLE OF CONTENTS

2.3. Command-line Arguments

DOCK must be run from the command line in a standard unix shell. It reads an input parameter file containing field/value pairs:

USAGE: dock6 -i dock.in [-o dock.out] [-v]

DESCRIPTION:
DOCK may be executed in either interactive or batch mode, depending on whether the output is written to a file. In interactive mode, the user is prompted only for parameters relevant to the particular run, and default values are provided. This mode is recommended for the initial construction of the input file and for short calculations. In batch mode, input parameters are read in from the input file and all output is written to the output file. This mode is recommended for long calculations once an input file has been generated interactively.

OPTIONS
-i dock.in #input file containing user-defined parameters
-help #emit the usage statement.
-v #verbosity flag that prints additional information and warnings for scoring functions
-o dock.out #output file containing the parameters used in the calculation, summary information for each molecule docked, and all warning messages

Interactive mode

USAGE: dock6 -i dock.in

DESCRIPTION:
When launched this way, DOCK will extract all relevant parameters from dock.in (or any file supplied by the user). If additional parameters are needed (or if the dock.in file is non-existent or empty), DOCK will request them one at a time from the user. Reasonable default values are presented. Any parameters supplied by the user will be automatically appended to the dock.in file. To change any previously entered values, edit the dock.in file using a text editor.

Batch mode

USAGE: dock6 -i dock.in -o dock.out

DESCRIPTION:
When launched in this way, DOCK will run in batch mode, extracting all relevant parameters from dock.in (or any file supplied by the user) and will write out all output to dock.out (or any file supplied by the user). If any parameters are missing or incorrect, then execution will halt and an appropriate error message will be reported in dock.out.

Parallel DOCK

USAGE: mpirun [-machinefile machfile] [-np #_of_proc] dock6.mpi -i dock.in -o dock.out [-v]

DESCRIPTION:
If DOCK has been built for parallel processing (see Installation) then DOCK can be run in parallel. Parallelization employs a single master processor with the remaining processors acting as slaves. If np = 1, the code defaults to non-MPI behavior. There is a minimal difference in performance between 1 and 2 processors. Improved performance is only evident with more than 2 processors. In some MPIs dock6.mpi should be launched with mpiexec.

ADDITIONAL OPTIONS:
-machinefile #simple text file containing the names of the computers (nodes) to be used
-np # specifies the number of processors which typically is the same as the number of lines in the machinefile
For additional details on your MPI installation, read the man page:
man mpirun
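The usage lines above can be summarized in a short sketch that assembles the argument list for each launch mode. The helper `dock_command` is hypothetical and only builds the list; it uses no flags beyond those documented in this section, and it assumes (following the parallel usage line) that an output file is always given for MPI runs.

```python
def dock_command(input_file, output_file=None, verbose=False,
                 processors=None, machinefile=None):
    """Build the argument list for a serial or parallel DOCK run."""
    if processors is not None:
        if output_file is None:
            raise ValueError("the parallel usage line includes -o dock.out")
        cmd = ["mpirun"]
        if machinefile is not None:
            cmd += ["-machinefile", machinefile]
        cmd += ["-np", str(processors), "dock6.mpi"]
    else:
        cmd = ["dock6"]
    cmd += ["-i", input_file]
    if output_file is not None:      # presence of -o selects batch mode
        cmd += ["-o", output_file]
    if verbose:
        cmd.append("-v")
    return cmd
```

The resulting list can be passed to `subprocess.run` from the Python standard library, or joined with spaces to obtain the equivalent shell command.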

RETURN TO TABLE OF CONTENTS

2.4. The Parameter Parser

In Interactive Mode, dock will dynamically ask the user to enter the appropriate user parameters. The generic format for the questions is:

parameter_name [default value] (legal values):

The parameter parser requires that the values entered for a parameter exactly match one of the legal values. For example:

Example A: program_location [Hello_World!] ():

Example B: #_red_balloons [99] ():

Example C: glass_status [half_full] (half_full half_empty):

In Example A, the parameter "program_location" can be assigned any string value, and in Example B, the parameter "#_red_balloons" can be assigned any integer value. However, in Example C, the parameter "glass_status" can only be assigned the strings "half_full" or "half_empty". If no value is entered by the user, the default value (shown in brackets) will be used.

In Batch Mode, every parameter in the dock.in file must have the form:

parameter_name value

Note that the parameter_name and corresponding value must be separated by white space, namely, blanks or tabs.
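The batch format is simple enough to read with a few lines of code. The sketch below is a hypothetical helper, not the DOCK parser: it splits each non-empty line on the first run of white space and checks a value against a list of legal values in the spirit of the interactive prompts. It makes no assumption about comment syntax, which is not described here.

```python
def parse_dock_input(text):
    """Parse 'parameter_name value' lines into a dict of strings."""
    params = {}
    for raw_line in text.splitlines():
        fields = raw_line.split(None, 1)   # split on blanks or tabs, once
        if not fields:
            continue                       # skip empty lines
        name = fields[0]
        value = fields[1].strip() if len(fields) > 1 else ""
        params[name] = value
    return params

def is_legal(value, legal_values):
    """An empty list of legal values, written '()', accepts any value."""
    return not legal_values or value in legal_values
```

For instance, `parse_dock_input("conformer_search_type flex")` yields a dictionary mapping `conformer_search_type` to `flex`, and `is_legal("half_full", ["half_full", "half_empty"])` is true.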

RETURN TO TABLE OF CONTENTS

2.5. Sampling Methods

Before you can dock a ligand, you will need atom types and charges for every atom in the ligand. Currently, DOCK reads the Tripos MOL2 format. For a single ligand (or several ligands), you can use Chimera in combination with antechamber to prepare a MOL2 file for the ligand (see Structure Preparation Tutorial) or various other visualization packages. During the docking procedure, ligands are read in from a single MOL2 or multi-MOL2 file. Atom and bond types are assigned using the DOCK 4 atom/bond typing parameter files.

For hierarchical database search, DOCK reads db2 files, which efficiently store multiple ligand conformations.

Many new sampling methods have been fully integrated into DOCK 6, giving users access to virtual screening, from-scratch de novo growth (DOCK_DN), and evolution-based searching (DOCK_GA). We have added hierarchical database search (HDB), which performs hierarchical traversal through precomputed ligand conformations to enable large-scale docking (Balius, J. Comput. Chem. 2024). We also now have a covalent docking method called attach-and-grow (this method is still in development).

Sampling Method Input Parameter

Parameter	Description	Default Value
conformer_search_type	Choose the type of docking calculation: Rigid body docking (rigid), Flexible ligand docking (flex), de novo ligand design (denovo), Genetic Algorithm ligand construction (genetic), hierarchical database search (HDB), attach-and-grow (covalent)	flex

RETURN TO TABLE OF CONTENTS

2.5.1. Rigid and Flexible Ligand Docking

The internal degrees of freedom of the ligand can be sampled with the anchor-and-grow incremental construction approach. This conformational search algorithm has been validated for binding mode prediction using a large dataset derived from the protein data bank of 1043 protein-ligand complexes (Allen et al., 2015).

2.5.1.1. Anchor-and-Grow

The process of docking a molecule using the anchor-first strategy is shown in the Workflow for Anchor-and-Grow Algorithm (Ewing et al. 2001). First, the largest rigid substructure of the ligand (anchor) is identified (see Identification of Rigid Segments) and rigidly oriented in the active site (orientation) by matching its heavy atom centers to the receptor sphere centers (see Orienting the Ligand). The anchor orientations are evaluated and optimized using the scoring function (see Scoring) and the energy minimizer (see Minimization). In general, the orientations are then ranked according to their score, spatially clustered by heavy atom root mean squared deviation (RMSD), and pruned (see Pruning the Conformation Search Tree). Next, the remaining flexible portion of the ligand (see Identification of Flexible Layers) is built onto the best anchor orientations within the context of the receptor (grow). It is assumed that the shape of the binding site will help restrict the sampling of ligand conformations to those that are most relevant for the receptor geometry.

Workflow for Anchor-and-Grow Algorithm

Starting with version 6.8, ligand conformational searching is enabled when the conformer_search_type input parameter is set to flex. In versions 6.7 and earlier of DOCK, the corresponding input parameter was flexible_ligand [yes] (yes no). Only the torsion angles are modified, not the bond lengths or angles. Therefore, the input geometry of the molecule needs to be of good quality. A structure generated by ZINC15 is sufficient.

The torsion angle positions reside in an editable file (see flex_drive.tbl) which is identified with the flex_drive_file parameter. Internal clashes are detected during the torsion drive search based on the clash_overlap or internal_energy parameters, which are independent of scoring function.

RETURN TO TABLE OF CONTENTS

2.5.1.2. Identification of Rigid Segments

A flexible molecule is treated as a collection of rigid segments. Each segment contains the largest set of adjacent atoms separated by non-rotatable bonds. Segments are separated by rotatable bonds.

The first step in segmentation is ring identification. All bonds within molecular rings are treated as rigid. This classification scheme is a first-order approximation of molecular flexibility, since some amount of flexibility can exist in non-aromatic rings. To treat phenomena such as sugar puckering and chair-boat cyclohexane conformations, the user will need to supply each ring conformation as a separate input molecule. Additional bonds may be specified as rigid by the user (see Manual Specification of Non-rotatable Bonds).

Identification of Rigid Anchor and Flexible Bonds

The second step is flexible bond identification. Each flexible bond is associated with a label defined in an editable file (see flex.defn). The parameter file is identified with the flex_definition_file parameter. Each label in the file contains a definition based on the atom types (and chemical environment) of the bonded atoms. Each label is also flagged as minimizable. Typically, bonds with some degree of double bond character are excluded from minimization so that planarity is preserved. Each label is also associated with a set of preferred torsion positions. The location of each flexible bond is used to partition the molecule into rigid segments. A segment is the largest local set of atoms that contains only non-flexible bonds.
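The partitioning step can be sketched as a connected-component search over the molecular graph with the flexible bonds removed. This is an illustration only: the atom and bond indexing, the function name, and the assumption that the set of rotatable bond indices is already known are all hypothetical, and ring detection and flex.defn lookup are not shown.

```python
def rigid_segments(n_atoms, bonds, rotatable):
    """Split a molecule into rigid segments.

    n_atoms   : number of atoms, indexed 0..n_atoms-1
    bonds     : list of (atom_a, atom_b) tuples
    rotatable : set of indices into `bonds` that are flexible
    Returns a list of sorted atom-index lists, one per segment.
    """
    # Build adjacency using only the non-rotatable (rigid) bonds.
    adjacency = {atom: [] for atom in range(n_atoms)}
    for index, (a, b) in enumerate(bonds):
        if index not in rotatable:
            adjacency[a].append(b)
            adjacency[b].append(a)

    seen, segments = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        stack, segment = [start], []
        seen.add(start)
        while stack:                      # depth-first walk of one segment
            atom = stack.pop()
            segment.append(atom)
            for neighbor in adjacency[atom]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        segments.append(sorted(segment))
    return segments
```

For a five-atom chain whose second bond is rotatable, the function returns two segments, one on each side of that bond.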

RETURN TO TABLE OF CONTENTS

2.5.1.3. Manual Specification of Non-rotatable Bonds

Currently this functionality is not available!

The user can potentially specify additional bonds to be non-rotatable, to supplement the ring bonds automatically identified by DOCK. Such a technique could be used to preserve the conformation of part of a molecule and isolate it from the conformation search. Non-rotatable bonds are identified in the Tripos MOL2 format file containing the molecule. The bonds are designated as members of a STATIC BOND SET named RIGID (see Tripos MOL2 Format).

Creation of the RIGID set can be done within Chimera. With the molecule of interest loaded into Chimera, select the portion of the ligand you would like to remain rigid. Then select File > Save MOL2. Make sure the "Write current selection to @ SETS section of file" option is checked and save the file.

Alternatively, the RIGID set can be entered into the MOL2 file by hand. To do this, go to the end of the MOL2 file. If no sets currently exist, then add a SET identifier on a new line. It should contain the text "@<TRIPOS>SET". On a new line add the text "RIGID STATIC BONDS <user> **** Comment". On the next line enter the number of bonds that will be included in the set, followed by the numerical identifier of each bond in the set.

RETURN TO TABLE OF CONTENTS

2.5.1.4. Identification of Flexible Layers

Anchor Selection

An anchor segment is normally selected from the rigid segments in an automatic fashion (see Manual Specification of Non-rotatable Bonds to override this behavior). The molecule is divided into segments that overlap at each rotatable bond. The segment with the largest number of heavy atoms is selected as the first anchor; the number of attachment points is also considered. All segments with more heavy atoms than min_anchor_size are tried separately as anchors. The number of anchors can be limited by setting the limit_max_anchors flag to "yes"; max_anchor_num is used to specify the maximum number of anchors to be used (anchors are ordered by heavy atoms and attachment points):

min_anchor_size 5
limit_max_anchors yes
max_anchor_num 5

At most 5 anchors are used and all anchors have at least 5 heavy atoms.
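The selection rule can be sketched as a filter followed by a sort. The data layout and function name are hypothetical. Two details are assumptions: a segment qualifies when it has at least min_anchor_size heavy atoms (matching the statement above that all anchors have at least 5 heavy atoms), and ties in heavy atom count are broken in favor of more attachment points.

```python
def select_anchors(segments, min_anchor_size,
                   limit_max_anchors=False, max_anchor_num=None):
    """Return candidate anchors, best first.

    segments : list of dicts with 'name', 'heavy_atoms', 'attachment_points'
    """
    candidates = [s for s in segments if s["heavy_atoms"] >= min_anchor_size]
    # Largest segments first; attachment points break ties (assumption).
    candidates.sort(
        key=lambda s: (s["heavy_atoms"], s["attachment_points"]),
        reverse=True,
    )
    if limit_max_anchors and max_anchor_num is not None:
        candidates = candidates[:max_anchor_num]
    return candidates
```

With `min_anchor_size=5`, `limit_max_anchors=True` and `max_anchor_num=5`, the call mirrors the three input lines shown above.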

To use a single specific anchor (e.g., a scaffold with a known binding pose), specify an atom name and its corresponding atom number in the chosen fragment (e.g., if atom number 10 is C16):

user_specified_anchor yes
atom_in_anchor C16,10

Identification of Overlapping Segments

When an anchor has been selected, then the molecule is redivided into non-overlapping segments, which are then arranged concentrically about the anchor segment. Segments are reattached to the anchor according to the innermost layer first and within a layer to the largest segment first.
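The concentric arrangement can be sketched as a breadth-first traversal of the segment graph starting at the anchor. The names and the dictionary-based data layout are hypothetical; segment size here stands in for whatever measure is used to order segments within a layer.

```python
from collections import deque

def layer_segments(neighbors, sizes, anchor):
    """Group segments into layers around `anchor`.

    neighbors : dict mapping each segment to the segments it is bonded to
    sizes     : dict mapping each segment to its size (e.g. heavy atom count)
    Returns a list of layers, innermost first, each sorted largest first.
    """
    layer_of = {anchor: 0}
    queue = deque([anchor])
    while queue:                               # breadth-first traversal
        segment = queue.popleft()
        for nb in neighbors[segment]:
            if nb not in layer_of:
                layer_of[nb] = layer_of[segment] + 1
                queue.append(nb)

    layers = {}
    for segment, layer in layer_of.items():
        if segment != anchor:
            layers.setdefault(layer, []).append(segment)
    return [
        sorted(layers[layer], key=lambda s: -sizes[s])
        for layer in sorted(layers)
    ]
```

Reading the returned list from left to right gives the reattachment order described above: innermost layer first, and within a layer the largest segment first.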

Layered Non-Overlapping Segments

The anchor is processed separately (either oriented, scored, and/or minimized). The remaining segments are subsequently re-attached during the conformation search. The interaction energy between the receptor and the ligand can be optimized with a simplex minimizer (see Minimization).

RETURN TO TABLE OF CONTENTS

2.5.1.5. Pruning the Conformation Search Tree

Starting with version 6.1, there are two methods for pruning. The first method is the one that existed in earlier versions; it is the default and corresponds to input parameter pruning_use_clustering = yes. In this method pruning attempts to retain the best, most diverse configurations using a top-first pruning algorithm, which proceeds as follows. The configurations are ranked according to score. The top-ranked configuration is set aside and used as a reference configuration for the first round of pruning. All remaining configurations are considered candidates for removal. A root-mean-squared distance (RMSD) between each candidate and the reference configuration is computed. Each candidate is then evaluated for removal based on its rank and RMSD using the inequality:

$$ \text{Factor} = \frac{\text{Rank}}{\text{RMSD}} > \text{number\_confs\_for\_next\_growth} $$

If the factor is greater than number_confs_for_next_growth, as appropriate, the candidate is removed. Based on this factor, a configuration with rank 2 and 0.2 angstroms RMSD is comparable to a configuration with rank 20 and 2.0 angstroms RMSD. The next best scoring configuration which survives the first pass of removal is then set aside and used as a reference configuration for the second round of pruning, and so on. This pruning method biases its search time towards molecules that sample a more diverse set of binding modes. As the values of num_anchors_orients_for_growth and number_confs_for_next_growth are increased, the anchor-first method approaches an exhaustive search.
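A minimal sketch of top-first pruning follows. It assumes that the factor is the candidate's rank divided by its RMSD to the current reference (consistent with the rank 2 / 0.2 angstrom and rank 20 / 2.0 angstrom comparison above), that rank is the 1-based position in the initial score ordering, that lower scores are better, and that a candidate identical to the reference (RMSD of zero) is always removed. The function name and the `rmsd` callback are hypothetical.

```python
def top_first_prune(configs, rmsd, cutoff):
    """Return the names of the configurations that survive pruning.

    configs : list of (name, score) tuples
    rmsd    : function (name_a, name_b) -> RMSD in angstroms
    cutoff  : removal threshold for rank / RMSD
    """
    ranked = sorted(configs, key=lambda c: c[1])
    rank = {name: r for r, (name, _) in enumerate(ranked, start=1)}
    survivors = [name for name, _ in ranked]
    kept = []
    while survivors:
        reference = survivors.pop(0)       # best remaining configuration
        kept.append(reference)
        remaining = []
        for candidate in survivors:
            d = rmsd(reference, candidate)
            factor = float("inf") if d == 0 else rank[candidate] / d
            if factor <= cutoff:           # dissimilar enough to survive
                remaining.append(candidate)
        survivors = remaining
    return kept
```

Poorly ranked configurations must be increasingly different from every better-scoring survivor in order to be kept, which is what biases the retained set toward diversity.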

In the second method, the goal is to bias the sampling towards conformations that are close to the correct binding mode (as optimized using a test set of experimentally solved structures). Much as the method above, the algorithm ranks the generated poses and conformations. Then, all poses that violate a user-defined score cutoff are removed. To facilitate the speed of the calculation, the remaining list is additionally pared back to a user-defined length. In this method, the sampling is driven towards molecules that sample closer to the experimentally determined binding site, and the result is a significantly less diverse set of final poses.
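The second method reduces to a filter and a truncation, sketched below. The parameter names `score_cutoff` and `max_kept` are placeholders for the user-defined cutoff and list length, and the sketch assumes that lower scores are better, so a pose violates the cutoff when its score exceeds it.

```python
def prune_by_score(configs, score_cutoff, max_kept):
    """Keep the best-scoring poses that satisfy the cutoff, up to `max_kept`.

    configs : list of (name, score) tuples
    """
    ranked = sorted(configs, key=lambda c: c[1])
    passing = [c for c in ranked if c[1] <= score_cutoff]
    return passing[:max_kept]
```

No RMSD comparison is made in this sketch, so near-duplicate poses can all survive, which is consistent with the less diverse final set noted above.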

RETURN TO TABLE OF CONTENTS

2.5.1.6. Time Requirements

The time demand grows linearly with the number of anchor segments explored for a given molecule, num_anchors_orients_for_growth, the number of flexible bonds and the number of torsion positions per bond, as well as the number_confs_for_next_growth. Using the notation in the Workflow for Anchor-and-Grow Algorithm, the time demand can be expressed as

where the additional terms are:
NA is the number of anchor segments tried per molecule.
NB is the number of rotatable bonds per molecule.

RETURN TO TABLE OF CONTENTS

2.5.1.7. Growth Tree and Statistics

DOCK uses breadth-first search to sample the conformational space of the ligand. The tree is pruned at every stage of growth to remove unsuitable conformations. In order to be as space efficient as possible, DOCK only saves one level of growth at a time unless "write_growth_tree" is turned on. In order to construct the growth tree it was necessary to do the following: (1) Retain all levels of growth (before and after minimization) in memory. (2) Link every conformer to its parent conformer during growth. (3) While writing out the tree, start the traversal from a fully grown ligand (leaf) and move up the branch (parent conformer) until the ligand anchor (root) is reached. Finally, the growth tree branch is printed as a multi-mol2 file starting from the anchor to the fully grown ligand, including minimizations. This feature allows visualization of all stages of growth and helps optimize the behavior of current DOCK routines. Note that the growth trees can easily be visualized using the ViewDock module in the UCSF Chimera program. Extra information regarding conformer number, anchor number, parent conformer, etc. can also be accessed directly using this tool.

The format for branch file names is as follows:
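The leaf-to-root traversal in step (3) can be sketched with a simple parent map. The dictionary layout and function name are hypothetical; the only idea taken from the text is that each conformer records its parent and that a branch is written from the anchor down to the fully grown ligand.

```python
def branch_from_leaf(parent_of, leaf):
    """Return the conformer ids on one branch, ordered anchor (root) to leaf.

    parent_of : dict mapping conformer id -> parent id (None for the anchor)
    leaf      : id of a fully grown conformer
    """
    branch = [leaf]
    while parent_of[branch[-1]] is not None:   # walk up until the root
        branch.append(parent_of[branch[-1]])
    branch.reverse()                           # root first, leaf last
    return branch
```

Storing only a parent link per conformer keeps the bookkeeping light, although every level of growth still has to stay in memory for the links to remain valid.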

${Ligand name}_anchor${anchor number}_branch${conformer number of fully grown mol.}.mol2

e.g. LIG1_anchor1_branch4.mol2

The ligand name is that specified in the mol2 file. The anchor number indicates what fragment or portion of the molecule was used as the anchor. Every conformer (both partially and fully grown) is assigned a unique number.
We recommend that users cat the files together and compress them.

cat *_branch*.mol2 > growth_tree.mol2; gzip growth_tree.mol2

In addition, growth statistics are printed to the output files if the verbose flag is used.
-----------------------------------
VERBOSE MOLECULE STATS
Number of heavy atoms = 30
Number of rotatable bonds = 7
Formal Charge = 1.00
Molecular Weight = 429.56
Heavy Atoms = 30
-----------------------------------
VERBOSE ORIENTING STATS :
Orienting 10 anchor heavy atom centers
Sphere Center Matching Parameters:
tolerance: 0.25; dist_min: 2; min_nodes: 3; max_nodes: 10
Num of cliques generated: 2298
Residual Info:
min residual: 0.0261
median residual: 0.3932
max residual: 0.5000
mean residual: 0.3737
std residual: 0.0935
Node Sizes:
min nodes: 3
max nodes: 4
mean nodes: 3.0070
# of anchor positions: 1000
-----------------------------------
VERBOSE GROWTH STATS : ANCHOR #1
32/1000 anchor orients retained (max 1000) t=9.06s
Lyr 1-1 Segs|Lyr 2-1 Segs|Lyr 3-2 Segs|Lyr 4-2 Segs|Lyr 5-1 Segs|
Lyr:1 Seg:0 Bond:8 : Sampling 6 dihedrals C6(C.ar) C4(C.ar) C3(C.3) C1(C.3)
Lyr:1 Seg:0 24/192 retained, Pruning: 6-score 162-clustered t=10.68s
Lyr:2 Seg:0 Bond:5 : Sampling 3 dihedrals C4(C.ar) C3(C.3) C1(C.3) N1(N.3)
Lyr:2 Seg:0 51/72 retained, Pruning: 21-clustered t=11.38s
Lyr:3 Seg:0 Bond:1 : Sampling 3 dihedrals C3(C.3) C1(C.3) N1(N.3) S1(S.o2)
Lyr:3 Seg:0 105/153 retained, Pruning: 7-score 41-clustered t=13.37s
Lyr:3 Seg:1 Bond:3 : Sampling 6 dihedrals N4(N.am) C2(C.2) C1(C.3) C3(C.3)
Lyr:3 Seg:1 86/630 retained, Pruning: 8-score 536-clustered t=23.93s
Lyr:4 Seg:0 Bond:43 : Sampling 3 dihedrals C16(C.ar) S1(S.o2) N1(N.3) C1(C.3)
Lyr:4 Seg:0 90/258 retained, Pruning: 168-clustered t=28.85s
Lyr:4 Seg:1 Bond:26 : Sampling 2 dihedrals C11(C.3) N4(N.am) C2(C.2) C1(C.3)
Lyr:4 Seg:1 147/180 retained, Pruning: 5-score 28-clustered t=35.28s
Lyr:5 Seg:0 Bond:46 : Sampling 6 dihedrals C17(C.ar) C16(C.ar) S1(S.o2) N1(N.3)
Lyr:5 Seg:0 104/882 retained, Pruning: 15-outside grid 22-score 741-clustered t=77.71s

These are the verbose growth statistics for flexible docking to 1PPH (thrombin). They are printed only when the verbose flag is given on the command line. This feature is useful for debugging incomplete growths and other possible issues with the growth routines. It is also useful for showing progress when docking larger peptide-like ligands (20+ rotatable bonds), which can take several hours. Cumulative timing in seconds (e.g. t=13.37s) is shown at the end of each line to allow quick profiling of the slowest steps during docking. A separate section is printed for each anchor sampled when using multiple anchors. For anchor #1, the orienting routine produces 1000 orients, and 32 are retained after clustering and minimization. The ligand has 7 rotatable bonds. The second line of the section shows the assignment of layers and segments. For details on the terminology, please consult the DOCK 4 paper. Subsequently, two lines of information are printed for each torsion sampled.

Lyr:1 Seg:0 indicates that this is Layer #1 and Segment #0. Layer and segment numbers start from zero and correspond to the array indices used internally. Bond:8 refers to the bond number in the mol2 file that was read in. "Sampling 6 dihedrals C6(C.ar) C4(C.ar) C3(C.3) C1(C.3)" specifies the exact torsion being sampled. Six dihedral positions are sampled in this case, as determined by the drive_id in flex_drive.tbl. 24/192 retained means that 24 conformers were retained from the 192 conformers generated during growth (32 conformers x 6 dihedral positions = 192 new conformers). The Pruning: section shows how the remaining (192-24) or 168 conformers were pruned: 6 conformers exceeded the score cutoff (see pruning_conformer_score_cutoff) and 162 conformers were removed by clustering. Conformers that fall outside the energy grid are reported in the same way (e.g. 15-outside grid for Lyr:5 Seg:0). Typically, clustering removes the greatest number of conformers during each torsion grown, as controlled by the pruning_clustering_cutoff parameter. The reader is encouraged to verify that the number of conformers retained can be calculated as above at each stage of growth. If the growth tree is turned on, the total number of conformers stored in the growth tree is also reported.
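
The bookkeeping check suggested above can be automated with a small Python sketch. It is not part of DOCK: the function name check_growth_line and the regular expressions are assumptions based only on the sample output shown in this section. For each "retained" line, the sketch checks that the retained conformers plus all pruned conformers add up to the number generated.

```python
import re

RETAINED_RE = re.compile(r"(\d+)/(\d+) retained")
PRUNED_RE = re.compile(r"(\d+)-(outside grid|score|clustered)")


def check_growth_line(line):
    """Return (retained, generated, pruned_by_reason) for one verbose line
    and verify that generated == retained + all pruned conformers."""
    retained, generated = map(int, RETAINED_RE.search(line).groups())
    pruned = {reason: int(count) for count, reason in PRUNED_RE.findall(line)}
    if retained + sum(pruned.values()) != generated:
        raise ValueError(f"conformer counts do not add up: {line!r}")
    return retained, generated, pruned


# Two lines copied from the sample output above.
first = "Lyr:1 Seg:0 24/192 retained, Pruning: 6-score 162-clustered t=10.68s"
last = ("Lyr:5 Seg:0 104/882 retained, Pruning: 15-outside grid "
        "22-score 741-clustered t=77.71s")

print(check_growth_line(first))
print(check_growth_line(last))
```

In the sample output, the number of conformers generated at each step also equals the number retained at the previous step multiplied by the number of dihedral positions sampled (for example, 24 x 3 = 72 for Lyr:2 Seg:0).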

2.5.1.9 Rigid Body and Flexible Ligand Docking Input Parameters

Parameter	Description	Default Value
user_specified_anchor	Will the user specify an anchor file?	no
atom_in_anchor	If the user specifies an anchor, which atom label in the anchor?	C1,1
limit_max_anchors	Will the user limit the maximum number of anchors docked?	no
max_anchor_num	If the user limits the number - maximum number of anchors allowed	1
min_anchor_size	Minimum number of atoms in the anchor	5
pruning_use_clustering	Will pruning the conformers use a clustering algorithm?	yes
pruning_max_orients	How many orients will be generated prior to pruning?	1000
pruning_clustering_cutoff	Maximum number of clusterheads retained from pruning	100
pruning_orient_score_cutoff	Maximum score allowed for orientation of the anchor (kcal/mol)	1000.0
pruning_conformer_score_cutoff	Maximum score allowed for conformers (kcal/mol)	100.0
pruning_conformer_score_scaling_factor	Score cutoff scaling factor to increase or reduce the score cutoff as molecules rebuild	1.0
use_clash_overlap	Flag to check for overlapping atomic volumes during anchor and grow	no
clash_overlap	A clash exists if the distance between a pair of atoms is less than the clash overlap times the sum of their atom type radii	0.5
write_growth_trees	Generate large growth tree files (increases memory usage - recommended to concatenate and compress growth tree branches)	no
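
The clash criterion given for clash_overlap in the table above can be written out as a minimal Python sketch. The function name is_clash and the example radii are hypothetical; only the inequality (distance < clash_overlap x sum of radii) is taken from the parameter description, and the actual DOCK routine may differ in detail.

```python
import math


def is_clash(pos_a, pos_b, radius_a, radius_b, clash_overlap=0.5):
    """True if two atoms are closer than clash_overlap times the sum
    of their atom type radii (the criterion stated in the table)."""
    distance = math.dist(pos_a, pos_b)
    return distance < clash_overlap * (radius_a + radius_b)


# Two atoms with an illustrative radius of 1.7 A each:
# the clash threshold is 0.5 * (1.7 + 1.7) = 1.7 A.
print(is_clash((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.7, 1.7))  # True
print(is_clash((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.7, 1.7))  # False
```

A larger clash_overlap value makes the test stricter, since atom pairs are flagged at greater separations.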

Configuration Files	Target
gnu	GNU compilers
gnu.acml	recent GNU compilers and ACML
gnu.parallel	GNU compilers with parallel processing capability
gnu.parallel.rdkit	GNU compilers with parallel processing capability and RDKit capability
gnu.rdkit	GNU compilers with RDKit capability
homebrew	GNU compilers installed on macOS using the Homebrew package manager
homebrew.rdkit	GNU compilers installed on macOS using the Homebrew package manager with RDKit capability
ibmaix	IBM AIX and native compilers
intel	Intel compilers
intel.mkl	Intel compilers and MKL
intel.parallel	Intel compilers with parallel processing capability
intel.intelmpi.parallel	Intel compilers with parallel processing capability (specific to Intel MPI)
pgi	PGI compilers
sgi	SGI native compilers

Parameter	Description	Default Value
dn_fraglib_scaffold_file	The path to the fragment library for just scaffolds
dn_fraglib_linker_file	The path to the fragment library for just linkers
dn_fraglib_sidechain_file	The path to the fragment library for just side chains
dn_user_specified_anchor	Choose whether to provide an anchor (yes), or have DOCK search for an anchor from within the fragment libraries.	yes
dn_fraglib_anchor_file	The path to the anchor .mol2 file
dn_use_torenv_table	Will torsion environment be used?	yes
dn_torenv_table	The path to the torsion environment (.dat)
dn_sampling_method	Choose which method to use when choosing fragments for each layer (exhaustive, random, graph)	graph
dn_graph_max_picks	The number of fragment picks per layer per Dummy atom when graph method is turned on	30
dn_graph_breadth	The number of fragments that are similar (using Tanimoto) to a successful fragment when graph method is turned on.	3
dn_graph_depth	Parameter that controls the overall number of fragments attempted via the following formula: Total_attempts = (max_picks) x (breadth)^(depth).	2
dn_graph_temp	The beginning annealing temperature when graph method is turned on. Higher temperatures lead to greater likelihood of fragment selection even if score does not improve.	100.0
dn_num_random_picks	Number of fragments randomly chosen to add to anchor when random method is turned on
dn_pruning_conformer_score_cutoff	The max score allowed for fragment conformer addition to be accepted (kcal/mol) if that fragment is in the top (dn_max_layer_size) scoring of each layer.	100.0
dn_pruning_conformer_score_scaling_factor	Scaling factor which alters dn_pruning_conformer_score_cutoff with each layer. If the scaling factor is set to 1, the cutoff stays the same at each layer; if it is set to 2, the cutoff is halved at each layer.	2.0
dn_pruning_clustering_cutoff	Parameter which impacts clustering of anchor orients. The lower the value, the more anchor orients are kept.	100.0
dn_remove_duplicates	Determines whether duplicates of complete molecules are removed at the end of growth, based on Tanimoto.	yes
dn_max_duplicates_per_mol	Specifies the number of molecules to be kept in the ensemble - defaulted to no duplicates.	0
dn_write_pruned_duplicates	If turned on, any duplicates that are pruned by Tanimoto will be written to their own file.	no
dn_advanced_pruning	Master parameter for advanced pruning.	yes
dn_prune_initial_sample	Controls pruning of the initial attachments performed on a growing molecule	yes
dn_sample_torsions	Controls whether or not torsions are sampled. Constructing with internal energy only as the scoring function is the major reason to turn this off.	yes
dn_prune_individual_torsions	Controls pruning of each individual torsional sampling. Each growing molecule will have all of its potential growth paths pruned for each individual attempted attachment.	yes
dn_prune_combined_torsions	Controls ensemble pruning for each attachment point. This is performed after all attachments for a given attachment point have been attempted and all torsions sampled.	yes
dn_random_root_selection	Controls whether or not roots for the next layer will be selected randomly, rather than the best by scoring function. Constructing with internal energy only as the scoring function is the major reason to turn this off.	no
dn_mol_wt_cutoff_type	Parameter which determines the use of a "hard" upper and lower cutoff for molecular weight (no molecules beyond those values) or a "soft" cutoff (accepted based on standard deviation).	soft
dn_upper_constraint_mol_wt	The max molecular weight allowed, or the upper boundary before the standard deviation is assessed.	550
dn_lower_constraint_mol_wt	The minimum molecular weight allowed, or the lower boundary before the standard deviation is assessed.	0
dn_mol_wt_std_dev	If using a soft molecular weight cutoff, this is the standard deviation that determines the probability of acceptance beyond the cutoffs.	35.0
dn_constraint_rot_bon	The max rotatable bonds allowed	15
dn_constraint_formal_charge	Largest absolute charge of molecule	2.0
dn_heur_unmatched_num	# of unmatched heavy atoms for hRMSD to be considered a different molecule when pruning between root and layer	1
dn_heur_matched_rmsd	Max rmsd of matching heavy atoms of molecules to be considered similar molecule when pruning root to layer	2.0
dn_unique_anchors	The number of unique anchors post clustering	3
dn_max_grow_layers	The maximum number of layers that can be grown from a given starting anchor.	9
dn_max_root_size	The number of new anchors that seed the next layer of growth	25
dn_max_layer_size	The number of partially grown molecules that advance through the search to subsequent attachment points	25
dn_max_current_aps	Maximum attachment points the molecule can have at any one time (stop adding new scaffolds when the current fragment has this many open attachment points)	5
dn_max_scaffolds_per_layer	The max number of scaffolds added per layer per molecule	1
dn_write_checkpoints	Write molecules for each layer	yes
dn_write_prune_dump	Write all molecules pruned out at every step (large memory output)	yes
dn_write_orients	Write out the orients	no
dn_write_growth_trees	Shows growth for every accepted molecule (may produce large output files)	no
dn_output_prefix	The prefix of the output file with final molecules	output
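
Two of the de novo parameters in the table above involve simple arithmetic, sketched here in Python. The sketch rests on assumptions: the dn_graph_depth formula is read as max_picks x breadth^depth, and the score cutoff is assumed to be divided by the scaling factor once for every layer after the first. The function names are hypothetical and the DOCK source should be consulted for the exact behavior.

```python
def total_graph_attempts(max_picks, breadth, depth):
    """Total fragment attempts, read here as max_picks * breadth**depth."""
    return max_picks * breadth ** depth


def layer_score_cutoffs(initial_cutoff, scaling_factor, num_layers):
    """Score cutoff per layer, assuming the cutoff is divided by the
    scaling factor once for every layer after the first."""
    return [initial_cutoff / scaling_factor ** layer for layer in range(num_layers)]


# Table defaults: dn_graph_max_picks=30, dn_graph_breadth=3, dn_graph_depth=2.
print(total_graph_attempts(30, 3, 2))      # 270

# Table defaults: cutoff 100.0 kcal/mol with a scaling factor of 2.0.
print(layer_score_cutoffs(100.0, 2.0, 4))  # [100.0, 50.0, 25.0, 12.5]

# A scaling factor of 1.0 leaves the cutoff unchanged.
print(layer_score_cutoffs(100.0, 1.0, 3))  # [100.0, 100.0, 100.0]
```

Under these assumptions, the default scaling factor of 2.0 tightens the score cutoff quickly, so later layers accept only well-scoring additions.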

	Details			Filtering and Pruning Reasons
Nomenclature	Finalized with?	Partially grown?	Valid molecules?	Prune via size of max Layer	Filter via MW/ROT/FC	Prune via size of max Root	Prune via hRMSD	Output .mol2 nomenclature
Completed	Sidechains	N	Y	N/A	Pass	N/A	N/A	output.completed.denovo_build.mol2
Completed: Filtered	Sidechains	N	Y	N/A	Failed	N/A	N/A	${output}.anchor_{#}.filtered_comp_layer_{##}.mol2
Pruned via Root	Hydrogens	Y	Y	Failed	Failed	N/A	N/A	${output}.anchor_{#}.prune_root_layer_{##}.mol2
Pruned via RMSD	Hydrogens	Y	Y	N/A	N/A	N/A	Failed	${output}.anchor_{#}.prune_rmsd_mw_layer_{##}.mol2
Ignored Candidate Root	Hydrogens	Y	Y	N/A	N/A	Failed	N/A	${output}.anchor_{#}.cand_root_ign_layer_{##}.mol2
Propagated Root	Attachment Points	Y	N	N/A	N/A	N/A	N/A	${output}.anchor_{#}.root_layer_{##}.mol2

Parameter	Description	Default Value
dn_drive_verbose	Turn on verbose D3N to output more information	no
dn_save_all_molecules	Save all molecules that are rejected by D3N	no
dn_drive_clogp	Turn on if you want to bias construction with cLogP	no
dn_lower_clogp	Lower limit of the range	-0.30
dn_upper_clogp	Upper limit of the range	3.75
dn_clogp_std_dev	Standard deviation of the Metropolis-like procedure	2.02
dn_drive_esol	Turn on if you want to bias construction with ESOL(LogS)	no
dn_lower_esol	Lower limit of the range	-5.23
dn_upper_esol	Upper limit of the range	-1.35
dn_esol_std_dev	Standard deviation of the Metropolis-like procedure	1.94
dn_drive_tpsa	Turn on if you want to bias construction with TPSA	no
dn_lower_tpsa	Lower limit of the range	28.53
dn_upper_tpsa	Upper limit of the range	113.20
dn_tpsa_std_dev	Standard deviation of the Metropolis-like procedure	42.33
dn_drive_qed	Turn on if you want to bias construction with QED	no
dn_lower_qed	Lower limit of the range	0.61
dn_qed_std_dev	Standard deviation of the Metropolis-like procedure	0.19
dn_drive_sa	Turn on if you want to bias construction with Synthetic Accessibility (SynthA/sa)	no
dn_upper_sa	Upper limit of the range	3.34
dn_sa_std_dev	Standard deviation of the Metropolis-like procedure	0.9
dn_drive_stereocenters	Turn on if you want to bias construction with number of stereocenters	no
dn_upper_stereocenter	Upper limit of the range	2
dn_drive_pains	Turn on if you want to bias construction with number of Pan-assay Interference Compounds (PAINS/pains)	no
dn_upper_pains	Upper limit of the range	1
dn_start_at_layer	Layer number you want to start D3N	1
sa_fraglib_path	Path to SynthA parameters	sa_fraglib.dat
PAINS_path	Path to PAINS parameters	pains_table_2019_09_01.dat

Parameter	Description	Default Value
ga_molecule_file	Initial molecule ensemble in mol2 format	ga_molecule_file.mol2
ga_utilities	Use of GA utilities for this run	no
ga_fraglib_scaffold_file	Fragment library mol2 file containing scaffolds	fraglib_scaffold.mol2
ga_fraglib_linker_file	Fragment library mol2 file containing linkers	fraglib_linker.mol2
ga_fraglib_sidechain_file	Fragment library mol2 file containing sidechains	fraglib_sidechain.mol2
ga_torenv_table	Path to the torsion environment table (.dat)	fraglib_torenv.mol2
ga_max_generations	Max number of generations to be executed	100
ga_xover_sampling_method_rand	Selects the random or exhaustive crossover sampling methods (yes for random)	yes
ga_xover_max	Max number of offspring generated from crossover events	150
ga_bond_tolerance	User-specified cutoff for allowable atom sq dist	0.5
ga_angle_cutoff	User-specified cutoff for bond angles	0.14
ga_check_overlap	Output parent pairs involved in crossover in unique_xover.mol2 file.	no
ga_check_only	Only check parent pairs for crossover (no offspring, mutations, or selection).	no
ga_mutate_parents	Mutate the parents	no
ga_pmut_rate	Parent mutation rate - only used when random parent mutation is turned on	0.3
ga_omut_rate	Offspring mutation rate for random mutation sampling	0.7
ga_max_mut_cycles	Max mutation attempts - for random mutation sampling	5
ga_num_random_picks	Number of random picks for random sampling	15
ga_max_root_size	Max root size in de novo DOCK	5
ga_energy_cutoff	The upper bounds for energy pruning	100
ga_heur_unmatched_num	The number of unmatched atoms for hRMSD pruning	1
ga_heur_matched_rmsd	The RMSD of matched atoms for hRMSD pruning	0.5
ga_constraint_mol_wt	The upper bound for mol wt	500
ga_constraint_rot_bon	The upper bound for # rot bonds	10
ga_constraint_H_accept	The upper bound for # of hydrogen acceptors	10
ga_constraint_H_donor	The upper bound for # of hydrogen donors	5
ga_constraint_formal_charge	The upper and lower bound for formal charge	2
ga_ensemble_size	The number of survivors to carry to next generation	200
ga_selection_method	The type of selection (elitism, tournament, roulette)	elitism
ga_elitism_combined	Combine the parent and offspring populations for the elitism selection method?	yes
ga_elitism_option	Type of selection: max, percent (perc), or number (num), when elitism is the selection method	max
ga_elitism_number	The number of top scored parents to pass to the next generation when elitism number is the selection method	20
ga_elitism_percent	The top percent of the ensemble(s) to be carried to the next generation when elitism percent is the selection method	0.2
ga_tournament_p_vs_c	Select the top scored molecules between parents and offspring separately when tournament selection method is used	yes
ga_roulette_separate	Select the top scored molecule between parents and offspring separately when roulette selection method is used	yes
ga_max_num_gen_with_no_crossover	The max number of generations for which evolution may continue without crossover. The algorithm terminates once this many generations have passed without crossover.	25
ga_name_identifier	Molecule names prefix in the output files	ga
ga_output_prefix	Output file name prefix	ga_output

Parameter	Description	Default Value
sa_fraglib_path	Path to SynthA parameters	sa_fraglib.dat
PAINS_path	Path to PAINS parameters	pains_table_2019_09_01.dat

Parameter	Description	Default Value
num_per_search	Number of poses kept for each orient. The HDB hierarchy is oriented, and then the hierarchy is searched, and a user-specified number of poses are kept. (If the minimizer is turned on, these poses are all minimized.)	1
skip_broken	If atoms within a conformation (a branch in the hierarchy) are too close, the conformation is flagged as broken in the DB2 file. If this parameter is "yes", these conformations are skipped.	no
hdb_db2_input_file	This parameter is for reading in DB2 files. The parameter can take a split_database_index file (as in DOCK 3), which contains a list of db2.gz files; or the parameter can take a single db2.gz file.	sdi.txt
hdb_db2_search_score_threshold	This is the score cutoff for each segment of the branch. If the score cutoff is exceeded, the segment is flagged and all branches containing the segment are halted. (For Grid score and ChemGrid score, only the VDW term is considered.)	10.0

Parameter	Description	Default Value
covalent_bondlength	This defines the bond length between dummy1 (SG) and dummy2 (CB). The user can give a value, or define a range of values. start:step:stop or start:stop (assumes a step size of 0.1).	1.8
covalent_bondlength2	This defines the bond length between dummy1 (SG) and the ligand atom attachment point. The user can give a value, or define a range of values. start:step:stop or start:stop (assumes a step size of 0.1). A value of -1 will use the bond length of the input structure.	-1
covalent_angle	This defines the angle between dummy2 (CB), dummy1 (SG) and the ligand atom attachment point. The user can give a value, or define a range of values. start:step:stop or start:stop (assumes a step size of 0.1). A value of -1 will use the angle of the input structure.	-1
covalent_dihedral_step	Parameter that defines the dihedral step size (0 to 360). The dihedral is defined by sphere 3 (CA), dummy2 (CB), dummy1 (SG) and the ligand atom attachment point. A value of 10 means that 36 dihedrals are explored.	10
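
The range syntax accepted by the covalent parameters in the table above can be illustrated with a minimal Python sketch. It is not the DOCK parser: the function names are hypothetical, and treating the stop value as inclusive is an assumption. Note the field order given in the table, start:step:stop, with a default step of 0.1 when only start:stop is supplied.

```python
import math


def parse_range(spec, default_step=0.1):
    """Expand 'value', 'start:stop', or 'start:step:stop' into a list of
    values. The stop value is assumed to be inclusive."""
    fields = [float(field) for field in spec.split(":")]
    if len(fields) == 1:
        return fields
    if len(fields) == 2:
        start, stop = fields
        step = default_step
    elif len(fields) == 3:
        start, step, stop = fields
    else:
        raise ValueError(f"cannot parse range: {spec!r}")
    # Small tolerance guards against floating-point round-off.
    count = int(math.floor((stop - start) / step + 1e-9))
    return [round(start + i * step, 6) for i in range(count + 1)]


def dihedral_angles(step):
    """Dihedral angles sampled from 0 up to (but not including) 360."""
    return list(range(0, 360, step))


print(parse_range("1.8"))          # [1.8]
print(parse_range("1.6:2.0"))      # [1.6, 1.7, 1.8, 1.9, 2.0]
print(parse_range("1.6:0.2:2.0"))  # [1.6, 1.8, 2.0]
print(len(dihedral_angles(10)))    # 36
```

The number of sampled geometries grows as the product of the bond length, angle, and dihedral counts, so wide ranges combined with a small dihedral step can increase run time considerably.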

Parameter	Description	Default Value
write_fragment_libraries	Does the user want to write fragment libraries? (By default, set to no. Must be 'yes' to write libraries)	no
fragment_library_prefix	Choose a prefix for fragment library file	fraglib
fragment_library_sort_method	Choose from frequency (freq) or fingerprint (fingerprint) for sorting the fragment library, when fragment library is turned on	freq
fragment_library_trans_origin	Translate all fragments to a single origin point (no: fragments remain in their original positioning in 3D space)	yes
fragment_library_freq_cutoff	When frequency fragment library is turned on then what is the minimum frequency allowed in the fragment library? (1 = all fragments allowed)	1

Parameter	Description	Default Value
use_database_filter	Does the user want to use database filter?	no
dbfilter_max_heavy_atoms	Maximum number of ligand heavy atoms	999
dbfilter_min_heavy_atoms	Minimum number of ligand heavy atoms	0
dbfilter_max_rot_bonds	Maximum number of ligand rotatable bonds	999
dbfilter_min_rot_bonds	Minimum number of ligand rotatable bonds	0
dbfilter_max_molwt	Maximum ligand molecular weight	9999.0
dbfilter_min_molwt	Minimum ligand molecular weight	0.0
dbfilter_max_formal_charge	Maximum ligand formal charge	10.0
dbfilter_min_formal_charge	Minimum ligand formal charge	-10.0
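
The effect of the database filter limits in the table above can be sketched in Python. This is an illustration only: the dictionary keys and the function passes_database_filter are hypothetical names, and the assumption that both bounds are inclusive is not confirmed by the table. The default limits are taken from the table, and the example ligand uses the values from the verbose molecule statistics shown earlier.

```python
# Default (min, max) limits copied from the table above.
DEFAULT_LIMITS = {
    "heavy_atoms": (0, 999),
    "rot_bonds": (0, 999),
    "molwt": (0.0, 9999.0),
    "formal_charge": (-10.0, 10.0),
}


def passes_database_filter(descriptors, limits=DEFAULT_LIMITS):
    """True if every descriptor lies within its (min, max) limits.
    Bounds are assumed to be inclusive."""
    return all(
        low <= descriptors[name] <= high
        for name, (low, high) in limits.items()
    )


# Values from the VERBOSE MOLECULE STATS example.
ligand = {"heavy_atoms": 30, "rot_bonds": 7, "molwt": 429.56, "formal_charge": 1.0}

print(passes_database_filter(ligand))  # True with the permissive defaults

# Tightening the rotatable bond limit to 5 rejects the same ligand.
strict = dict(DEFAULT_LIMITS, rot_bonds=(0, 5))
print(passes_database_filter(ligand, strict))  # False
```

The defaults are deliberately permissive, so the filter has an effect only once the limits are tightened.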

Parameter	Description	Default Value
dbfilter_max_stereocenters	Upper limit to number of stereocenters	6
dbfilter_min_stereocenters	Lower limit to number of stereocenters	0
dbfilter_max_spiro_centers	Upper limit to number of spiro_centers	6
dbfilter_min_spiro_centers	Lower limit to number of spiro_centers	0
dbfilter_max_clogp	Upper limit of cLogP	20
dbfilter_min_clogp	Lower limit of cLogP	-20
dbfilter_max_logs	Upper limit of LogS	20
dbfilter_min_logs	Lower limit of LogS	-20
dbfilter_max_tpsa	Upper limit of TPSA	1000.0
dbfilter_min_tpsa	Lower limit of TPSA	0.0
dbfilter_max_qed	Upper limit of QED	1.0
dbfilter_min_qed	Lower limit of QED	0.0
dbfilter_max_sa	Upper limit of the SynthA score	10
dbfilter_min_sa	Lower limit of the SynthA score	1
dbfilter_max_pns	Upper limit to number of PAINS	100
dbfilter_sa_fraglib_path	Path to SynthA parameters	sa_fraglib.dat
dbfilter_PAINS_path	Path to PAINS parameters	pains_table_2019_09_01.dat

Descriptor	Description
RD_num_arom_rings	Number of aromatic rings
RD_num_alip_rings	Number of aliphatic rings
RD_num_sat_rings	Number of saturated rings
RD_PAINS_names	List of PAINS if there are hits
RD_SMILES	SMILES string representing the molecule

Parameter	Description	Default Value
calculate_rmsd	Calculate root mean square deviation?	no
use_rmsd_reference_mol	Does the user want to use a reference molecule to calculate rmsd?	no
rmsd_reference_filename	The path to the rmsd reference molecule	N/A

Parameter	Description	Default Value
orient_ligand	Does the user want to orient the ligand to spheres?	yes
automated_matching	Does the user want to perform automated matching instead of manual matching?	yes
distance_tolerance	The tolerance in angstroms within which a pair of spheres is considered equivalent to a pair of centers (only used when manual matching is used)	0.25
distance_minimum	The shortest distance allowed between 2 spheres; any sphere pair with a shorter distance is disregarded (only used when manual matching is used)	0.0
nodes_minimum	The minimum number of nodes in a clique	3
nodes_maximum	The maximum number of nodes in a clique	10
receptor_site_file	The path to the file containing the receptor spheres	receptor.sph
max_orientations	The maximum number of orientations that will be cycled through	1000
critical_points	Does the user want to use critical point sphere labeling to target orientations to particular spheres?	no
chemical_matching	Does the user want to use chemical coloring of spheres to match chemical labels on ligand atoms?	no
chem_match_tbl	The path to the file defining the legal chemical type matches/pairings (only turned on when chemical matching is used)	chem_match.tbl
use_ligand_spheres	Does the user want to use a sphere file representing ligand heavy atoms to orient the ligand? (typically used for macromolecular docking)	no
ligand_sphere_file	The path to the file containing the ligand spheres (only used when use_ligand_spheres is turned on)	ligand.sph

Parameter	Description	Default Value
use_internal_energy	Does the user want to use internal energy for growth and/or minimization? (only repulsive VDW)	yes
internal_energy_rep_exp	The VDW repulsive exponent, used only when use_internal_energy is turned on (DOCK is optimized for the default value)	12
internal_energy_cutoff	All conformers with an internal energy value above this cutoff are pruned (only used when use_internal_energy is turned on)	100.0

Parameter	Description	Default Value
bump_filter	Does the user want to perform bump filter?	no
bump_grid_prefix	The prefix to the grid file containing the desired bump grid (only turned on when bump filter is used)	grid
max_bumps_anchor	The maximum allowed number of bumps for an anchor to pass the filter	12
max_bumps_growth	The maximum allowed number of bumps for a molecule to pass the filter	12

Parameter	Description	Default Value
contact_score_primary	Does the user want to perform contact scoring as primary scoring function	no
contact_score_cutoff_distance	The distance threshold defining a contact when contact scoring is turned on	4.5
contact_score_clash_overlap	Contact definition for use with intramolecular scoring when contact scoring is turned on	0.75
contact_score_clash_penalty	The penalty for each contact overlap made when contact score is turned on	50
contact_score_grid_prefix	The prefix to the grid files containing the desired contact when contact score is turned on	grid

Parameter	Description	Default Value
grid_score_primary	Does the user want to perform grid-based energy scoring as the primary scoring function?	yes
grid_score_rep_rad_scale	Scalar multiplier of the radii for the repulsive portion of the VDW energy component only when grid score is turned on	1.0
grid_score_vdw_scale	Scalar multiplier of the VDW energy component	1
grid_score_turn_off_vdw	A flag to turn off vdw portion of scoring function when grid score vdw scale = 0	yes
grid_score_es_scale	Flag to scale up or down the es portion of the scoring function when es scale is turned on	1
grid_lig_efficiency	Flag to control use of ligand efficiency (Grid Score / # active heavy atoms) as primary score.	no
grid_score_turn_off_es	A flag to turn off es portion of scoring function when grid score es scale = 0	yes
grid_score_grid_prefix	The prefix to the grid files containing the desired nrg/bmp grid	grid

Parameter	Description	Default Value
descriptor_grid_score_rep_rad_scale	Scalar multiplier of the radii for the repulsive portion of the VDW energy component only when grid score is turned on	1.0
descriptor_grid_score_vdw_scale	Scalar multiplier of the VDW energy component	1
descriptor_grid_score_turn_off_vdw	A flag to turn off vdw portion of scoring function when grid score vdw scale = 0	yes
descriptor_grid_score_es_scale	Flag to scale up or down the es portion of the scoring function when es scale is turned on	1
descriptor_grid_score_turn_off_es	A flag to turn off es portion of scoring function when grid score es scale = 0	yes
descriptor_grid_score_grid_prefix	The prefix to the grid files containing the desired nrg/bmp grid	grid

Parameter	Description	Default Value
dock3.5_score_primary	Does the user want to perform dock3.5 scoring as the primary scoring function?	no
dock3.5_vdw_score	When dock3.5 scoring is turned on - calculate steric interaction from dock3.5 score	yes
dock3.5_grd_prefix	When dock3.5 scoring is turned on - path to files containing dock3.5 grids	chem52
dock3.5_electrostatic_score	When dock3.5 scoring is turned on - calculate electrostatic interaction from ESP grid calculated using DelPhi	yes
dock3.5_ligand_internal_energy	Flag to add ligand internal energy to the scoring function	no
dock3.5_ligand_desolvation_score	Calculate total or volume based ligand desolvation from solvation grids	volume
dock3.5_solvent_occlusion_file	Occluded solvent grid of the receptor when desolvation score is turned on	solvmap
dock3.5_redistribute_positive_desolvation	Distribute positive partial atomic desolvation penalties when desolvation score is turned on	no
dock3.5_write_atomic_energy_contrib	Write contribution from each atom to total score	no
dock3.5_score_vdw_scale	Scalar multiplier of vdw energy component	1.0
dock3.5_score_es_scale	Scalar multiplier of es energy component	1.0

Parameter	Description	Default Value
continuous_score_primary	Does the user want to perform continuous non-grid scoring as the primary scoring function?	no
cont_score_rec_filename	File that contains receptor coordinates	receptor.mol2
cont_score_att_exp	VDW Lennard-Jones potential attractive exponent	6
cont_score_rep_exp	VDW Lennard-Jones potential repulsive exponent	12
cont_score_rep_rad_scale	Scalar multiplier of the radii for the repulsive portion of the vdw energy component only	1.0
cont_score_use_dist_dep_dielectric	Distance dependent dielectric switch	yes
cont_score_dielectric	Dielectric constant for the electrostatic term
cont_score_vdw_scale	Scalar multiplier of vdw energy component	1
cont_score_turn_off_vdw	Flag to turn off vdw portion of the scoring function when cont_score_vdw_scale = 0	yes
cont_score_es_scale	Scalar multiplier of electrostatic energy component	1.0
cont_score_turn_off_es	Flag to turn off es portion of the scoring function when cont_score_es_scale = 0	yes

Parameter	Description	Default Value
descriptor_cont_score_rec_filename	File that contains receptor coordinates	receptor.mol2
descriptor_cont_att_exp	VDW Lennard-Jones potential attractive exponent	6
descriptor_cont_score_rep_exp	VDW Lennard-Jones potential repulsive exponent	12
descriptor_cont_score_rep_rad_scale	Scalar multiplier of the radii for the repulsive portion of the VDW energy component only	1.0
descriptor_cont_score_use_dist_dep_dielectric	Distance dependent dielectric switch	yes
descriptor_cont_score_dielectric	Dielectric constant for electrostatic term
descriptor_cont_score_vdw_scale	Scalar multiplier of vdw energy component	1
descriptor_cont_score_turn_off_vdw	Flag to turn off vdw portion of scoring function when descriptor_cont_score_vdw_scale = 0	yes
descriptor_cont_score_es_scale	Scalar multiplier of es energy component	1
descriptor_cont_score_turn_off_es_scale	Flag to turn off es portion of scoring function when descriptor_cont_score_es_scale = 0	yes

Parameter	Description	Default Value
gbsa_zou_score_primary	Flag to perform Zou GB/SA scoring as the primary scoring function	no
gbsa_zou_gb_grid_prefix	The path to the pairwise GB grids	gb_grid
gbsa_zou_sa_grid_prefix	The path to the SA grids	sa_grid
gbsa_zou_vdw_grid_prefix	The path to the nrg grids, used for the vdw portion of the GB/SA calculation	grid
gbsa_zou_screen_file	GB parameter file for electrostatic screening. It is located in the parameter directory by default.	screen.in
gbsa_zou_solvent_dielectric	The value for the solvent dielectric	78.300003

Parameter	Description	Default Value
gbsa_hawkins_score_primary	Flag to perform Hawkins GB/SA scoring as the primary scoring function	no
gbsa_hawkins_score_rec_filename	File that contains receptor coordinates	receptor.mol2
gbsa_hawkins_score_solvent_dielectric	Dielectric constant for solvent	78.5
gbsa_hawkins_use_salt_screen	Flag to use salt screening	no
gbsa_hawkins_score_salt_conc(M)	When salt screen is turned on, salt concentration for solvent at molar concentration	0.0
gbsa_hawkins_score_gb_offset	GB radius offset	0.09
gbsa_hawkins_score_cont_vdw_and_es	Flag to determine whether vdw and es values will be calculated continuously or from a grid	yes
gbsa_hawkins_score_vdw_att_exp	VDW Lennard-Jones potential attractive exponent	6
gbsa_hawkins_score_vdw_rep_exp	VDW Lennard-Jones potential repulsive exponent	12
grid_score_rep_rad_scale	Scalar multiplier for the radii for the repulsive portion of the vdw energy component only, not specific to gbsa_hawkins_score	1.0
gbsa_hawkins_score_grid_prefix	The prefix of the grid file containing the vdw values when gbsa_hawkins_score_cont_vdw_and_es = no	grid
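
The attractive and repulsive exponents and the repulsive radius scale above can be illustrated with a generalized Lennard-Jones function. The following is a minimal sketch, assuming the common form whose minimum is -well_depth at r = r_min; the function name, its arguments, and the exact functional form are illustrative assumptions and are not taken from the DOCK source code.

```python
def lj_energy(r, well_depth, r_min, rep_exp=12, att_exp=6, rep_rad_scale=1.0):
    """Generalized Lennard-Jones energy (illustrative form, not DOCK's code).

    r             : interatomic distance
    well_depth    : depth of the energy well (positive number)
    r_min         : distance of the energy minimum when rep_rad_scale == 1
    rep_exp       : repulsive exponent (table default 12)
    att_exp       : attractive exponent (table default 6)
    rep_rad_scale : multiplier applied to the radius in the repulsive term only
    """
    if rep_exp <= att_exp:
        raise ValueError("repulsive exponent must exceed attractive exponent")
    diff = rep_exp - att_exp
    # Only the repulsive term sees the scaled radius.
    repulsive = att_exp / diff * (rep_rad_scale * r_min / r) ** rep_exp
    attractive = rep_exp / diff * (r_min / r) ** att_exp
    return well_depth * (repulsive - attractive)


if __name__ == "__main__":
    for distance in (3.5, 4.0, 5.0):
        print(distance, lj_energy(distance, well_depth=0.1, r_min=4.0))
```

With the default exponents this reduces to the familiar 12-6 form. In this sketch, a rep_rad_scale below 1 softens only the repulsive wall and leaves the attractive term unchanged.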

Parameter	Description	Default Value
amber_score_primary	Flag to perform amber scoring as the primary scoring function	no
amber_receptor_file_prefix	Prefix of the receptor coordinate file that was used in the prepare_amber.pl input file preparation step.	rec
amber_score_movable_region	The region that will be flexible during the scoring protocol. (Options: distance, everything, ligand, nab_atom_expression, nothing)	ligand
receptor_site_file	The file containing the receptor spheres that define the active site. This is not specific to amber score. This is active for amber_score_movable_region=distance.	receptor.sph
amber_score_receptor_movable_atom_expr	NAB atom expression defining the movable receptor region. This is active only for amber_score_movable_region=nab_atom_expression
amber_score_complex_movable_atom_expr	NAB atom expression defining the movable complex region. This is active only for amber_score_movable_region=nab_atom_expression
amber_score_minimization_rmsgrad	Minimization convergence criterion for the root-mean-square of the components of the gradient.	0.01
amber_score_before_md_minimization_cycles	Number of conjugate gradient minimization cycles to be performed before MD.	100
amber_score_md_steps	Number of Molecular Dynamics (MD) steps to be performed.	3000
amber_score_after_md_minimization_cycles	Number of conjugate gradient minimization cycles to be performed after MD.	100
amber_score_gb_model	GB model to be used	5
amber_score_nonbonded_cutoff	Non-bonded cutoff in angstroms for the energy calculation.	18.0
amber_score_temperature	Temperature at which MD should be performed.	300
amber_score_abort_on_unprepped_ligand	Control over the behavior for an unprepped ligand.	yes
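
Several parameters in the table above are active only for particular values of amber_score_movable_region. The following sketch restates those dependencies as a lookup; the function name is hypothetical, and the mapping is taken only from the descriptions in the table.

```python
# Options listed for amber_score_movable_region in the table above.
MOVABLE_REGION_OPTIONS = (
    "distance", "everything", "ligand", "nab_atom_expression", "nothing",
)

# Extra parameters that the table describes as active for a given option.
DEPENDENT_PARAMETERS = {
    "distance": ["receptor_site_file"],
    "nab_atom_expression": [
        "amber_score_receptor_movable_atom_expr",
        "amber_score_complex_movable_atom_expr",
    ],
}


def active_region_parameters(movable_region="ligand"):
    """Return the extra parameters that are active for the chosen region."""
    if movable_region not in MOVABLE_REGION_OPTIONS:
        raise ValueError(f"unknown movable region: {movable_region!r}")
    return list(DEPENDENT_PARAMETERS.get(movable_region, []))


if __name__ == "__main__":
    for option in MOVABLE_REGION_OPTIONS:
        print(option, active_region_parameters(option))
```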

Parameter	Description	Default Value
footprint_similarity_score_primary	Flag to perform footprint scoring as the primary scoring function	no
fps_score_use_footprint_reference_mol2	Use a molecule to calculate footprint reference.	no
fps_score_footprint_reference_mol2_filename	Path to the reference mol2 file - only used when footprint reference mol2 is turned on.	ligand_footprint.mol2
fps_score_use_footprint_reference_txt	Use a pre-calculated footprint reference in text format.	no
fps_score_footprint_reference_txt_filename	Path to the reference txt file - only used when footprint reference txt is turned on.	ligand_footprint.txt
fps_score_foot_compare_type	Footprint similarity calculation method (Options: Euclidean, Pearson). If Pearson, the correlation coefficient is used as the metric to compare the footprints: a value of 1 indicates perfect agreement between the two footprints, and a value of 0 indicates poor agreement. If Euclidean, the Euclidean distance is used as the metric: a value of 0 indicates perfect agreement, and the value increases as the agreement between the two footprints gets worse.	Euclidean
fps_score_normalize_foot	Normalization is used only with Euclidean distance.	no
fps_score_foot_comp_all_residue	If yes, all residues are used for calculating the footprint.	yes
fps_score_choose_foot_range_type	Determines how the residue range of the footprint is chosen: by specifying a residue range or by defining a threshold (Options: specify_range, threshold). If specify_range, all footprints are evaluated only on the user-specified residue range; the first residue id is 1, not 0. If threshold, the range consists of only those residues whose magnitudes exceed the specified thresholds.	specify_range
fps_score_vdw_threshold	Specify threshold for van der Waals energy, when threshold is turned on.	1
fps_score_es_threshold	Specify threshold for electrostatic energy, when threshold is turned on.	1
fps_score_hb_threshold	Specify threshold for hydrogen bonds (integers), when threshold is turned on. A value of 0.5 means that all nonzero values are used.	0.5
fps_score_use_remainder	Interaction remainder is all remaining residues not included individually	yes
fps_score_rec_filename	File that contains receptor coordinates	receptor.mol2
fps_score_att_exp	VDW Lennard-Jones potential attractive exponent	6
fps_score_rep_exp	VDW Lennard-Jones potential repulsive exponent	12
fps_score_rep_rad_scale	Scalar multiplier of the radii for the repulsive portion of the VDW energy component ONLY	1
fps_score_use_distance_dependent_dielectric	Distance dependent dielectric switch	yes
fps_score_dielectric	Dielectric constant for electrostatic term	4.0
fps_score_vdw_scale	Scalar multiplier of vdw energy component	1
fps_score_es_scale	Scalar multiplier of es energy component	1
fps_score_hb_scale	Scalar multiplier of hb energy component	0
fps_score_internal_scale	Scalar multiplier of internal energy component	0
fps_score_fp_vwd_scale	Scalar multiplier of vdw footprint component	0
fps_score_fp_es_scale	Scalar multiplier of es footprint component	0
fps_score_fp_hb_scale	Scalar multiplier of hb footprint component	0
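
The two options of fps_score_foot_compare_type can be illustrated on per-residue footprints stored as plain lists. This is a minimal sketch of the standard Euclidean distance and Pearson correlation; the function names and the example energies are hypothetical, and normalization is not shown because the table does not specify how it is computed.

```python
import math


def euclidean_distance(reference, candidate):
    """Euclidean distance between two footprints; 0 means perfect agreement."""
    if len(reference) != len(candidate):
        raise ValueError("footprints must cover the same residues")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(reference, candidate)))


def pearson_correlation(reference, candidate):
    """Pearson correlation between two footprints; 1 means perfect agreement."""
    if len(reference) != len(candidate):
        raise ValueError("footprints must cover the same residues")
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_cand = sum(candidate) / n
    cov = sum((a - mean_ref) * (b - mean_cand)
              for a, b in zip(reference, candidate))
    var_ref = sum((a - mean_ref) ** 2 for a in reference)
    var_cand = sum((b - mean_cand) ** 2 for b in candidate)
    if var_ref == 0 or var_cand == 0:
        raise ValueError("correlation is undefined for a constant footprint")
    return cov / (math.sqrt(var_ref) * math.sqrt(var_cand))


if __name__ == "__main__":
    # Hypothetical per-residue vdw energies for a reference and a candidate.
    ref = [-1.0, -2.5, -0.5, 0.0]
    cand = [-0.8, -2.0, -0.7, -0.1]
    print("Euclidean:", euclidean_distance(ref, cand))
    print("Pearson:  ", pearson_correlation(ref, cand))
```

Note the opposite directions of the two metrics: lower is better for Euclidean, higher is better for Pearson.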

Parameter	Description	Default Value
descriptor_fps_score_use_footprint_reference_mol2	Use a molecule to calculate footprint reference.	no
descriptor_fps_score_footprint_reference_mol2_filename	Path to the reference mol2 file - only used when footprint reference mol2 is turned on.	ligand_footprint.mol2
descriptor_fps_score_use_footprint_reference_txt	Use a pre-calculated footprint reference in text format.	no
descriptor_fps_score_footprint_reference_txt_filename	Path to the reference txt file - only used when footprint reference txt is turned on.	ligand_footprint.txt
descriptor_fps_score_foot_compare_type	Footprint similarity calculation method (Options: Euclidean, Pearson). If Pearson, the correlation coefficient is used as the metric to compare the footprints: a value of 1 indicates perfect agreement between the two footprints, and a value of 0 indicates poor agreement. If Euclidean, the Euclidean distance is used as the metric: a value of 0 indicates perfect agreement, and the value increases as the agreement between the two footprints gets worse.	Euclidean
descriptor_fps_score_normalize_foot	Normalization is used only with Euclidean distance.	no
descriptor_fps_score_foot_comp_all_residue	If yes, all residues are used for calculating the footprint.	yes
descriptor_fps_score_choose_foot_range_type	Determines how the residue range of the footprint is chosen: by specifying a residue range or by defining a threshold (Options: specify_range, threshold). If specify_range, all footprints are evaluated only on the user-specified residue range; the first residue id is 1, not 0. If threshold, the range consists of only those residues whose magnitudes exceed the specified thresholds.	specify_range
descriptor_fps_score_vdw_threshold	Specify threshold for van der Waals energy, when threshold is turned on.	1
descriptor_fps_score_es_threshold	Specify threshold for electrostatic energy, when threshold is turned on.	1
descriptor_fps_score_hb_threshold	Specify threshold for hydrogen bonds (integers), when threshold is turned on. A value of 0.5 means that all nonzero values are used.	0.5
descriptor_fps_score_use_remainder	Interaction remainder is all remaining residues not included individually	yes
descriptor_fps_score_rec_filename	File that contains receptor coordinates	receptor.mol2
descriptor_fps_score_att_exp	VDW Lennard-Jones potential attractive exponent	6
descriptor_fps_score_rep_exp	VDW Lennard-Jones potential repulsive exponent	12
descriptor_fps_score_rep_rad_scale	Scalar multiplier of the radii for the repulsive portion of the VDW energy component ONLY	1
descriptor_fps_score_use_distance_dependent_dielectric	Distance dependent dielectric switch	yes
descriptor_fps_score_dielectric	Dielectric constant for electrostatic term	4.0
descriptor_fps_score_vdw_scale	Scalar multiplier of vdw energy component	1
descriptor_fps_score_es_scale	Scalar multiplier of es energy component	1
descriptor_fps_score_hb_scale	Scalar multiplier of hb energy component	0
descriptor_fps_score_internal_scale	Scalar multiplier of internal energy component	0
descriptor_fps_score_fp_vwd_scale	Scalar multiplier of vdw footprint component	0
descriptor_fps_score_fp_es_scale	Scalar multiplier of es footprint component	0
descriptor_fps_score_fp_hb_scale	Scalar multiplier of hb footprint component	0
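
When descriptor_fps_score_choose_foot_range_type is set to threshold, only residues whose magnitudes exceed the threshold are kept individually, and the remainder option collects everything else. The sketch below shows one way this selection could work on a single footprint; the function name, the use of the absolute value as the magnitude, and the summing of the remainder are assumptions made for illustration.

```python
def split_by_threshold(energies, threshold):
    """Split a per-residue footprint into selected residues and a remainder.

    energies  : per-residue interaction energies, in residue order
    threshold : residues with |energy| strictly above this value are kept
    Returns (selected, remainder), where selected maps 1-based residue ids
    to energies and remainder is the sum over all other residues.
    """
    selected = {}
    remainder = 0.0
    for residue_id, energy in enumerate(energies, start=1):  # ids start at 1
        if abs(energy) > threshold:
            selected[residue_id] = energy
        else:
            remainder += energy
    return selected, remainder


if __name__ == "__main__":
    # Hypothetical van der Waals footprint for five residues.
    footprint = [-3.2, -0.4, 1.5, 0.2, -1.0]
    chosen, rest = split_by_threshold(footprint, threshold=1.0)
    print("selected residues:", chosen)
    print("remainder:", rest)
```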

Parameter	Description	Default Value
multigrid_score_primary	Flag to perform MultiGrid FPS scoring as the primary scoring function	no
multigrid_score_rep_rad_scale	Scale the VDW repulsive exponent only by this amount.	1.0
multigrid_score_vdw_scale	Scale both VDW terms in the molecular mechanics interaction function by this amount.	1.0
multigrid_score_es_scale	Scale both ES terms in the molecular mechanics interaction function by this amount.	20
multigrid_score_number_of_grids	The number of grids to be used, which typically includes N selected grids plus one remainder grid.	20
multigrid_score_grid_prefix0	Provide prefixes to identify the grids. Note that the first grid starts at '0'. The last grid should be the remainder grid. This must be done for each grid.	multigrid0
multigrid_score_individual_rec_ensemble	Flag for individual receptor (standard) or multiple receptor (not implemented yet).	no
multigrid_score_weights_text	Flag for providing a textfile as input for the reference footprint.	no
multigrid_score_footprint_text	Name of the reference footprint input text file, when multigrid_score_weights_text is turned on.	reference.txt
multigrid_score_fp_ref_mol	Flag for providing a MOL2 as input for the reference footprint.	no
multigrid_score_footprint_ref	Name of the reference footprint input MOL2 file, when multigrid_score_fp_ref_mol is turned on.	reference.mol2
multigrid_score_foot_compare_type	Footprint similarity calculation method (Options: Euclidean, Pearson). If Pearson, the correlation coefficient is used as the metric to compare the footprints: a value of 1 indicates perfect agreement between the two footprints, and a value of 0 indicates poor agreement. If Euclidean, the Euclidean distance is used as the metric: a value of 0 indicates perfect agreement, and the value increases as the agreement between the two footprints gets worse.	Euclidean
multigrid_score_normalize_foot	Normalization is used only with Euclidean distance.	no
multigrid_score_vdw_euc_scale	Scaling factor for the VDW term when using Euclidean distance.	1.0
multigrid_score_es_euc_scale	Scaling factor for the ES term when using Euclidean distance.	1.0
multigrid_score_vdw_norm_scale	Scaling factor for the VDW term when using normalized Euclidean distance.	10.0
multigrid_score_es_norm_scale	Scaling factor for the ES term when using normalized Euclidean distance.	10.0
multigrid_score_vdw_cor_scale	Scaling factor for the VDW term when using the Pearson correlation similarity metric for footprint comparison.	-10.0
multigrid_score_es_cor_scale	Scaling factor for the ES term when using the Pearson correlation similarity metric for footprint comparison.	-10.0
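
A grid prefix must be supplied for every grid, numbered from 0, with the remainder grid last. The following sketch generates the corresponding input lines from a list of prefixes; the function name, the example prefixes, and the column spacing are hypothetical, and the lines assume the usual "parameter value" layout of a DOCK input file.

```python
def multigrid_prefix_lines(selected_prefixes, remainder_prefix):
    """Build input lines for the grid count and each numbered grid prefix.

    selected_prefixes : prefixes of the N selected grids, in order
    remainder_prefix  : prefix of the remainder grid, which is placed last
    """
    prefixes = list(selected_prefixes) + [remainder_prefix]
    lines = [f"multigrid_score_number_of_grids    {len(prefixes)}"]
    for index, prefix in enumerate(prefixes):  # numbering starts at 0
        lines.append(f"multigrid_score_grid_prefix{index}    {prefix}")
    return lines


if __name__ == "__main__":
    # Hypothetical prefixes: two selected grids plus one remainder grid.
    for line in multigrid_prefix_lines(["multigrid0", "multigrid1"],
                                       "multigrid_remainder"):
        print(line)
```

The grid count is always the number of selected grids plus one, which matches the "N selected grids plus one remainder grid" convention in the table.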

Parameter	Description	Default Value
descriptor_multigrid_score_rep_rad_scale	Scale the VDW repulsive exponent only by this amount.	1.0
descriptor_multigrid_score_vdw_scale	Scale both VDW terms in the molecular mechanics interaction function by this amount.	1.0
descriptor_multigrid_score_es_scale	Scale both ES terms in the molecular mechanics interaction function by this amount.	20
descriptor_multigrid_score_number_of_grids	The number of grids to be used, which typically includes N selected grids plus one remainder grid.	20
descriptor_multigrid_score_grid_prefix0	Provide prefixes to identify the grids. Note that the first grid starts at '0'. The last grid should be the remainder grid. This must be done for each grid.	multigrid0
descriptor_multigrid_score_individual_rec_ensemble	Flag for individual receptor (standard) or multiple receptor (not implemented yet).	no
descriptor_multigrid_score_weights_text	Flag for providing a textfile as input for the reference footprint.	no
descriptor_multigrid_score_footprint_text	Name of the reference footprint input text file, when descriptor_multigrid_score_weights_text is turned on.	reference.txt
descriptor_multigrid_score_fp_ref_mol	Flag for providing a MOL2 as input for the reference footprint.	no
descriptor_multigrid_score_footprint_ref	Name of the reference footprint input MOL2 file, when descriptor_multigrid_score_fp_ref_mol is turned on.	reference.mol2
descriptor_multigrid_score_foot_compare_type	Footprint similarity calculation method (Options: Euclidean, Pearson). If Pearson, the correlation coefficient is used as the metric to compare the footprints: a value of 1 indicates perfect agreement between the two footprints, and a value of 0 indicates poor agreement. If Euclidean, the Euclidean distance is used as the metric: a value of 0 indicates perfect agreement, and the value increases as the agreement between the two footprints gets worse.	Euclidean
descriptor_multigrid_score_normalize_foot	Normalization is used only with Euclidean distance.	no
descriptor_multigrid_score_vdw_euc_scale	Scaling factor for the VDW term when using Euclidean distance.	1.0
descriptor_multigrid_score_es_euc_scale	Scaling factor for the ES term when using Euclidean distance.	1.0
descriptor_multigrid_score_vdw_norm_scale	Scaling factor for the VDW term when using normalized Euclidean distance.	10.0
descriptor_multigrid_score_es_norm_scale	Scaling factor for the ES term when using normalized Euclidean distance.	10.0
descriptor_multigrid_score_vdw_cor_scale	Scaling factor for the VDW term when using the Pearson correlation similarity metric for footprint comparison.	-10.0
descriptor_multigrid_score_es_cor_scale	Scaling factor for the ES term when using the Pearson correlation similarity metric for footprint comparison.	-10.0

Parameter	Description	Default Value
pharmacophore_score_primary	Flag to perform FMS scoring as the primary scoring function	no
fms_score_use_ref_mol2	Use a molecule to calculate pharmacophore reference	no
fms_score_ref_mol2_filename	Molecule reference input file name.	Ph4.mol2
fms_score_use_ref_txt	Use a text format pharmacophore reference.	no
fms_score_ref_txt_filename	Text reference input file name.	Ph4.txt
fms_score_write_reference_pharmacophore_mol2	Flag to write the reference pharmacophore model as a mol2 output file.	no
fms_score_write_reference_ph4_txt	Flag to write the reference pharmacophore model as a txt output file.	no
fms_score_reference_output_mol2_filename	Reference pharmacophore mol2 output file name.	ref_ph4.mol2
fms_score_reference_output_txt_filename	Reference pharmacophore txt output file name.	ref_ph4.txt
fms_score_write_candidate_pharmacophore	Flag to write the candidate pharmacophore model as a mol2 output file.	no
fms_score_candidate_output_filename	Candidate pharmacophore output file name	cad_ph4.mol2
fms_score_write_matched_pharmacophore	Flag to write the matched pharmacophore model as a mol2 output file. The matched pharmacophore model, which consists of pharmacophore points well matched to any reference pharmacophore point, is a subset of the candidate pharmacophore model.	no
fms_score_matched_output_filename	Matched pharmacophore output file name.	mat_ph4.mol2
fms_score_compare_type	Flag to determine the comparison method between reference and candidate ph4 (Options: overlap, compatible). If overlap, the user is using a ligand-based reference for computing the FMS: a value of 0 indicates a perfect overlap, a negative value indicates multi-matched ph4, and a positive value indicates matches with residual. If compatible (under development and not currently available), the user is using a receptor-based reference for computing the FMS: a value of X indicates a perfect overlap, a value of Y indicates multi-matched ph4, and a value of Z indicates matches with residual.	overlap
fms_score_full_match	Flag to determine if full match is desired. Currently only full match is considered.	yes
fms_score_match_rate_weight	Specify the constant parameter k (weight on the match rate term) in FMS score	5
fms_score_match_proj_cutoff	Specify the scalar projection cutoff σ in the pharmacophore matching protocol. The default value cos(45°) ≈ 0.7071 corresponds to a vector angle cutoff of 45°.	0.7071
fms_score_max_score	Specify the FMS score value for pharmacophore model pairs with no matches. This maximum FMS score depends on k, r and σ.	20
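
The relation between fms_score_match_proj_cutoff and a vector angle cutoff can be checked numerically. The sketch below converts an angle to a scalar projection cutoff and tests whether two direction vectors pass it; the function names are hypothetical, and the use of "greater than or equal to" for the comparison is an assumption, not a statement about the DOCK implementation.

```python
import math


def projection_cutoff(angle_degrees):
    """Scalar projection cutoff equivalent to a vector angle cutoff."""
    return math.cos(math.radians(angle_degrees))


def directions_match(u, v, cutoff=0.7071):
    """True if the projection of unit(u) onto unit(v) reaches the cutoff."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("direction vectors must be nonzero")
    projection = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    return projection >= cutoff


if __name__ == "__main__":
    print("cutoff for 45 degrees:", round(projection_cutoff(45.0), 4))
    x_axis = (1.0, 0.0, 0.0)
    for angle in (30.0, 60.0):
        rad = math.radians(angle)
        other = (math.cos(rad), math.sin(rad), 0.0)
        print(angle, directions_match(x_axis, other))
```

A larger cutoff corresponds to a smaller allowed angle between the two vectors, so raising the value makes the matching stricter.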