PRODIGY Nextflow Pipeline
A Nextflow pipeline for predicting binding affinity of protein-protein complexes using PRODIGY (PROtein binDIng enerGY prediction).
Overview
PRODIGY is a contact-based method for predicting the binding affinity of protein-protein complexes from their 3D structures. This pipeline containerizes PRODIGY using Docker and orchestrates execution through Nextflow, enabling reproducible, scalable analysis of protein-protein interactions.
Key Features
- Automated binding affinity prediction from PDB/mmCIF structures
- Batch processing of multiple protein complexes
- Docker containerization for reproducibility
- Configurable parameters for distance cutoffs, temperature, and chain selection
- Optional outputs including contact lists and PyMOL visualization scripts
Scientific Background
PRODIGY predicts binding affinity by analyzing intermolecular contacts (ICs) at protein-protein interfaces. The method:
- Identifies residue-residue contacts within a distance threshold (default: 5.5 Å)
- Classifies contacts by residue type (charged, polar, apolar)
- Analyzes the non-interacting surface (NIS) composition
- Predicts binding free energy (ΔG) and dissociation constant (Kd)
The 5.5 Å distance cutoff was optimized to capture various non-bonded interactions including salt bridges, hydrogen bonds, and hydrophobic contacts.
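The contact-counting and classification steps above can be sketched as follows. This is a minimal illustration, assuming atoms are already parsed into (residue name, residue number, coordinates) tuples. The residue class table and the helper names `residue_contacts` and `classify_contacts` are assumptions made for this example, not PRODIGY's API, and PRODIGY's own class assignments and distance test may differ.

```python
import math
from collections import Counter

# Illustrative residue classes (an assumption for this sketch;
# PRODIGY ships its own table, which may differ).
RESIDUE_CLASS = {
    "ARG": "charged", "LYS": "charged", "ASP": "charged",
    "GLU": "charged", "HIS": "charged",
    "ASN": "polar", "GLN": "polar", "SER": "polar", "THR": "polar",
    "ALA": "apolar", "CYS": "apolar", "GLY": "apolar", "ILE": "apolar",
    "LEU": "apolar", "MET": "apolar", "PHE": "apolar", "PRO": "apolar",
    "TRP": "apolar", "TYR": "apolar", "VAL": "apolar",
}


def residue_contacts(atoms_a, atoms_b, cutoff=5.5):
    """Return residue pairs with at least one atom pair within `cutoff` (Å).

    Each atom is a tuple: (residue_name, residue_number, (x, y, z)).
    """
    pairs = set()
    for name_a, num_a, xyz_a in atoms_a:
        for name_b, num_b, xyz_b in atoms_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                pairs.add(((name_a, num_a), (name_b, num_b)))
    return pairs


def classify_contacts(pairs):
    """Count contacts per class pair, e.g. 'apolar-charged'.

    Labels are sorted alphabetically, so the order of the two classes
    can differ from the order used in PRODIGY's report.
    """
    counts = Counter()
    for (name_a, _), (name_b, _) in pairs:
        classes = sorted((RESIDUE_CLASS[name_a], RESIDUE_CLASS[name_b]))
        counts["-".join(classes)] += 1
    return counts


# Toy coordinates, invented purely for illustration.
chain_a = [("LYS", 15, (0.0, 0.0, 0.0)), ("ALA", 16, (20.0, 0.0, 0.0))]
chain_b = [
    ("ASP", 189, (3.0, 0.0, 0.0)),
    ("LEU", 99, (23.0, 0.0, 4.0)),
    ("SER", 190, (60.0, 0.0, 0.0)),
]

contacts = residue_contacts(chain_a, chain_b)
print(len(contacts), dict(classify_contacts(contacts)))
```

The nested loop is quadratic in the number of atoms, which is acceptable for a sketch; a real implementation would normally use a spatial index.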
Requirements
Software Dependencies
- Nextflow (≥21.04.0)
- Docker (≥20.10) or Singularity (≥3.0)
Hardware Requirements
- CPU: 1+ cores per process
- Memory: at least 4 GB recommended
- Storage: ~2 GB for Docker image
Installation
1. Clone or Download the Pipeline
# Create pipeline directory
mkdir -p /path/to/prodigy_pipeline
cd /path/to/prodigy_pipeline
# Copy pipeline files (Dockerfile, main.nf, nextflow.config, params.json)
2. Build the Docker Image
docker build -t prodigy:latest .
3. Verify Installation
# Test Docker image
docker run --rm prodigy:latest prodigy --help
# Test Nextflow
nextflow run main.nf --help
Usage
Basic Usage
# Run on a single PDB file
nextflow run main.nf --pdb /path/to/complex.pdb --outdir /path/to/output
# Run on multiple PDB files
nextflow run main.nf --pdb '/path/to/structures/*.pdb' --outdir /path/to/output
With Custom Parameters
nextflow run main.nf \
--pdb '/path/to/structures/*.pdb' \
--outdir /path/to/output \
--distance_cutoff 5.5 \
--acc_threshold 0.05 \
--temperature 37.0 \
--contact_list true \
--pymol_selection true
Chain Selection for Complex Interfaces
For antibody-antigen complexes or multi-chain proteins:
# Contacts between chains A and B only
nextflow run main.nf --pdb complex.pdb --selection 'A B'
# Heavy (H) and Light (L) chains as one molecule vs Antigen (A)
nextflow run main.nf --pdb antibody_antigen.pdb --selection 'H,L A'
# Three-way interface calculation
nextflow run main.nf --pdb complex.pdb --selection 'A B C'
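The selection strings above can be read as space-separated groups of comma-separated chain IDs, with contacts counted only between chains in different groups. The sketch below illustrates that reading. `parse_selection` and `chain_pairs` are hypothetical helpers written for this example, and PRODIGY's own parser may behave differently.

```python
def parse_selection(selection):
    """Split a selection such as 'H,L A' into groups of chain IDs."""
    return [tuple(group.split(",")) for group in selection.split()]


def chain_pairs(groups):
    """List chain pairs that span two different groups.

    Chains inside the same group are treated as one molecule,
    so they are never paired with each other.
    """
    pairs = []
    for i, group_i in enumerate(groups):
        for group_j in groups[i + 1:]:
            pairs.extend((a, b) for a in group_i for b in group_j)
    return pairs


for selection in ("A B", "H,L A", "A B C"):
    groups = parse_selection(selection)
    print(selection, "->", chain_pairs(groups))
```

Under this reading, 'H,L A' produces the pairs H-A and L-A but not H-L, which matches the "one molecule vs antigen" description.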
Using Singularity
nextflow run main.nf -profile singularity --pdb /path/to/complex.pdb
Parameters
Required Parameters
| Parameter | Description | Default |
|---|---|---|
| --pdb | Path to input PDB/mmCIF file(s). Supports glob patterns. | /mnt/OmicNAS/private/old/olamide/Prodigy/input/*.pdb |
| --outdir | Output directory for results | /mnt/OmicNAS/private/old/olamide/Prodigy/output |
Analysis Parameters
| Parameter | Description | Default | Range |
|---|---|---|---|
| --distance_cutoff | Distance threshold (Å) for defining intermolecular contacts | 5.5 | 1.0 to 20.0 |
| --acc_threshold | Relative accessibility threshold for surface residue identification | 0.05 | 0.0 to 1.0 |
| --temperature | Temperature (°C) for Kd calculation | 25.0 | -273.15 to 100.0 |
| --selection | Chain selection for interface calculation | '' (all chains) | See examples |
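The documented ranges can be checked before launching a run. The sketch below is a hypothetical pre-flight check, not part of the pipeline: `validate_params` is an illustrative name, and whether main.nf enforces these ranges itself is not stated here.

```python
# Ranges copied from the Analysis Parameters table.
PARAM_RANGES = {
    "distance_cutoff": (1.0, 20.0),
    "acc_threshold": (0.0, 1.0),
    "temperature": (-273.15, 100.0),
}


def validate_params(params):
    """Return a list of messages for values outside the documented range."""
    errors = []
    for name, (low, high) in PARAM_RANGES.items():
        if name not in params:
            continue  # Parameter not supplied; the pipeline default applies.
        value = float(params[name])
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside [{low}, {high}]")
    return errors


print(validate_params({"distance_cutoff": 5.5, "temperature": 37.0}))
print(validate_params({"distance_cutoff": 25, "acc_threshold": 0.05}))
```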
Output Control Parameters
| Parameter | Description | Default |
|---|---|---|
| --contact_list | Generate detailed contact list file | false |
| --pymol_selection | Generate PyMOL visualization script | false |
| --quiet | Output only affinity values (minimal output) | false |
Output Files
Standard Output
For each input structure <name>.pdb, the pipeline generates:
| File | Description |
|---|---|
| <name>_prodigy.txt | Main results file with binding affinity prediction |
Optional Output (when enabled)
| File | Description | Parameter |
|---|---|---|
| <name>_contacts.txt | List of all interface contacts | --contact_list true |
| <name>_interface.pml | PyMOL script for interface visualization | --pymol_selection true |
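The naming scheme in the tables above can be expressed as a small helper, for example to check that a run produced everything it should. This is a sketch that assumes `<name>` is the input filename without its extension. `expected_outputs` is a hypothetical function, not part of the pipeline.

```python
from pathlib import Path


def expected_outputs(structure, contact_list=False, pymol_selection=False):
    """Return the output filenames expected for one input structure."""
    name = Path(structure).stem  # '/data/1ppe.pdb' -> '1ppe'
    files = [f"{name}_prodigy.txt"]
    if contact_list:
        files.append(f"{name}_contacts.txt")
    if pymol_selection:
        files.append(f"{name}_interface.pml")
    return files


print(expected_outputs("/data/1ppe.pdb", contact_list=True, pymol_selection=True))
```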
Example Output
[!] Structure contains gaps:
E ILE16 < Fragment 0 > E ALA183
E TYR184 < Fragment 1 > E GLY187
[+] Executing 1 task(s) in total
##########################################
[+] Processing structure 1ppe_model0
[+] No. of intermolecular contacts: 86
[+] No. of charged-charged contacts: 5.0
[+] No. of charged-polar contacts: 10.0
[+] No. of charged-apolar contacts: 27.0
[+] No. of polar-polar contacts: 0.0
[+] No. of apolar-polar contacts: 20.0
[+] No. of apolar-apolar contacts: 24.0
[+] Percentage of apolar NIS residues: 34.10
[+] Percentage of charged NIS residues: 18.50
[++] Predicted binding affinity (kcal.mol-1): -14.7
[++] Predicted dissociation constant (M) at 25.0˚C: 1.6e-11
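For batch runs it can help to pull the key numbers out of each report. The sketch below parses the three lines of interest with regular expressions written against the example output above. `parse_prodigy_report` and the dictionary keys are illustrative names, and the patterns assume the report wording does not change between PRODIGY versions.

```python
import re

# One pattern per field, paired with the type to convert the match to.
PATTERNS = {
    "contacts": (
        re.compile(r"No\. of intermolecular contacts:\s*(\d+)"),
        int,
    ),
    "dg_kcal_mol": (
        re.compile(r"Predicted binding affinity \(kcal\.mol-1\):\s*(-?\d+(?:\.\d+)?)"),
        float,
    ),
    "kd_molar": (
        re.compile(r"Predicted dissociation constant \(M\) at [^:]*:\s*([-+.\deE]+)"),
        float,
    ),
}


def parse_prodigy_report(text):
    """Extract contact count, ΔG and Kd from a PRODIGY text report.

    Fields that are not found are simply left out of the result.
    """
    result = {}
    for key, (pattern, convert) in PATTERNS.items():
        match = pattern.search(text)
        if match:
            result[key] = convert(match.group(1))
    return result


# Lines copied from the example output above.
SAMPLE_REPORT = """\
[+] No. of intermolecular contacts: 86
[++] Predicted binding affinity (kcal.mol-1): -14.7
[++] Predicted dissociation constant (M) at 25.0˚C: 1.6e-11
"""

print(parse_prodigy_report(SAMPLE_REPORT))
```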
Output Interpretation
| Metric | Description |
|---|---|
| Intermolecular contacts | Total number of residue-residue contacts at interface |
| Contact types | Breakdown by residue character (charged/polar/apolar) |
| NIS residues | Composition of non-interacting surface |
| Binding affinity (ΔG) | Predicted free energy of binding (kcal/mol). More negative = stronger binding |
| Dissociation constant (Kd) | Predicted Kd at specified temperature. Lower = tighter binding |
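ΔG and Kd are linked by the standard relation ΔG = RT ln(Kd), which is why the reported Kd depends on the temperature setting. A minimal sketch of the conversion follows, assuming the gas constant in kcal/(mol·K) and a temperature in °C as passed to --temperature. `dg_to_kd` is an illustrative helper and is not necessarily the function PRODIGY uses internally.

```python
import math

GAS_CONSTANT = 0.0019858775  # kcal / (mol * K)


def dg_to_kd(dg, temperature_c=25.0):
    """Convert a binding free energy (kcal/mol) to Kd (M) via Kd = exp(ΔG / RT)."""
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    return math.exp(dg / rt)


# The same ΔG gives a larger (weaker) Kd at a higher temperature.
for temperature in (25.0, 37.0):
    print(temperature, f"{dg_to_kd(-14.7, temperature):.2e}")
```

For ΔG = -14.7 kcal/mol at 25 °C this gives a Kd on the order of 1.6e-11 to 1.7e-11 M, in line with the example output once the rounding of ΔG to one decimal place is taken into account.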
Binding Affinity Scale
| ΔG (kcal/mol) | Approx. Kd (M) at 25 °C | Binding Strength |
|---|---|---|
| -6 to -8 | 10⁻⁵ to 10⁻⁶ | Moderate |
| -8 to -10 | 10⁻⁶ to 10⁻⁷ | Strong |
| -10 to -12 | 10⁻⁷ to 10⁻⁹ | Very Strong |
| < -12 | < 10⁻⁹ | Extremely Strong |
Test Data
Download example protein complexes from the RCSB PDB:
# Create input directory
mkdir -p /mnt/OmicNAS/private/old/olamide/Prodigy/input
# Download test structures
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/3bzd.pdb https://files.rcsb.org/download/3BZD.pdb
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/2oob.pdb https://files.rcsb.org/download/2OOB.pdb
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/1ppe.pdb https://files.rcsb.org/download/1PPE.pdb
Expected Results
| Structure | Description | Expected ΔG (kcal/mol) |
|---|---|---|
| 3BZD | Protein-protein complex | -9.4 |
| 2OOB | Protein-protein complex | -6.2 |
| 1PPE | Trypsin-inhibitor complex | -14.7 |
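A test run can be compared against the table above with a few lines of code. This sketch assumes the predicted ΔG values have already been collected into a dictionary keyed by PDB ID. `check_results` and the 0.1 kcal/mol tolerance are assumptions made for this example, not part of the pipeline.

```python
# Expected values copied from the table above.
EXPECTED_DG = {"3BZD": -9.4, "2OOB": -6.2, "1PPE": -14.7}


def check_results(observed, expected=EXPECTED_DG, tolerance=0.1):
    """Return {pdb_id: (observed, expected)} for missing or mismatched entries."""
    failures = {}
    for pdb_id, want in expected.items():
        got = observed.get(pdb_id)
        if got is None or abs(got - want) > tolerance:
            failures[pdb_id] = (got, want)
    return failures


# Hypothetical run in which one structure deviates and one is missing.
print(check_results({"3BZD": -9.4, "2OOB": -7.0}))
```

An empty dictionary means every expected structure was found and matched within the tolerance.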
Pipeline Structure
prodigy_pipeline/
├── Dockerfile # Docker image definition
├── main.nf # Nextflow pipeline script
├── nextflow.config # Pipeline configuration
├── params.json # Parameter documentation
└── README.md # This file
Docker Image Details
The Docker image is based on Python 3.12 and includes:
- prodigy-prot (v2.4.0) - Main PRODIGY package
- biopython (≥1.80) - PDB structure parsing
- freesasa (≥2.2.1) - Solvent accessible surface area calculation
- numpy (≥2) - Numerical computations
Building the Image
docker build -t prodigy:latest .
Running Standalone
# Run PRODIGY directly
docker run --rm -v /path/to/data:/data prodigy:latest prodigy /data/complex.pdb
# Get help
docker run --rm prodigy:latest prodigy --help
Troubleshooting
Common Issues
1. Docker Hub Rate Limit Error
ERROR: toomanyrequests: You have reached your pull rate limit
Solution: Log in to Docker Hub with docker login or wait and retry.
2. Structure Contains Gaps Warning
[!] Structure contains gaps
This warning is informational, not an error. PRODIGY still runs when residues are missing, but gaps near the interface can reduce the number of contacts it detects.
3. No Intermolecular Contacts Found
- Verify the structure contains multiple chains
- Check chain selection parameters
- Ensure chains are in contact (within distance cutoff)
4. Permission Denied Errors
# Run with user permissions
docker run --rm -u $(id -u):$(id -g) -v /path/to/data:/data prodigy:latest prodigy /data/complex.pdb
Getting Help
# PRODIGY help
docker run --rm prodigy:latest prodigy --help
# Nextflow pipeline help
nextflow run main.nf --help
Citation
If you use this pipeline, please cite the following publications:
PRODIGY Method
- Xue LC, Rodrigues JP, Kastritis PL, Bonvin AM, Vangone A. (2016) PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. Bioinformatics, 32(23):3676-3678. DOI: 10.1093/bioinformatics/btw514
- Vangone A, Bonvin AM. (2015) Contacts-based prediction of binding affinity in protein-protein complexes. eLife, 4:e07454. DOI: 10.7554/eLife.07454
- Kastritis PL, Rodrigues JP, Folkers GE, Boelens R, Bonvin AM. (2014) Proteins feel more than they see: Fine-tuning of binding affinity by properties of the non-interacting surface. Journal of Molecular Biology, 426(14):2632-2652. DOI: 10.1016/j.jmb.2014.04.017
Software Dependencies
- Nextflow: Di Tommaso P, et al. (2017) Nextflow enables reproducible computational workflows. Nature Biotechnology, 35:316-319.
- Biopython: Cock PJ, et al. (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11):1422-1423.
- FreeSASA: Mitternacht S. (2016) FreeSASA: An open source C library for solvent accessible surface area calculations. F1000Research, 5:189.
License
This pipeline is distributed under the Apache License 2.0, consistent with the PRODIGY software license.
Links
- PRODIGY Web Server: https://wenmr.science.uu.nl/prodigy/
- PRODIGY GitHub: https://github.com/haddocking/prodigy
- BonvinLab: https://www.bonvinlab.org/
- Nextflow: https://www.nextflow.io/
Support
For questions about:
- PRODIGY method: Contact the BonvinLab team at ask.bioexcel.eu
- This pipeline: Open an issue in the repository
Pipeline version: 2.4.0 | Last updated: January 2026