Olamide Isreal c8fd6b4084
Fix input path to s3://omic/eureka/prodigy/*.pdb
2026-03-17 16:44:26 +01:00

PRODIGY Nextflow Pipeline

A Nextflow pipeline for predicting binding affinity of protein-protein complexes using PRODIGY (PROtein binDIng enerGY prediction).

Overview

PRODIGY is a contact-based method for predicting the binding affinity of protein-protein complexes from their 3D structures. This pipeline containerizes PRODIGY using Docker and orchestrates execution through Nextflow, enabling reproducible, scalable analysis of protein-protein interactions.

Key Features

  • Automated binding affinity prediction from PDB/mmCIF structures
  • Batch processing of multiple protein complexes
  • Docker containerization for reproducibility
  • Configurable parameters for distance cutoffs, temperature, and chain selection
  • Optional outputs including contact lists and PyMOL visualization scripts

Scientific Background

PRODIGY predicts binding affinity by analyzing intermolecular contacts (ICs) at protein-protein interfaces. The method:

  1. Identifies residue-residue contacts within a distance threshold (default: 5.5 Å)
  2. Classifies contacts by residue type (charged, polar, apolar)
  3. Analyzes the non-interacting surface (NIS) composition
  4. Predicts binding free energy (ΔG) and dissociation constant (Kd)

The 5.5 Å distance cutoff was optimized to capture various non-bonded interactions including salt bridges, hydrogen bonds, and hydrophobic contacts.
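
As a rough illustration of step 4, the linear model behind the ΔG prediction can be written out by hand. The function name predict_dg is ours, and the coefficients are quoted from the model published by Vangone and Bonvin (2015) as we understand it; treat them as an assumption and check them against the installed prodigy-prot version. The inputs are the contact counts and NIS percentages reported for 1PPE in the Example Output section.

```python
def predict_dg(ic_cc, ic_ca, ic_pp, ic_pa, nis_apolar_pct, nis_charged_pct):
    """Predicted binding free energy (kcal/mol).

    ic_cc, ic_ca, ic_pp, ic_pa: charged-charged, charged-apolar,
    polar-polar and apolar-polar contact counts.
    nis_apolar_pct, nis_charged_pct: percentage of apolar and charged
    residues on the non-interacting surface.
    Coefficients are assumed from the published model, not read from PRODIGY.
    """
    return (
        -0.09459 * ic_cc
        - 0.10007 * ic_ca
        + 0.19577 * ic_pp
        - 0.22671 * ic_pa
        + 0.18681 * nis_apolar_pct
        + 0.13810 * nis_charged_pct
        - 15.9433
    )


# Values copied from the 1PPE example output in this README.
dg = predict_dg(ic_cc=5, ic_ca=27, ic_pp=0, ic_pa=20,
                nis_apolar_pct=34.10, nis_charged_pct=18.50)
print(f"{dg:.1f}")
```

The result rounds to -14.7 kcal/mol, which matches the value PRODIGY reports for this structure.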

Requirements

Software Dependencies

  • Nextflow
  • Docker, or Singularity when running with -profile singularity

Hardware Requirements

  • CPU: 1+ cores per process
  • Memory: 4 GB minimum recommended
  • Storage: ~2 GB for Docker image

Installation

1. Clone or Download the Pipeline

# Create pipeline directory
mkdir -p /path/to/prodigy_pipeline
cd /path/to/prodigy_pipeline

# Copy pipeline files (Dockerfile, main.nf, nextflow.config, params.json)

2. Build the Docker Image

docker build -t prodigy:latest .

3. Verify Installation

# Test Docker image
docker run --rm prodigy:latest prodigy --help

# Test Nextflow
nextflow run main.nf --help

Usage

Basic Usage

# Run on a single PDB file
nextflow run main.nf --pdb /path/to/complex.pdb --outdir /path/to/output

# Run on multiple PDB files
nextflow run main.nf --pdb '/path/to/structures/*.pdb' --outdir /path/to/output

With Custom Parameters

nextflow run main.nf \
    --pdb '/path/to/structures/*.pdb' \
    --outdir /path/to/output \
    --distance_cutoff 5.5 \
    --acc_threshold 0.05 \
    --temperature 37.0 \
    --contact_list true \
    --pymol_selection true

Chain Selection for Complex Interfaces

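For antibody-antigen complexes or multi-chain proteins, use --selection to define the interacting partners. Chain IDs separated by a space form separate partners; chain IDs joined by a comma are treated as a single molecule: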

# Contacts between chains A and B only
nextflow run main.nf --pdb complex.pdb --selection 'A B'

# Heavy (H) and Light (L) chains as one molecule vs Antigen (A)
nextflow run main.nf --pdb antibody_antigen.pdb --selection 'H,L A'

# Three-way interface calculation
nextflow run main.nf --pdb complex.pdb --selection 'A B C'
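
To make the selection syntax concrete, the sketch below splits a selection string into groups of chain IDs. It only illustrates how the examples above are read; parse_selection is a hypothetical helper and is not PRODIGY's own parser.

```python
def parse_selection(selection):
    """Split a selection string into groups of chain IDs.

    Spaces separate interacting groups; commas join chains within one group.
    """
    groups = [group.split(",") for group in selection.split()]
    if len(groups) < 2:
        raise ValueError("need at least two groups to define an interface")
    return groups


print(parse_selection("H,L A"))   # [['H', 'L'], ['A']]
print(parse_selection("A B C"))   # [['A'], ['B'], ['C']]
```

An empty selection is rejected here, whereas the pipeline treats an empty --selection as "all chains".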

Using Singularity

nextflow run main.nf -profile singularity --pdb /path/to/complex.pdb

Parameters

Required Parameters

Parameter   Description                                                Default
--pdb       Path to input PDB/mmCIF file(s). Supports glob patterns.   /mnt/OmicNAS/private/old/olamide/Prodigy/input/*.pdb
--outdir    Output directory for results                               /mnt/OmicNAS/private/old/olamide/Prodigy/output

Analysis Parameters

Parameter           Description                                                           Default           Range
--distance_cutoff   Distance threshold (Å) for defining intermolecular contacts           5.5               1.0 to 20.0
--acc_threshold     Relative accessibility threshold for surface residue identification   0.05              0.0 to 1.0
--temperature       Temperature (°C) for Kd calculation                                   25.0              -273.15 to 100.0
--selection         Chain selection for interface calculation                             '' (all chains)   See examples
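
A small pre-flight check can catch out-of-range values before a long batch run starts. The sketch below encodes the ranges from the table above; validate_params and RANGES are hypothetical names, and nothing like this is claimed to exist in main.nf.

```python
# Allowed ranges, copied from the Analysis Parameters table.
RANGES = {
    "distance_cutoff": (1.0, 20.0),
    "acc_threshold": (0.0, 1.0),
    "temperature": (-273.15, 100.0),
}


def validate_params(params):
    """Return a list of error messages for values outside the documented ranges."""
    errors = []
    for name, (low, high) in RANGES.items():
        if name not in params:
            continue  # unset parameters fall back to the pipeline defaults
        value = float(params[name])
        if not low <= value <= high:
            errors.append(f"--{name}={value} is outside [{low}, {high}]")
    return errors


print(validate_params({"distance_cutoff": 5.5, "temperature": 37.0}))  # []
```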

Output Control Parameters

Parameter           Description                                    Default
--contact_list      Generate detailed contact list file            false
--pymol_selection   Generate PyMOL visualization script            false
--quiet             Output only affinity values (minimal output)   false

Output Files

Standard Output

For each input structure <name>.pdb, the pipeline generates:

File                 Description
<name>_prodigy.txt   Main results file with binding affinity prediction

Optional Output (when enabled)

File                   Description                                Parameter
<name>_contacts.txt    List of all interface contacts             --contact_list true
<name>_interface.pml   PyMOL script for interface visualization   --pymol_selection true
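
After a batch run it can help to confirm that every input produced its files. The sketch below derives the expected file names from the naming scheme above. It assumes the files are written directly into the --outdir directory, which may not match how the pipeline publishes results; expected_outputs and missing_outputs are hypothetical helpers.

```python
from pathlib import Path


def expected_outputs(structure, outdir, contact_list=False, pymol_selection=False):
    """List the output paths expected for one input structure."""
    name = Path(structure).stem          # "1ppe.pdb" -> "1ppe"
    out = Path(outdir)
    files = [out / f"{name}_prodigy.txt"]
    if contact_list:
        files.append(out / f"{name}_contacts.txt")
    if pymol_selection:
        files.append(out / f"{name}_interface.pml")
    return files


def missing_outputs(structures, outdir, **flags):
    """Return the expected output paths that do not exist on disk."""
    return [
        path
        for structure in structures
        for path in expected_outputs(structure, outdir, **flags)
        if not path.exists()
    ]


for path in expected_outputs("input/1ppe.pdb", "output", contact_list=True):
    print(path.name)
```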

Example Output

[!] Structure contains gaps:
    E ILE16 < Fragment 0 > E ALA183
    E TYR184 < Fragment 1 > E GLY187

[+] Executing 1 task(s) in total
##########################################
[+] Processing structure 1ppe_model0
[+] No. of intermolecular contacts: 86
[+] No. of charged-charged contacts: 5.0
[+] No. of charged-polar contacts: 10.0
[+] No. of charged-apolar contacts: 27.0
[+] No. of polar-polar contacts: 0.0
[+] No. of apolar-polar contacts: 20.0
[+] No. of apolar-apolar contacts: 24.0
[+] Percentage of apolar NIS residues: 34.10
[+] Percentage of charged NIS residues: 18.50
[++] Predicted binding affinity (kcal.mol-1):    -14.7
[++] Predicted dissociation constant (M) at 25.0˚C:  1.6e-11

Output Interpretation

Metric                       Description
Intermolecular contacts      Total number of residue-residue contacts at the interface
Contact types                Breakdown by residue character (charged/polar/apolar)
NIS residues                 Percentage of apolar and charged residues on the non-interacting surface
Binding affinity (ΔG)        Predicted free energy of binding (kcal/mol). More negative = stronger binding
Dissociation constant (Kd)   Predicted Kd at the specified temperature. Lower = tighter binding
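
When many structures are processed, the key metrics can be collected from each <name>_prodigy.txt with a few regular expressions. This sketch assumes the report has the layout shown under Example Output, so it will not match the reduced --quiet output; parse_prodigy_report is a hypothetical helper, not part of PRODIGY or this pipeline.

```python
import re

IC_RE = re.compile(r"No\. of intermolecular contacts:\s*(\d+)")
DG_RE = re.compile(r"Predicted binding affinity \(kcal\.mol-1\):\s*(-?\d+(?:\.\d+)?)")
KD_RE = re.compile(r"Predicted dissociation constant \(M\) at .*?:\s*([0-9.eE+-]+)")


def parse_prodigy_report(text):
    """Extract contact count, ΔG (kcal/mol) and Kd (M) from a PRODIGY report."""
    ic = IC_RE.search(text)
    dg = DG_RE.search(text)
    kd = KD_RE.search(text)
    if not (ic and dg and kd):
        raise ValueError("report does not look like standard PRODIGY output")
    return {
        "contacts": int(ic.group(1)),
        "dg": float(dg.group(1)),
        "kd": float(kd.group(1)),
    }


# Three lines copied from the example output in this README.
sample = (
    "[+] No. of intermolecular contacts: 86\n"
    "[++] Predicted binding affinity (kcal.mol-1):    -14.7\n"
    "[++] Predicted dissociation constant (M) at 25.0˚C:  1.6e-11\n"
)
print(parse_prodigy_report(sample))
```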

Binding Affinity Scale

ΔG (kcal/mol)   Kd (M)          Binding Strength
-6 to -8        10⁻⁵ to 10⁻⁶    Moderate
-8 to -10       10⁻⁶ to 10⁻⁷    Strong
-10 to -12      10⁻⁷ to 10⁻⁹    Very Strong
< -12           < 10⁻⁹          Extremely Strong

Kd values are order-of-magnitude approximations at 25 °C.
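
The Kd column follows from ΔG through the standard relation ΔG = RT ln(Kd). A minimal sketch, assuming the textbook gas constant of 0.0019872 kcal/(mol·K); the helper name dg_to_kd is ours, and PRODIGY's own constant may differ in the last digits.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K), textbook value (assumption)


def dg_to_kd(dg_kcal_mol, temperature_c=25.0):
    """Dissociation constant (M) from binding free energy via dG = RT ln(Kd)."""
    rt = R_KCAL * (temperature_c + 273.15)
    return math.exp(dg_kcal_mol / rt)


# Kd at the boundaries of the scale, at the default 25 °C.
for dg in (-6, -8, -10, -12):
    print(dg, f"{dg_to_kd(dg):.1e}")
```

At 25 °C this gives roughly 4e-5 M for -6 kcal/mol and roughly 1.6e-9 M for -12 kcal/mol, in line with the scale. For the same ΔG, a higher --temperature yields a larger predicted Kd.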

Test Data

Download example protein complexes from the RCSB PDB:

# Create input directory
mkdir -p /mnt/OmicNAS/private/old/olamide/Prodigy/input

# Download test structures
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/3bzd.pdb https://files.rcsb.org/download/3BZD.pdb
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/2oob.pdb https://files.rcsb.org/download/2OOB.pdb
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/1ppe.pdb https://files.rcsb.org/download/1PPE.pdb

Expected Results

Structure   Description                 Expected ΔG (kcal/mol)
3BZD        Protein-protein complex     -9.4
2OOB        Protein-protein complex     -6.2
1PPE        Trypsin-inhibitor complex   -14.7

Pipeline Structure

prodigy_pipeline/
├── Dockerfile          # Docker image definition
├── main.nf             # Nextflow pipeline script
├── nextflow.config     # Pipeline configuration
├── params.json         # Parameter documentation
└── README.md           # This file

Docker Image Details

The Docker image is based on Python 3.12 and includes:

  • prodigy-prot (v2.4.0) - Main PRODIGY package
  • biopython (≥1.80) - PDB structure parsing
  • freesasa (≥2.2.1) - Solvent accessible surface area calculation
  • numpy (≥2) - Numerical computations

Building the Image

docker build -t prodigy:latest .

Running Standalone

# Run PRODIGY directly
docker run --rm -v /path/to/data:/data prodigy:latest prodigy /data/complex.pdb

# Get help
docker run --rm prodigy:latest prodigy --help

Troubleshooting

Common Issues

1. Docker Hub Rate Limit Error

ERROR: toomanyrequests: You have reached your pull rate limit

Solution: Log in to Docker Hub with docker login or wait and retry.

2. Structure Contains Gaps Warning

[!] Structure contains gaps

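This message is a warning, not an error: PRODIGY still runs on the residues that are present. If the missing residues sit at the interface, some contacts cannot be counted, so treat the prediction with more caution.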

3. No Intermolecular Contacts Found

  • Verify the structure contains multiple chains
  • Check chain selection parameters
  • Ensure chains are in contact (within distance cutoff)
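
A quick way to check the first point is to list the chain IDs found in the file. The sketch below reads the chain identifier from column 22 of ATOM records, so it applies to fixed-column PDB files only, not mmCIF; chain_ids is a hypothetical helper.

```python
def chain_ids(pdb_text):
    """Return chain IDs in order of first appearance, read from ATOM records."""
    seen = []
    for line in pdb_text.splitlines():
        # In the PDB format the chain identifier is column 22 (index 21).
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


def has_interface_candidates(pdb_text):
    """True when at least two chains are present, the minimum for an interface."""
    return len(chain_ids(pdb_text)) >= 2
```

For a file on disk, pass its contents, for example chain_ids(open("complex.pdb").read()), and compare the result with the chains named in --selection.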

4. Permission Denied Errors

# Run with user permissions
docker run --rm -u $(id -u):$(id -g) -v /path/to/data:/data prodigy:latest prodigy /data/complex.pdb

Getting Help

# PRODIGY help
docker run --rm prodigy:latest prodigy --help

# Nextflow pipeline help
nextflow run main.nf --help

Citation

If you use this pipeline, please cite the following publications:

PRODIGY Method

  1. Xue LC, Rodrigues JP, Kastritis PL, Bonvin AM, Vangone A. (2016) PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. Bioinformatics, 32(23):3676-3678. DOI: 10.1093/bioinformatics/btw514

  2. Vangone A, Bonvin AM. (2015) Contacts-based prediction of binding affinity in protein-protein complexes. eLife, 4:e07454. DOI: 10.7554/eLife.07454

  3. Kastritis PL, Rodrigues JP, Folkers GE, Boelens R, Bonvin AM. (2014) Proteins feel more than they see: Fine-tuning of binding affinity by properties of the non-interacting surface. Journal of Molecular Biology, 426(14):2632-2652. DOI: 10.1016/j.jmb.2014.04.017

Software Dependencies

  • Nextflow: Di Tommaso P, et al. (2017) Nextflow enables reproducible computational workflows. Nature Biotechnology, 35:316-319.
  • Biopython: Cock PJ, et al. (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11):1422-1423.
  • FreeSASA: Mitternacht S. (2016) FreeSASA: An open source C library for solvent accessible surface area calculations. F1000Research, 5:189.

License

This pipeline is distributed under the Apache License 2.0, consistent with the PRODIGY software license.

Support

For questions about:

  • PRODIGY method: Contact the BonvinLab team at ask.bioexcel.eu
  • This pipeline: Open an issue in the repository

Pipeline version: 2.4.0 | Last updated: January 2026
