# PRODIGY Nextflow Pipeline

A Nextflow pipeline for predicting binding affinity of protein-protein complexes using PRODIGY (PROtein binDIng enerGY prediction).

## Overview

PRODIGY is a contact-based method for predicting the binding affinity of protein-protein complexes from their 3D structures. This pipeline containerizes PRODIGY using Docker and orchestrates execution through Nextflow, enabling reproducible, scalable analysis of protein-protein interactions.

### Key Features

- **Automated binding affinity prediction** from PDB/mmCIF structures
- **Batch processing** of multiple protein complexes
- **Docker containerization** for reproducibility
- **Configurable parameters** for distance cutoffs, temperature, and chain selection
- **Optional outputs** including contact lists and PyMOL visualization scripts

## Scientific Background

PRODIGY predicts binding affinity by analyzing intermolecular contacts (ICs) at protein-protein interfaces. The method:

1. Identifies residue-residue contacts within a distance threshold (default: 5.5 Å)
2. Classifies contacts by residue type (charged, polar, apolar)
3. Analyzes the non-interacting surface (NIS) composition
4. Predicts binding free energy (ΔG) and dissociation constant (Kd)

The 5.5 Å distance cutoff was optimized to capture various non-bonded interactions including salt bridges, hydrogen bonds, and hydrophobic contacts.

## Requirements

### Software Dependencies

- [Nextflow](https://www.nextflow.io/) (≥21.04.0)
- [Docker](https://www.docker.com/) (≥20.10) or [Singularity](https://sylabs.io/singularity/) (≥3.0)

### Hardware Requirements

- CPU: 1+ cores per process
- Memory: 4 GB minimum recommended
- Storage: ~2 GB for Docker image

## Installation

### 1. Clone or Download the Pipeline

```bash
# Create pipeline directory
mkdir -p /path/to/prodigy_pipeline
cd /path/to/prodigy_pipeline

# Copy pipeline files (Dockerfile, main.nf, nextflow.config, params.json)
```

### 2. Build the Docker Image

```bash
docker build -t prodigy:latest .
```

### 3. Verify Installation

```bash
# Test Docker image
docker run --rm prodigy:latest prodigy --help

# Test Nextflow
nextflow run main.nf --help
```

## Usage

### Basic Usage

```bash
# Run on a single PDB file
nextflow run main.nf --pdb /path/to/complex.pdb --outdir /path/to/output

# Run on multiple PDB files
nextflow run main.nf --pdb '/path/to/structures/*.pdb' --outdir /path/to/output
```

### With Custom Parameters

```bash
nextflow run main.nf \
    --pdb '/path/to/structures/*.pdb' \
    --outdir /path/to/output \
    --distance_cutoff 5.5 \
    --acc_threshold 0.05 \
    --temperature 37.0 \
    --contact_list true \
    --pymol_selection true
```

### Chain Selection for Complex Interfaces

For antibody-antigen complexes or multi-chain proteins:

```bash
# Contacts between chains A and B only
nextflow run main.nf --pdb complex.pdb --selection 'A B'

# Heavy (H) and Light (L) chains as one molecule vs Antigen (A)
nextflow run main.nf --pdb antibody_antigen.pdb --selection 'H,L A'

# Three-way interface calculation
nextflow run main.nf --pdb complex.pdb --selection 'A B C'
```

### Using Singularity

```bash
nextflow run main.nf -profile singularity --pdb /path/to/complex.pdb
```

## Parameters

### Required Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `--pdb` | Path to input PDB/mmCIF file(s). Supports glob patterns. | `/mnt/OmicNAS/private/old/olamide/Prodigy/input/*.pdb` |
| `--outdir` | Output directory for results | `/mnt/OmicNAS/private/old/olamide/Prodigy/output` |

### Analysis Parameters

| Parameter | Description | Default | Range |
|-----------|-------------|---------|-------|
| `--distance_cutoff` | Distance threshold (Å) for defining intermolecular contacts | `5.5` | 1.0 - 20.0 |
| `--acc_threshold` | Relative accessibility threshold for surface residue identification | `0.05` | 0.0 - 1.0 |
| `--temperature` | Temperature (°C) for Kd calculation | `25.0` | -273.15 - 100.0 |
| `--selection` | Chain selection for interface calculation | `''` (all chains) | See examples |

### Output Control Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `--contact_list` | Generate detailed contact list file | `false` |
| `--pymol_selection` | Generate PyMOL visualization script | `false` |
| `--quiet` | Output only affinity values (minimal output) | `false` |

## Output Files

### Standard Output

For each input structure `<name>.pdb`, the pipeline generates:

| File | Description |
|------|-------------|
| `<name>_prodigy.txt` | Main results file with binding affinity prediction |

### Optional Output (when enabled)

| File | Description | Parameter |
|------|-------------|-----------|
| `<name>_contacts.txt` | List of all interface contacts | `--contact_list true` |
| `<name>_interface.pml` | PyMOL script for interface visualization | `--pymol_selection true` |

### Example Output

```
[!] Structure contains gaps:
    E ILE16 < Fragment 0 > E ALA183
    E TYR184 < Fragment 1 > E GLY187

[+] Executing 1 task(s) in total
##########################################
[+] Processing structure 1ppe_model0
[+] No. of intermolecular contacts: 86
[+] No. of charged-charged contacts: 5.0
[+] No. of charged-polar contacts: 10.0
[+] No. of charged-apolar contacts: 27.0
[+] No. of polar-polar contacts: 0.0
[+] No. of apolar-polar contacts: 20.0
[+] No. of apolar-apolar contacts: 24.0
[+] Percentage of apolar NIS residues: 34.10
[+] Percentage of charged NIS residues: 18.50
[++] Predicted binding affinity (kcal.mol-1): -14.7
[++] Predicted dissociation constant (M) at 25.0˚C: 1.6e-11
```

### Output Interpretation

| Metric | Description |
|--------|-------------|
| **Intermolecular contacts** | Total number of residue-residue contacts at interface |
| **Contact types** | Breakdown by residue character (charged/polar/apolar) |
| **NIS residues** | Composition of non-interacting surface |
| **Binding affinity (ΔG)** | Predicted free energy of binding (kcal/mol). More negative = stronger binding |
| **Dissociation constant (Kd)** | Predicted Kd at specified temperature. Lower = tighter binding |

### Binding Affinity Scale

| ΔG (kcal/mol) | Kd (M) | Binding Strength |
|---------------|--------|------------------|
| -6 to -8 | 10⁻⁵ to 10⁻⁶ | Moderate |
| -8 to -10 | 10⁻⁶ to 10⁻⁷ | Strong |
| -10 to -12 | 10⁻⁷ to 10⁻⁹ | Very Strong |
| < -12 | < 10⁻⁹ | Extremely Strong |

## Test Data

Download example protein complexes from the RCSB PDB:

```bash
# Create input directory
mkdir -p /mnt/OmicNAS/private/old/olamide/Prodigy/input

# Download test structures
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/3bzd.pdb https://files.rcsb.org/download/3BZD.pdb
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/2oob.pdb https://files.rcsb.org/download/2OOB.pdb
wget -O /mnt/OmicNAS/private/old/olamide/Prodigy/input/1ppe.pdb https://files.rcsb.org/download/1PPE.pdb
```

### Expected Results

| Structure | Description | Expected ΔG (kcal/mol) |
|-----------|-------------|------------------------|
| 3BZD | Protein-protein complex | -9.4 |
| 2OOB | Protein-protein complex | -6.2 |
| 1PPE | Trypsin-inhibitor complex | -14.7 |

## Pipeline Structure

```
prodigy_pipeline/
├── Dockerfile          # Docker image definition
├── main.nf             # Nextflow pipeline script
├── nextflow.config     # Pipeline configuration
├── params.json         # Parameter documentation
└── README.md           # This file
```

## Docker Image Details

The Docker image is based on Python 3.12 and includes:

- **prodigy-prot** (v2.4.0) - Main PRODIGY package
- **biopython** (≥1.80) - PDB structure parsing
- **freesasa** (≥2.2.1) - Solvent accessible surface area calculation
- **numpy** (≥2) - Numerical computations

### Building the Image

```bash
docker build -t prodigy:latest .
```

### Running Standalone

```bash
# Run PRODIGY directly
docker run --rm -v /path/to/data:/data prodigy:latest prodigy /data/complex.pdb

# Get help
docker run --rm prodigy:latest prodigy --help
```

## Troubleshooting

### Common Issues

**1. Docker Hub Rate Limit Error**

```
ERROR: toomanyrequests: You have reached your pull rate limit
```

Solution: Log in to Docker Hub with `docker login` or wait and retry.

**2. Structure Contains Gaps Warning**

```
[!] Structure contains gaps
```

This is informational, not an error. PRODIGY handles missing residues automatically.

**3. No Intermolecular Contacts Found**

- Verify the structure contains multiple chains
- Check chain selection parameters
- Ensure chains are in contact (within distance cutoff)

**4. Permission Denied Errors**

```bash
# Run with user permissions
docker run --rm -u $(id -u):$(id -g) -v /path/to/data:/data prodigy:latest prodigy /data/complex.pdb
```

### Getting Help

```bash
# PRODIGY help
docker run --rm prodigy:latest prodigy --help

# Nextflow pipeline help
nextflow run main.nf --help
```

## Citation

If you use this pipeline, please cite the following publications:

### PRODIGY Method

1. **Xue LC, Rodrigues JP, Kastritis PL, Bonvin AM, Vangone A.** (2016) PRODIGY: a web server for predicting the binding affinity of protein-protein complexes. *Bioinformatics*, 32(23):3676-3678. [DOI: 10.1093/bioinformatics/btw514](https://doi.org/10.1093/bioinformatics/btw514)
2. **Vangone A, Bonvin AM.** (2015) Contacts-based prediction of binding affinity in protein-protein complexes. *eLife*, 4:e07454. [DOI: 10.7554/eLife.07454](https://doi.org/10.7554/eLife.07454)
3. **Kastritis PL, Rodrigues JP, Folkers GE, Boelens R, Bonvin AM.** (2014) Proteins feel more than they see: Fine-tuning of binding affinity by properties of the non-interacting surface. *Journal of Molecular Biology*, 426(14):2632-2652. [DOI: 10.1016/j.jmb.2014.04.017](https://doi.org/10.1016/j.jmb.2014.04.017)

### Software Dependencies

- **Nextflow**: Di Tommaso P, et al. (2017) Nextflow enables reproducible computational workflows. *Nature Biotechnology*, 35:316-319.
- **Biopython**: Cock PJ, et al. (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. *Bioinformatics*, 25(11):1422-1423.
- **FreeSASA**: Mitternacht S. (2016) FreeSASA: An open source C library for solvent accessible surface area calculations. *F1000Research*, 5:189.

## License

This pipeline is distributed under the Apache License 2.0, consistent with the PRODIGY software license.

## Links

- **PRODIGY Web Server**: [https://wenmr.science.uu.nl/prodigy/](https://wenmr.science.uu.nl/prodigy/)
- **PRODIGY GitHub**: [https://github.com/haddocking/prodigy](https://github.com/haddocking/prodigy)
- **BonvinLab**: [https://www.bonvinlab.org/](https://www.bonvinlab.org/)
- **Nextflow**: [https://www.nextflow.io/](https://www.nextflow.io/)

## Support

For questions about:

- **PRODIGY method**: Contact the BonvinLab team at [ask.bioexcel.eu](https://ask.bioexcel.eu/)
- **This pipeline**: Open an issue in the repository

---

*Pipeline version: 2.4.0 | Last updated: January 2026*
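
## Example: Converting ΔG to Kd

The Kd reported next to the predicted ΔG follows from the standard thermodynamic relation ΔG = RT ln(Kd), which is why `--temperature` is described as a parameter of the Kd calculation. The sketch below reproduces that conversion outside the pipeline. It assumes the textbook gas constant R = 0.0019872 kcal/(mol·K); the helper name `dg_to_kd` is hypothetical and not part of the pipeline, and the constant PRODIGY uses internally may differ slightly.

```bash
#!/usr/bin/env bash
# Convert a binding free energy (kcal/mol) to a dissociation constant (M)
# using dG = RT ln(Kd), i.e. Kd = exp(dG / (R * T)).
# Assumption: R = 0.0019872 kcal/(mol*K); dg_to_kd is a hypothetical helper.
dg_to_kd() {
    local dg="$1"              # predicted dG in kcal/mol, e.g. -14.7
    local temp_c="${2:-25.0}"  # temperature in degrees Celsius (default 25.0)
    awk -v dg="$dg" -v t="$temp_c" 'BEGIN {
        R = 0.0019872                                  # gas constant, kcal/(mol*K)
        printf "%.1e\n", exp(dg / (R * (t + 273.15)))  # Celsius -> Kelvin
    }'
}

# Same dG evaluated at two temperatures
dg_to_kd -14.7 25.0
dg_to_kd -14.7 37.0
```

Because the results file prints ΔG rounded to one decimal place, a Kd recomputed this way can differ from the reported value in the last digit.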
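
## Example: Summarizing Batch Results

After a batch run, each structure has its own `_prodigy.txt` results file, so comparing predictions means opening every file. The sketch below gathers ΔG and Kd into one tab-separated table. It assumes the result files contain the `Predicted binding affinity` and `Predicted dissociation constant` lines shown under Example Output (runs made with `--quiet true` may not); `summarize_prodigy` is a hypothetical helper, not something the pipeline provides.

```bash
#!/usr/bin/env bash
# Collect dG and Kd from PRODIGY result files into a tab-separated table.
# Assumption: result files end in _prodigy.txt and use the output format
# shown in this README; summarize_prodigy is a hypothetical helper.
summarize_prodigy() {
    local outdir="$1" f
    printf 'structure\tdG_kcal_mol\tKd_M\n'
    for f in "$outdir"/*_prodigy.txt; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        awk -v name="$(basename "$f" _prodigy.txt)" '
            /Predicted binding affinity/      { dg = $NF }  # value is the last field
            /Predicted dissociation constant/ { kd = $NF }
            END { printf "%s\t%s\t%s\n", name, dg, kd }
        ' "$f"
    done
}

# Usage: summarize_prodigy /path/to/output > summary.tsv
```

The resulting table can be checked against the Expected Results values for the test structures.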
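
## Example: Listing Chains in a PDB File

Both the `--selection` parameter and the "No Intermolecular Contacts Found" troubleshooting steps depend on knowing which chain identifiers a structure contains. The sketch below lists them by reading column 22 of the `ATOM` records, where the fixed-width PDB format stores the chain ID. The helper name `list_chains` is hypothetical, and it handles only PDB-format files, not mmCIF.

```bash
#!/usr/bin/env bash
# List the distinct chain identifiers in a PDB-format file.
# The fixed-width PDB format stores the chain ID in column 22 of ATOM records.
# list_chains is a hypothetical helper; it does not parse mmCIF input.
list_chains() {
    awk '/^ATOM/ { print substr($0, 22, 1) }' "$1" | sort -u | paste -sd ' ' -
}

# Usage: list_chains /path/to/complex.pdb
```

If only one chain is printed, PRODIGY has no second molecule to compute an interface against, and any `--selection` value should name only chains that appear in this list.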