AlphaFold2 is a GPU-accelerated neural network for predicting protein 3D structures from amino acid sequences with accuracy close to that of experimental methods (median backbone accuracy of 0.96 Å Cα r.m.s.d.95 on CASP14). It uses multiple sequence alignments (MSAs) and structural templates to produce full-length predictions along with confidence metrics (per-residue pLDDT and global pTM). Through the scalable API provided by BioLM, AlphaFold2 supports large-scale structure annotation, protein engineering, and structural biology research.

Predict

Predicts 3D structure for a protein sequence

python
from biolmai import BioLM
response = BioLM(
    entity="alphafold2",
    action="predict",
    params={
      "databases": [
        "mgnify",
        "small_bfd",
        "uniref90"
      ],
      "predictions_per_model": 1,
      "relax": "none",
      "return_templates": True,
      "msa_iterations": 1,
      "max_msa_sequences": 1000,
      "algorithm": "mmseqs2"
    },
    items=[
      {
        "sequence": "MAAAAAAGAGPEMVRGQVFDVGPR"
      }
    ]
)
print(response)

bash
curl -X POST https://biolm.ai/api/v3/alphafold2/predict/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "items": [
    {
      "sequence": "MAAAAAAGAGPEMVRGQVFDVGPR"
    }
  ],
  "params": {
    "databases": [
      "mgnify",
      "small_bfd",
      "uniref90"
    ],
    "predictions_per_model": 1,
    "relax": "none",
    "return_templates": true,
    "msa_iterations": 1,
    "max_msa_sequences": 1000,
    "algorithm": "mmseqs2"
  }
}'

python
import requests

url = "https://biolm.ai/api/v3/alphafold2/predict/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
      "items": [
        {
          "sequence": "MAAAAAAGAGPEMVRGQVFDVGPR"
        }
      ],
      "params": {
        "databases": [
          "mgnify",
          "small_bfd",
          "uniref90"
        ],
        "predictions_per_model": 1,
        "relax": "none",
        "return_templates": True,
        "msa_iterations": 1,
        "max_msa_sequences": 1000,
        "algorithm": "mmseqs2"
      }
    }

response = requests.post(url, headers=headers, json=payload)
print(response.json())

r
library(httr)

url <- "https://biolm.ai/api/v3/alphafold2/predict/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  items = list(
    list(
      sequence = "MAAAAAAGAGPEMVRGQVFDVGPR"
    )
  ),
  params = list(
    databases = list(
      "mgnify",
      "small_bfd",
      "uniref90"
    ),
    predictions_per_model = 1,
    relax = "none",
    return_templates = TRUE,
    msa_iterations = 1,
    max_msa_sequences = 1000,
    algorithm = "mmseqs2"
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))

POST /api/v3/alphafold2/predict/

Predict endpoint for AlphaFold2.

Request Headers:

  • Authorization: Token YOUR_API_KEY

  • Content-Type: application/json

Request

  • params (object)

    • databases (array of strings, default: [“mgnify”, “small_bfd”, “uniref90”]) — Allowed values: “small_bfd”, “mgnify”, “uniref90”

    • predictions_per_model (integer, default: 1, range: 1–8) — Number of predictions to generate per model

    • relax (string, default: “none”) — Allowed values: “all”, “best”, “none”

    • return_templates (boolean, default: true) — Whether to include template data in the output

    • msa_iterations (integer, default: 1, range: 1–5) — Number of MSA refinement iterations

    • max_msa_sequences (integer, optional, default: null, range: 1–4000) — Maximum number of MSA sequences

    • algorithm (string, default: “mmseqs2”) — Allowed value: “mmseqs2”

  • items (array of objects, min: 1, max: 1)

    • sequence (string, min length: 1, max length: 512) — Protein sequence with extended amino acid validation
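
The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the BioLM SDK: the function name `validate_predict_request` and its error messages are hypothetical, and the limits are copied from the parameter list above.

```python
# Hypothetical client-side check; limits are taken from the documented schema.
ALLOWED_DATABASES = {"small_bfd", "mgnify", "uniref90"}
ALLOWED_RELAX = {"all", "best", "none"}


def validate_predict_request(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload
    satisfies the documented constraints."""
    problems = []

    # items: exactly one object, sequence length 1-512
    items = payload.get("items", [])
    if len(items) != 1:
        problems.append("items must contain exactly 1 object")
    for item in items:
        if not 1 <= len(item.get("sequence", "")) <= 512:
            problems.append("sequence length must be between 1 and 512")

    # params: missing keys fall back to the documented defaults
    params = payload.get("params", {})
    unknown = set(params.get("databases", [])) - ALLOWED_DATABASES
    if unknown:
        problems.append(f"unknown databases: {sorted(unknown)}")
    if not 1 <= params.get("predictions_per_model", 1) <= 8:
        problems.append("predictions_per_model must be between 1 and 8")
    if params.get("relax", "none") not in ALLOWED_RELAX:
        problems.append("relax must be one of: all, best, none")
    if not 1 <= params.get("msa_iterations", 1) <= 5:
        problems.append("msa_iterations must be between 1 and 5")
    max_msa = params.get("max_msa_sequences")
    if max_msa is not None and not 1 <= max_msa <= 4000:
        problems.append("max_msa_sequences must be between 1 and 4000")
    if params.get("algorithm", "mmseqs2") != "mmseqs2":
        problems.append("algorithm must be mmseqs2")

    return problems
```

Calling the function on a payload before posting it gives a readable list of violations instead of a server-side validation error. The server remains the authority on what is accepted.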

Example request:

http
POST /api/v3/alphafold2/predict/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "items": [
    {
      "sequence": "MAAAAAAGAGPEMVRGQVFDVGPR"
    }
  ],
  "params": {
    "databases": [
      "mgnify",
      "small_bfd",
      "uniref90"
    ],
    "predictions_per_model": 1,
    "relax": "none",
    "return_templates": true,
    "msa_iterations": 1,
    "max_msa_sequences": 1000,
    "algorithm": "mmseqs2"
  }
}

Response

  • results (array of objects) — One result per input item, in the order requested:

    • pdbs (array of strings, size: 1–8) — Predicted protein structures in PDB format

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "pdbs": [
        "ATOM      1  N   MET A   1      24.392  -8.460 -14.399  1.00 47.22           N  \nATOM      2  CA  MET A   1      24.158  -8.618 -12.966  1.00 47.22           C  \nATOM      3  C   MET A   1      23.267... (truncated for documentation)",
        "ATOM      1  N   MET A   1      16.309  -0.252 -21.014  1.00 46.71           N  \nATOM      2  CA  MET A   1      15.302  -1.265 -20.707  1.00 46.71           C  \nATOM      3  C   MET A   1      14.836... (truncated for documentation)",
        "... (truncated for documentation)"
      ]
    }
  ]
}
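
AlphaFold2 conventionally writes per-residue pLDDT into the B-factor column of its PDB output, and the repeated 47.22 and 46.71 values in the example above are consistent with that. Assuming the same holds for the strings in `pdbs`, the sketch below reads the scores back out. The helper names are hypothetical and not part of the API.

```python
def plddt_per_residue(pdb_text: str) -> dict[int, float]:
    """Map residue number to the B-factor value of its C-alpha atom.

    Assumes the B-factor column carries pLDDT, as in standard AlphaFold2 output.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-column PDB layout: atom name in columns 13-16,
        # residue number in columns 23-26, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def mean_plddt(pdb_text: str) -> float:
    """Average the per-residue values to get one score per structure."""
    scores = plddt_per_residue(pdb_text)
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return sum(scores.values()) / len(scores)
```

With the `requests` example above, the structures would be available as `response.json()["results"][0]["pdbs"]`, and `max(pdbs, key=mean_plddt)` would pick the structure with the highest average score.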

Encode

Generates MSAs for a protein sequence. The output is compatible with AlphaFold2, Chai1, and other models.

python
from biolmai import BioLM
response = BioLM(
    entity="alphafold2",
    action="encode",
    params={
      "databases": [
        "mgnify",
        "small_bfd",
        "uniref90"
      ],
      "return_templates": False,
      "msa_iterations": 2,
      "max_msa_sequences": 1000,
      "algorithm": "mmseqs2"
    },
    items=[
      {
        "sequence": "YDNVNKVRVAIKKISPFEHQGQVDVTYAMK"
      }
    ]
)
print(response)

bash
curl -X POST https://biolm.ai/api/v3/alphafold2/encode/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "items": [
    {
      "sequence": "YDNVNKVRVAIKKISPFEHQGQVDVTYAMK"
    }
  ],
  "params": {
    "databases": [
      "mgnify",
      "small_bfd",
      "uniref90"
    ],
    "return_templates": false,
    "msa_iterations": 2,
    "max_msa_sequences": 1000,
    "algorithm": "mmseqs2"
  }
}'

python
import requests

url = "https://biolm.ai/api/v3/alphafold2/encode/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
      "items": [
        {
          "sequence": "YDNVNKVRVAIKKISPFEHQGQVDVTYAMK"
        }
      ],
      "params": {
        "databases": [
          "mgnify",
          "small_bfd",
          "uniref90"
        ],
        "return_templates": False,
        "msa_iterations": 2,
        "max_msa_sequences": 1000,
        "algorithm": "mmseqs2"
      }
    }

response = requests.post(url, headers=headers, json=payload)
print(response.json())

r
library(httr)

url <- "https://biolm.ai/api/v3/alphafold2/encode/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  items = list(
    list(
      sequence = "YDNVNKVRVAIKKISPFEHQGQVDVTYAMK"
    )
  ),
  params = list(
    databases = list(
      "mgnify",
      "small_bfd",
      "uniref90"
    ),
    return_templates = FALSE,
    msa_iterations = 2,
    max_msa_sequences = 1000,
    algorithm = "mmseqs2"
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))

POST /api/v3/alphafold2/encode/

Encode endpoint for AlphaFold2.

Request Headers:

  • Authorization: Token YOUR_API_KEY

  • Content-Type: application/json

Request

  • params (object) — Configuration parameters:

    • databases (array of strings, default: [“mgnify”, “small_bfd”, “uniref90”], possible values: “small_bfd”, “mgnify”, “uniref90”) — List of databases

    • return_templates (boolean, default: true) — Whether to include template information

    • msa_iterations (integer, range: 1..5, default: 1) — Number of MSA search iterations

    • max_msa_sequences (integer, range: 1..4000, optional) — Maximum number of MSA sequences

    • algorithm (string, default: “mmseqs2”, possible values: “mmseqs2”) — MSA search algorithm

  • items (array of objects, min items: 1, max items: 1) — Input data:

    • sequence (string, min length: 1, max length: 512, required) — Sequence data

Example request:

http
POST /api/v3/alphafold2/encode/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "items": [
    {
      "sequence": "YDNVNKVRVAIKKISPFEHQGQVDVTYAMK"
    }
  ],
  "params": {
    "databases": [
      "mgnify",
      "small_bfd",
      "uniref90"
    ],
    "return_templates": false,
    "msa_iterations": 2,
    "max_msa_sequences": 1000,
    "algorithm": "mmseqs2"
  }
}

Response

  • results (array of objects) ― One result per input item, in the order requested:

    • alignments (object) ― Contains alignment data

      • small_bfd (array of strings, optional) ― Aligned sequences from the small_bfd database

      • mgnify (array of strings, optional) ― Aligned sequences from the mgnify database

      • uniref90 (array of strings, optional) ― Aligned sequences from the uniref90 database

    • templates (array of objects, optional) ― Contains template hit data

      • index (integer) ― Template index

      • name (string) ― Template name

      • aligned_cols (integer) ― Number of aligned columns

      • sum_probs (float) ― Summation of alignment probabilities

      • query (string) ― Query subsequence

      • hit_sequence (string) ― Template subsequence

      • indices_query (array of integers) ― Query residue positions

      • indices_hit (array of integers) ― Template residue positions

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "alignments": {
        "small_bfd": [
          "small_bfd",
          ">DNHYJGOIDVCTWIW\nYDNVNKVRVAIKKISPFEHQGQVDVTYAMK\n",
          "a3m"
        ],
        "mgnify": [
          "mgnify",
          ">CZPVQDXKUAWVQDB\nYDNVNKVRVAIKKISPFEHQGQVDVTYAMK\n",
          "a3m"
        ],
        "uniref90": [
          "uniref90",
          ">GCAAJQDKIOCJTJR\nYDNVNKVRVAIKKISPFEHQGQVDVTYAMK\n",
          "a3m"
        ]
      },
      "templates": []
    }
  ]
}
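
In the example response, each entry under `alignments` is a three-element array holding the database name, the alignment text, and the format label "a3m". Assuming that layout holds in general, the sketch below parses the alignment text and counts sequences per database, which is useful for judging MSA depth before running a prediction. The helper names are hypothetical.

```python
def parse_a3m(a3m_text: str) -> list[tuple[str, str]]:
    """Split A3M/FASTA-style text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in a3m_text.splitlines():
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif line.strip() and header is not None:
            chunks.append(line.strip())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def msa_depths(result: dict) -> dict[str, int]:
    """Count aligned sequences per database for one entry of `results`.

    Assumes each alignments value is [database name, A3M text, format label],
    as in the example response; this layout is not stated in the schema.
    """
    depths = {}
    for db, entry in result.get("alignments", {}).items():
        a3m_text = entry[1]  # assumed position of the A3M text
        depths[db] = len(parse_a3m(a3m_text))
    return depths
```

Applied to the example response, `msa_depths` reports one sequence per database, which reflects the short illustrative output rather than a realistic alignment.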

Performance

  • AlphaFold2 provides highly accurate protein structure predictions, significantly outperforming other structure prediction models such as ESMFold and ABodyBuilder3 pLDDT in terms of atomic-level accuracy, particularly for complex proteins and novel folds.

  • On CASP14, AlphaFold2 achieves a median backbone accuracy of approximately 0.96 Å RMSD95 (Cα root-mean-square deviation at 95% residue coverage), compared to around 2.8 Å RMSD95 for the next best-performing method.

  • AlphaFold2 produces more accurate side-chain conformations than ESMFold and ABodyBuilder3 pLDDT, especially when backbone predictions are highly accurate, with median all-atom accuracy of approximately 1.5 Å RMSD95.

  • While AlphaFold2 is more accurate than ESMFold, it is computationally more demanding, requiring significantly more GPU resources and processing time per prediction.

  • AlphaFold2 is optimized by BioLM for GPU-accelerated inference, leveraging NVIDIA A100 GPUs with 80GB VRAM, typically utilizing three GPUs per prediction task.

  • The BioLM deployment of AlphaFold2 uses MMseqs2 for multiple sequence alignment (MSA) generation, which is faster than Jackhmmer and reduces the overall time required for predictions.

  • The accuracy of AlphaFold2 predictions depends strongly on the depth of the multiple sequence alignment (MSA): accuracy drops substantially when the median alignment depth is below approximately 30 sequences, and gains are small beyond approximately 100 sequences.

  • AlphaFold2 performs less effectively than specialized models such as NanoBodyBuilder2 when predicting nanobody structures, as NanoBodyBuilder2 is specifically optimized for the unique structural characteristics of nanobodies.

  • AlphaFold2’s accuracy decreases for proteins with many heterotypic (cross-chain) interactions, whereas it excels at predicting structures of single-chain proteins and homomeric complexes with extensive intra-chain or homotypic contacts.

  • BioLM’s deployment retains AlphaFold2’s core architectural features, including iterative recycling and equivariant attention mechanisms, which support reliable predictions across diverse protein families.

Applications

  • Rapid prediction of protein structures directly from amino acid sequences, enabling faster iteration cycles in protein engineering projects by eliminating the need for time-intensive experimental structure determination.

  • High-resolution modeling of protein backbones and side-chains to inform rational mutagenesis strategies, improving the accuracy of protein stability predictions and functional site identification for commercial enzyme optimization.

  • Accurate structural characterization of novel protein scaffolds or domains lacking homologous templates, supporting design workflows for synthetic biology applications such as biosensors or protein-based materials.

  • In silico identification of structurally stable protein variants to prioritize candidates for experimental screening, significantly reducing laboratory resource requirements and accelerating timelines in protein therapeutic development.

  • Reliable prediction of protein-protein interaction interfaces to guide engineering of fusion proteins or multi-domain constructs, though predictions may be less accurate for proteins relying heavily on heteromeric interactions or external cofactors for structural stability.

Limitations

  • Maximum Sequence Length: The AlphaFold2 API accepts protein sequences up to 512 amino acids. Longer sequences must be truncated or split into smaller segments.

  • Batch Size: The API supports a maximum batch size of 1 sequence per request. For multiple sequences, submit separate requests.

  • MSA Depth and Quality: Prediction accuracy depends heavily on the quality and depth of the multiple sequence alignment (MSA). Accuracy significantly decreases if the median alignment depth is below approximately 30 sequences. The API allows up to 4000 sequences in the MSA; however, shallow alignments may yield less accurate predictions.

  • Protein Complexes and Inter-chain Contacts: AlphaFold2 is optimized for predicting single-chain protein structures or homomeric complexes. It performs poorly on proteins whose structure is primarily defined by interactions with other distinct chains (heteromeric complexes). For proteins with extensive heterotypic contacts, consider alternative modeling approaches.

  • Computational Cost and Runtime: AlphaFold2 is computationally intensive, especially for sequences approaching the maximum length limit. Prediction runtimes may be several hours, particularly with increased msa_iterations or when requesting structural relaxation (relax set to all or best).

  • Structural Relaxation: Structural relaxation (relax parameter) can improve stereochemical quality but does not typically enhance prediction accuracy. It introduces additional computational overhead and may not be necessary for all use cases.

How We Use It

BioLM integrates AlphaFold2 predictions into our protein engineering workflows to accelerate molecular design and optimization, providing accurate structural insights that inform downstream predictive modeling and generative design tasks. By leveraging AlphaFold2’s atomic-level predictions, we enable rapid assessment of structural hypotheses, enhance selection of promising candidates, and significantly reduce experimental iterations. The standardized API facilitates seamless integration with our predictive and generative models, embedding analysis of predicted structures into automated ranking, filtering, and optimization pipelines.

  • Provides structural context to refine predictive models and improve hit rates for engineered proteins.

  • Integrates seamlessly with generative AI workflows and structure-based property predictions, reducing time-to-market for designed molecules.

References

  • Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P., & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596, 583–589.