TCRBuilder2 is a GPU-accelerated deep learning model that predicts T-cell receptor (TCR) 3D structures directly from amino acid sequences. Trained specifically on TCR structural data, it reaches accuracy comparable to AlphaFold-Multimer (average CDR RMSD of roughly 1.4–2.0 Å) while generating structures in seconds, without large sequence databases or multiple sequence alignments (MSAs). Typical use cases include high-throughput structural annotation, antigen specificity analysis, and therapeutic TCR engineering.

Predict

Predict TCR structure from alpha and beta chain sequences with TCRBuilder2

python
from biolmai import BioLM
response = BioLM(
    entity="tcrbuilder2",
    action="predict",
    params={},
    items=[
      {
        "A": "AQSVTQPSHQVSLGQTVTLSCNYTSSDFQYWYRQNSGTLQLLLKYTAATLTKGINDFAAELKKSETSFHLTKPSAHMSDAAEYFCAVSEQDDKIIFGKGTRLHILP",
        "B": "ADVTQTPRNRITKTGKRIMLECSQTKGHDRMYWYRQDPGLGLRLIYYSFDVKDINKGEISDGYSVSRQAQAKFSLSLESAIPNQTALYFCATSDESYGYTFGSGTRLTVV"
      },
      {
        "A": "AQSVTQLGSHVSVSEGALVLLRCNYSSSVPPYLFWYVQYPNQGLQLLLKYTSAATLVKGINGFEAEFKKSETSFHLTKPSAHMSDAAEYFCAVQKNGQKLIFGKGTRLHILP",
        "B": "ADVTQTPRNLITKTGKRIMLQCSQTQGRDRMYWYRQDPGLGLRLIYYSLDVKDINKGEISDGYSVSRQAQAKFSLSLDSAIPNQTALYFCASSYLGSGNTGQLYYGYTFGSGTRLTVV"
      }
    ]
)
print(response)
bash
curl -X POST https://biolm.ai/api/v3/tcrbuilder2/predict/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "items": [
    {
      "A": "AQSVTQPSHQVSLGQTVTLSCNYTSSDFQYWYRQNSGTLQLLLKYTAATLTKGINDFAAELKKSETSFHLTKPSAHMSDAAEYFCAVSEQDDKIIFGKGTRLHILP",
      "B": "ADVTQTPRNRITKTGKRIMLECSQTKGHDRMYWYRQDPGLGLRLIYYSFDVKDINKGEISDGYSVSRQAQAKFSLSLESAIPNQTALYFCATSDESYGYTFGSGTRLTVV"
    },
    {
      "A": "AQSVTQLGSHVSVSEGALVLLRCNYSSSVPPYLFWYVQYPNQGLQLLLKYTSAATLVKGINGFEAEFKKSETSFHLTKPSAHMSDAAEYFCAVQKNGQKLIFGKGTRLHILP",
      "B": "ADVTQTPRNLITKTGKRIMLQCSQTQGRDRMYWYRQDPGLGLRLIYYSLDVKDINKGEISDGYSVSRQAQAKFSLSLDSAIPNQTALYFCASSYLGSGNTGQLYYGYTFGSGTRLTVV"
    }
  ]
}'
python
import requests

url = "https://biolm.ai/api/v3/tcrbuilder2/predict/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
      "items": [
        {
          "A": "AQSVTQPSHQVSLGQTVTLSCNYTSSDFQYWYRQNSGTLQLLLKYTAATLTKGINDFAAELKKSETSFHLTKPSAHMSDAAEYFCAVSEQDDKIIFGKGTRLHILP",
          "B": "ADVTQTPRNRITKTGKRIMLECSQTKGHDRMYWYRQDPGLGLRLIYYSFDVKDINKGEISDGYSVSRQAQAKFSLSLESAIPNQTALYFCATSDESYGYTFGSGTRLTVV"
        },
        {
          "A": "AQSVTQLGSHVSVSEGALVLLRCNYSSSVPPYLFWYVQYPNQGLQLLLKYTSAATLVKGINGFEAEFKKSETSFHLTKPSAHMSDAAEYFCAVQKNGQKLIFGKGTRLHILP",
          "B": "ADVTQTPRNLITKTGKRIMLQCSQTQGRDRMYWYRQDPGLGLRLIYYSLDVKDINKGEISDGYSVSRQAQAKFSLSLDSAIPNQTALYFCASSYLGSGNTGQLYYGYTFGSGTRLTVV"
        }
      ]
    }

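# Send the request and fail early on HTTP-level errors before parsing the body
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json())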
r
library(httr)

url <- "https://biolm.ai/api/v3/tcrbuilder2/predict/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  items = list(
    list(
      A = "AQSVTQPSHQVSLGQTVTLSCNYTSSDFQYWYRQNSGTLQLLLKYTAATLTKGINDFAAELKKSETSFHLTKPSAHMSDAAEYFCAVSEQDDKIIFGKGTRLHILP",
      B = "ADVTQTPRNRITKTGKRIMLECSQTKGHDRMYWYRQDPGLGLRLIYYSFDVKDINKGEISDGYSVSRQAQAKFSLSLESAIPNQTALYFCATSDESYGYTFGSGTRLTVV"
    ),
    list(
      A = "AQSVTQLGSHVSVSEGALVLLRCNYSSSVPPYLFWYVQYPNQGLQLLLKYTSAATLVKGINGFEAEFKKSETSFHLTKPSAHMSDAAEYFCAVQKNGQKLIFGKGTRLHILP",
      B = "ADVTQTPRNLITKTGKRIMLQCSQTQGRDRMYWYRQDPGLGLRLIYYSLDVKDINKGEISDGYSVSRQAQAKFSLSLDSAIPNQTALYFCASSYLGSGNTGQLYYGYTFGSGTRLTVV"
    )
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))
POST /api/v3/tcrbuilder2/predict/

Predict endpoint for TCRBuilder2.

Request

  • params (object, optional) — Configuration parameters:

    • batch_size (int, default: 8) — Number of sequences processed per batch (maximum: 8)

    • max_sequence_len (int, default: 2048) — Maximum allowed length of input sequences (maximum: 2048)

  • items (array of objects, min: 1, max: 8) — Input sequences:

    • H (string, optional, min length: 1, max length: 2048) — Amino acid sequence of antibody heavy chain or nanobody

    • L (string, optional, min length: 1, max length: 2048) — Amino acid sequence of antibody light chain

    • A (string, optional, min length: 1, max length: 2048) — Amino acid sequence of T-cell receptor alpha chain

    • B (string, optional, min length: 1, max length: 2048) — Amino acid sequence of T-cell receptor beta chain

    Valid combinations of fields within each item (exactly one combination required):

    • H and L provided — Antibody (ABodyBuilder2)

    • H provided alone — Nanobody (NanoBodyBuilder2)

    • A and B provided — T-cell receptor (TCRBuilder2, TCRBuilder2PLUS)

  • items (array of objects, min: 1, max: 8) — Input sequences for NanoBodyBuilder2:

    • H (string, required, min length: 1, max length: 2048) — Amino acid sequence of nanobody

  • items (array of objects, min: 1, max: 8) — Input sequences for ABodyBuilder2:

    • H (string, required, min length: 1, max length: 2048) — Amino acid sequence of antibody heavy chain

    • L (string, required, min length: 1, max length: 2048) — Amino acid sequence of antibody light chain

  • items (array of objects, min: 1, max: 8) — Input sequences for TCRBuilder2:

    • A (string, required, min length: 1, max length: 2048) — Amino acid sequence of T-cell receptor alpha chain

    • B (string, required, min length: 1, max length: 2048) — Amino acid sequence of T-cell receptor beta chain
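
The constraints above can be checked on the client before a request is sent. The following is a minimal sketch in plain Python, assuming only the limits listed in the schema (1 to 8 items, chains A and B, 1 to 2048 residues each). The function name `validate_tcr_items` and the message wording are hypothetical and not part of the BioLM client.

```python
MAX_ITEMS = 8          # per-request item limit taken from the schema above
MAX_CHAIN_LEN = 2048   # per-chain length limit taken from the schema above


def validate_tcr_items(items):
    """Return a list of problems found; an empty list means the batch looks valid."""
    problems = []
    if not 1 <= len(items) <= MAX_ITEMS:
        problems.append(f"expected 1-{MAX_ITEMS} items, got {len(items)}")
    for i, item in enumerate(items):
        # A TCRBuilder2 item carries exactly the alpha (A) and beta (B) chains.
        if set(item) != {"A", "B"}:
            problems.append(f"item {i}: expected keys A and B, got {sorted(item)}")
            continue
        for chain in ("A", "B"):
            seq = item[chain]
            if not isinstance(seq, str) or not 1 <= len(seq) <= MAX_CHAIN_LEN:
                problems.append(
                    f"item {i}: chain {chain} must be a string of 1-{MAX_CHAIN_LEN} residues"
                )
    return problems
```

Running this check first lets a pipeline reject malformed batches locally instead of spending a request on them. The server remains the authority on what it accepts.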

Example request:

http
POST /api/v3/tcrbuilder2/predict/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "items": [
    {
      "A": "AQSVTQPSHQVSLGQTVTLSCNYTSSDFQYWYRQNSGTLQLLLKYTAATLTKGINDFAAELKKSETSFHLTKPSAHMSDAAEYFCAVSEQDDKIIFGKGTRLHILP",
      "B": "ADVTQTPRNRITKTGKRIMLECSQTKGHDRMYWYRQDPGLGLRLIYYSFDVKDINKGEISDGYSVSRQAQAKFSLSLESAIPNQTALYFCATSDESYGYTFGSGTRLTVV"
    },
    {
      "A": "AQSVTQLGSHVSVSEGALVLLRCNYSSSVPPYLFWYVQYPNQGLQLLLKYTSAATLVKGINGFEAEFKKSETSFHLTKPSAHMSDAAEYFCAVQKNGQKLIFGKGTRLHILP",
      "B": "ADVTQTPRNLITKTGKRIMLQCSQTQGRDRMYWYRQDPGLGLRLIYYSLDVKDINKGEISDGYSVSRQAQAKFSLSLDSAIPNQTALYFCASSYLGSGNTGQLYYGYTFGSGTRLTVV"
    }
  ]
}

Response

  • results (array of objects) — One result per input item, in the order requested:

    • pdb (string) — Predicted structure in standard PDB format, with atomic coordinates in Angstroms (Å) for backbone and side-chain atoms. Structures are generated by the ImmuneBuilder deep-learning models (ABodyBuilder2, NanoBodyBuilder2, or TCRBuilder2) with accuracy comparable to AlphaFold-Multimer; typical backbone RMSD for CDR loops ranges from approximately 0.4 Å to 3.5 Å, depending on loop type and model variant. Each structure is stereochemically refined to remove clashes, cis-peptide bonds, and nonphysical bond lengths.
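
Because each result holds the structure as a PDB-formatted string, a common next step is to write it to disk and run a quick sanity check. The sketch below assumes the parsed response is a dict with a `results` list whose entries have a `pdb` key, as described above. The file naming scheme and both helper names are hypothetical.

```python
from pathlib import Path


def save_structures(response_json, out_dir):
    """Write each predicted structure to <out_dir>/tcr_<index>.pdb and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, result in enumerate(response_json["results"]):
        path = out_dir / f"tcr_{i}.pdb"  # hypothetical naming scheme
        path.write_text(result["pdb"])
        paths.append(path)
    return paths


def count_ca_atoms(pdb_text):
    """Count C-alpha atoms (one per residue) using the fixed-column ATOM record layout."""
    return sum(
        1
        for line in pdb_text.splitlines()
        # Columns 13-16 of an ATOM record hold the atom name.
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    )
```

Comparing the C-alpha count with the combined length of the submitted A and B chains is a cheap way to spot a truncated or mismatched structure.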

Example response (error case; the body reports a model mismatch rather than a prediction):

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "error": true,
  "status_code": 500,
  "message": "{\"error\":\"Uncaught exception: Mismatch detected: expected 'nanobodybuilder2' but got 'tcrbuilder2' and  'tcrbuilder2plus' in request\",\"status_code\":500}"
}
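
The example above shows that an error can be reported inside the JSON body, through the `error`, `status_code`, and `message` fields. A caller may therefore want to inspect the body as well as the HTTP status. This is a minimal sketch under that assumption: `PredictionError` and `extract_results` are hypothetical names, and the only fields relied on are the ones that appear in this document.

```python
class PredictionError(RuntimeError):
    """Raised when the response body reports an error (hypothetical helper class)."""


def extract_results(body):
    """Return body["results"], or raise if the JSON body carries an error flag."""
    if body.get("error"):
        raise PredictionError(
            f"status {body.get('status_code')}: {body.get('message')}"
        )
    return body["results"]
```

For example, `extract_results(response.json())` would raise on the payload shown above and return the list of structures on a successful prediction.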

Performance

  • TCRBuilder2 is a specialized deep-learning model optimized for rapid and accurate prediction of T-cell receptor (TCR) structures from amino acid sequences.

  • Predictive accuracy for TCRBuilder2 is on par with general-purpose protein structure predictors such as AlphaFold-Multimer, including on TCR-specific complementarity-determining regions (CDRs):

    • TCRBuilder2 achieves mean backbone RMSD values of 1.85 Å and 1.93 Å for the challenging CDR-α3 and CDR-β3 loops, respectively, closely matching AlphaFold-Multimer (1.84 Å and 1.94 Å), but with dramatically faster inference speeds.

    • Compared to the previous-generation TCRBuilder, TCRBuilder2 reduces backbone RMSD error by approximately 1.0 Å across critical CDR loops, demonstrating substantial accuracy improvements.

  • TCRBuilder2 predictions are computationally lightweight and highly optimized, enabling GPU-accelerated inference:

    • Typical inference time is approximately 5 seconds per TCR structure prediction on a single NVIDIA Tesla P100 GPU, making it over 100 times faster than AlphaFold-Multimer, which typically requires ~30 minutes per structure even with GPU acceleration and optimized sequence alignment (e.g., MMseqs2).

  • Unlike AlphaFold-Multimer, TCRBuilder2 does not require extensive multiple-sequence alignment (MSA) databases, significantly reducing computational overhead and storage requirements.

  • TCRBuilder2 generates physically plausible structures with negligible stereochemical violations, comparable to experimentally determined crystal structures:

    • Structural refinement via restrained energy minimization ensures accurate bond lengths, angles, and chirality, resulting in zero observed cis-peptide bonds, D-amino acids, or steric clashes.

  • TCRBuilder2 outputs are provided in standard Protein Data Bank (PDB) format, facilitating immediate downstream analysis and integration into structural bioinformatics pipelines.

  • BioLM’s optimized deployment ensures consistent, scalable GPU-based inference, enabling high-throughput TCR structural predictions suitable for large-scale analysis of next-generation sequencing datasets.
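
The RMSD figures quoted above are root-mean-square deviations between matched atoms, which is the square root of the mean squared distance over N atom pairs. As a rough illustration, the sketch below computes that quantity for two coordinate lists that are assumed to be already superposed and listed in the same atom order. The superposition step that benchmark protocols usually apply first is omitted, so this is not the evaluation procedure behind the numbers above.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

With coordinates in Angstroms, the result is also in Angstroms and can be read on the same scale as the loop RMSD values reported in this section.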

Applications

  • Predicting T-cell receptor (TCR) structures for engineered T-cell therapies, enabling rapid screening and optimization of TCR candidates for improved antigen specificity and binding affinity; valuable for biotech companies developing personalized cancer immunotherapies but not optimal for modeling non-immune proteins.

  • Structural modeling of TCR-antigen interactions to identify key residues involved in binding, enabling rational design and affinity maturation of therapeutic TCRs; useful for biotech firms aiming to improve efficacy and reduce off-target effects, though predictions may require experimental validation for highly flexible binding sites.

  • High-throughput structural analysis of TCR repertoires from next-generation sequencing data, enabling biotech researchers to rapidly identify structurally distinct TCR clusters associated with antigen specificity or disease states; valuable for biomarker discovery and immune profiling, but not intended for modeling full-length membrane-bound TCR complexes.

  • Computational assessment of TCR stability and manufacturability by predicting structural liabilities such as aggregation-prone regions or unstable loops, enabling early-stage filtering of therapeutic TCR candidates; beneficial for companies aiming to streamline development pipelines, though not a substitute for experimental stability assays.

  • Estimating prediction uncertainty from TCR structural ensembles, enabling researchers to prioritize high-confidence candidates for downstream experimental characterization; particularly useful in industrial settings where experimental validation resources are limited. This API returns a single structure per input pair (see Limitations), and ensemble predictions may underestimate the flexibility of highly dynamic regions.

Limitations

  • Maximum Sequence Length: Input sequences for TCRBuilder2 are limited to 2048 amino acids per chain (A and B). Longer sequences must be truncated or split before submission.

  • Batch Size: Each API request can contain a maximum of 8 sequence pairs. For larger datasets, divide your data into batches of 8 or fewer items per request.

  • TCRBuilder2 is specifically optimized for T-cell receptor (TCR) structures. It will not accurately predict antibody or nanobody structures. For these, use abodybuilder2 or nanobodybuilder2 models instead.

  • While TCRBuilder2 provides high accuracy comparable to AlphaFold-Multimer for TCR structures, it may perform worse for non-standard or heavily mutated sequences that differ significantly from typical TCR training data.

  • TCRBuilder2 generates a single predicted structure per input sequence pair. If exploring multiple alternative conformations is critical for your use case, consider using AlphaFold-Multimer despite its higher computational cost.

  • The model returns a PDB-formatted structure (pdb). It does not provide embeddings, per-residue confidence scores, or residue-level uncertainty metrics. If these features are required, you may need to utilize alternative models or additional downstream analysis tools.
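
The batch size limit above means larger datasets have to be split on the client side. One way to do that, sketched here in plain Python with a hypothetical helper name, is to slice the list of sequence pairs into consecutive groups of at most 8:

```python
def chunk_items(items, batch_size=8):
    """Yield consecutive slices of `items`, each holding at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Each yielded slice can then be sent as the `items` array of its own request, in the same shape as the examples in the Predict section. The original order is preserved, so results can be matched back to inputs by position.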

How We Use It

BioLM integrates TCRBuilder2 into protein engineering workflows to rapidly predict accurate structural models of T-cell receptors (TCRs), enabling efficient screening and selection of therapeutically promising TCR candidates. By combining TCRBuilder2’s fast, high-accuracy predictions with BioLM’s predictive modeling, generative AI, and biophysical property assessments, research teams can quickly identify and prioritize TCR designs with optimal binding characteristics, stability, and manufacturability.

  • Accelerates iterative cycles of TCR optimization by rapidly generating structural insights.

  • Integrates seamlessly with downstream predictive models and biophysical screening tools to streamline therapeutic candidate selection.
