Sadie Antibody is built on SADIE (Sequencing Analysis and Data library for Immunoinformatics Exploration). It provides AIRR-compliant annotation of antibody sequences, identifying CDRs, framework regions, somatic hypermutation rates, and V(D)J segment usage directly from raw sequence data. The API returns structured AIRR tables that feed downstream clustering, lineage analysis, and antibody engineering workflows. Typical use cases include antibody discovery, repertoire analysis, and bioinformatics pipelines for therapeutic antibody optimization.

Predict

Predict properties or scores for input sequences

python
from biolmai import BioLM
response = BioLM(
    entity="sadie-antibody",
    action="predict",
    params={
      "region_assign": "kabat",
      "scheme": "chothia",
      "scfv": False,
      "allowed_chain": [
        "H",
        "L"
      ]
    },
    items=[
      {
        "sequence": "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYAMHWVRQAPGQGLEWMGWINAGNGNTKYSQKFQGRVTITRDTSASTAYMELSSLRSEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"
      }
    ]
)
print(response)

bash
curl -X POST https://biolm.ai/api/v3/sadie-antibody/predict/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "params": {
    "region_assign": "kabat",
    "scheme": "chothia",
    "scfv": false,
    "allowed_chain": [
      "H",
      "L"
    ]
  },
  "items": [
    {
      "sequence": "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYAMHWVRQAPGQGLEWMGWINAGNGNTKYSQKFQGRVTITRDTSASTAYMELSSLRSEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"
    }
  ]
}'

python
import requests

url = "https://biolm.ai/api/v3/sadie-antibody/predict/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
      "params": {
        "region_assign": "kabat",
        "scheme": "chothia",
        "scfv": False,
        "allowed_chain": [
          "H",
          "L"
        ]
      },
      "items": [
        {
          "sequence": "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYAMHWVRQAPGQGLEWMGWINAGNGNTKYSQKFQGRVTITRDTSASTAYMELSSLRSEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"
        }
      ]
    }

response = requests.post(url, headers=headers, json=payload)
print(response.json())

r
library(httr)

url <- "https://biolm.ai/api/v3/sadie-antibody/predict/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  params = list(
    region_assign = "kabat",
    scheme = "chothia",
    scfv = FALSE,
    allowed_chain = list(
      "H",
      "L"
    )
  ),
  items = list(
    list(
      sequence = "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYAMHWVRQAPGQGLEWMGWINAGNGNTKYSQKFQGRVTITRDTSASTAYMELSSLRSEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"
    )
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))

POST /api/v3/sadie-antibody/predict/

Predict endpoint for Sadie Antibody.

Request

  • params (object, required) — Configuration parameters:

    • region_assign (enum: [imgt, kabat, chothia, abm, contact, scdr], default: “imgt”, optional) — Region definition

    • scheme (enum: [imgt, kabat, chothia], default: “chothia”, optional) — Numbering scheme

    • scfv (boolean, default: false) — Whether to allow single-chain Fv

    • allowed_chain (array of strings, default: [“H”, “K”, “L”]) — Must be a subset of [L, H, K, A, B, G, D]

  • items (array of objects, min: 1, max: 8, required) — Input sequences:

    • sequence (string, min length: 1, max length: 2048, required) — Protein sequence with extended amino acid codes
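
The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the BioLM SDK: `build_payload` and the constant names are hypothetical, and the limits are copied from the parameter list above.

```python
# Limits copied from the request schema above; the names are illustrative.
REGION_DEFINITIONS = {"imgt", "kabat", "chothia", "abm", "contact", "scdr"}
NUMBERING_SCHEMES = {"imgt", "kabat", "chothia"}
CHAIN_TYPES = {"L", "H", "K", "A", "B", "G", "D"}
MAX_ITEMS = 8
MAX_SEQUENCE_LENGTH = 2048


def build_payload(sequences, region_assign="imgt", scheme="chothia",
                  scfv=False, allowed_chain=("H", "K", "L")):
    """Hypothetical helper: validate inputs and return a request body."""
    if region_assign not in REGION_DEFINITIONS:
        raise ValueError(f"unsupported region_assign: {region_assign!r}")
    if scheme not in NUMBERING_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if not set(allowed_chain) <= CHAIN_TYPES:
        raise ValueError(f"allowed_chain must be a subset of {sorted(CHAIN_TYPES)}")
    if not 1 <= len(sequences) <= MAX_ITEMS:
        raise ValueError(f"expected 1 to {MAX_ITEMS} sequences, got {len(sequences)}")
    for seq in sequences:
        if not 1 <= len(seq) <= MAX_SEQUENCE_LENGTH:
            raise ValueError(f"sequence length {len(seq)} is outside 1..{MAX_SEQUENCE_LENGTH}")
    return {
        "params": {
            "region_assign": region_assign,
            "scheme": scheme,
            "scfv": scfv,
            "allowed_chain": list(allowed_chain),
        },
        "items": [{"sequence": seq} for seq in sequences],
    }
```

The returned dictionary has the same shape as the example request and can be passed as the JSON body of the POST call.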

Example request:

http
POST /api/v3/sadie-antibody/predict/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "params": {
    "region_assign": "kabat",
    "scheme": "chothia",
    "scfv": false,
    "allowed_chain": [
      "H",
      "L"
    ]
  },
  "items": [
    {
      "sequence": "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYAMHWVRQAPGQGLEWMGWINAGNGNTKYSQKFQGRVTITRDTSASTAYMELSSLRSEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK"
    }
  ]
}

Response

  • results (array of objects) — One result per input item, in the order requested:

    • domain_no (integer) — Numerical domain index

    • hmm_species (string) — Species name from HMM alignment

    • chain_type (string) — Single-letter chain identifier

    • e_value (float) — Statistical E-value

    • score (float) — Alignment score

    • identity_species (string) — Closest species by identity

    • v_gene (string) — Top V gene call

    • v_identity (float) — V gene percent identity

    • j_gene (string) — Top J gene call

    • j_identity (float) — J gene percent identity

    • Chain (string) — Chain label

    • Numbering (array of integers, length ≤ 2048) — Residue index mapping

    • Insertion (array of strings, length ≤ 2048) — Insertions at each residue index

    • scheme (string) — Numbering scheme (e.g. “chothia”, “kabat”, “imgt”)

    • region_definition (string) — Region definition (e.g. “imgt”, “kabat”, “chothia”, “abm”, “contact”, “scdr”)

    • fwr1_aa_gaps (string) — Amino acids of FWR1 segment with gaps

    • fwr1_aa_no_gaps (string) — Amino acids of FWR1 segment without gaps

    • cdr1_aa_gaps (string) — Amino acids of CDR1 segment with gaps

    • cdr1_aa_no_gaps (string) — Amino acids of CDR1 segment without gaps

    • fwr2_aa_gaps (string) — Amino acids of FWR2 segment with gaps

    • fwr2_aa_no_gaps (string) — Amino acids of FWR2 segment without gaps

    • cdr2_aa_gaps (string) — Amino acids of CDR2 segment with gaps

    • cdr2_aa_no_gaps (string) — Amino acids of CDR2 segment without gaps

    • fwr3_aa_gaps (string) — Amino acids of FWR3 segment with gaps

    • fwr3_aa_no_gaps (string) — Amino acids of FWR3 segment without gaps

    • cdr3_aa_gaps (string) — Amino acids of CDR3 segment with gaps

    • cdr3_aa_no_gaps (string) — Amino acids of CDR3 segment without gaps

    • fwr4_aa_gaps (string) — Amino acids of FWR4 segment with gaps

    • fwr4_aa_no_gaps (string) — Amino acids of FWR4 segment without gaps

    • leader (string) — Residues before alignment start

    • follow (string) — Residues after alignment end
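
Because the region fields follow a regular naming pattern, they can be collected with a small helper. This is an illustrative sketch only: `extract_regions` is a hypothetical function, and it assumes each result is a dictionary carrying the fields listed above.

```python
# Segment names in N- to C-terminal order, as they appear in the response fields.
SEGMENTS = ("fwr1", "cdr1", "fwr2", "cdr2", "fwr3", "cdr3", "fwr4")


def extract_regions(result, gaps=False):
    """Return {segment: amino acids} for one result, with or without gaps."""
    suffix = "aa_gaps" if gaps else "aa_no_gaps"
    return {name: result[f"{name}_{suffix}"] for name in SEGMENTS}


def extract_cdrs(result):
    """Return only the three CDRs, without gaps."""
    regions = extract_regions(result)
    return {name: seq for name, seq in regions.items() if name.startswith("cdr")}
```

Calling `extract_cdrs(response["results"][0])` on a parsed response would give a dictionary with the keys `cdr1`, `cdr2`, and `cdr3`.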

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "domain_no": 0,
      "hmm_species": "human",
      "chain_type": "H",
      "e_value": 0.0,
      "score": 184.0,
      "identity_species": "human",
      "v_gene": "IGHV1-3*01",
      "v_identity": 0.98,
      "j_gene": "IGHJ4*01",
      "j_identity": 0.93,
      "Chain": "H",
      "Numbering": [
        1,
        2,
        "... (truncated for documentation)"
      ],
      "Insertion": [
        "",
        "",
        "... (truncated for documentation)"
      ],
      "scheme": "chothia",
      "region_definition": "kabat",
      "fwr1_aa_gaps": "QVQLVQSGAEVKKPGASVKVSCKASGYTFT",
      "fwr1_aa_no_gaps": "QVQLVQSGAEVKKPGASVKVSCKASGYTFT",
      "cdr1_aa_gaps": "SYAMH",
      "cdr1_aa_no_gaps": "SYAMH",
      "fwr2_aa_gaps": "WVRQAPGQGLEWMG",
      "fwr2_aa_no_gaps": "WVRQAPGQGLEWMG",
      "cdr2_aa_gaps": "WINAGNGNTKYSQKFQG",
      "cdr2_aa_no_gaps": "WINAGNGNTKYSQKFQG",
      "fwr3_aa_gaps": "RVTITRDTSASTAYMELSSLRSEDTAVYYCAK",
      "fwr3_aa_no_gaps": "RVTITRDTSASTAYMELSSLRSEDTAVYYCAK",
      "cdr3_aa_gaps": "VSYLSTASSLDY",
      "cdr3_aa_no_gaps": "VSYLSTASSLDY",
      "fwr4_aa_gaps": "WGQGTLVTVSS",
      "fwr4_aa_no_gaps": "WGQGTLVTVSS",
      "leader": "",
      "follow": "ASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGK... (truncated for documentation)"
    }
  ]
}
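
One way to sanity-check a result is to rejoin the ungapped segments and compare them with the submitted sequence. The sketch below uses the values from the example response; the helper names are hypothetical, and it rests on the assumption (true for this example, not confirmed in general) that `leader` followed by the seven ungapped segments is a prefix of the input.

```python
SEGMENTS = ("fwr1", "cdr1", "fwr2", "cdr2", "fwr3", "cdr3", "fwr4")


def variable_domain(result):
    """Join the ungapped segments in N- to C-terminal order."""
    return "".join(result[f"{name}_aa_no_gaps"] for name in SEGMENTS)


def is_consistent(result, input_sequence):
    """True when leader + variable domain is a prefix of the submitted sequence."""
    return input_sequence.startswith(result["leader"] + variable_domain(result))


# Values copied from the example response above.
example = {
    "leader": "",
    "fwr1_aa_no_gaps": "QVQLVQSGAEVKKPGASVKVSCKASGYTFT",
    "cdr1_aa_no_gaps": "SYAMH",
    "fwr2_aa_no_gaps": "WVRQAPGQGLEWMG",
    "cdr2_aa_no_gaps": "WINAGNGNTKYSQKFQG",
    "fwr3_aa_no_gaps": "RVTITRDTSASTAYMELSSLRSEDTAVYYCAK",
    "cdr3_aa_no_gaps": "VSYLSTASSLDY",
    "fwr4_aa_no_gaps": "WGQGTLVTVSS",
}

# Leading residues of the example request sequence (shortened here for brevity).
input_start = (
    "QVQLVQSGAEVKKPGASVKVSCKASGYTFTSYAMHWVRQAPGQGLEWMGWINAGNGNTKYSQKFQG"
    "RVTITRDTSASTAYMELSSLRSEDTAVYYCAKVSYLSTASSLDYWGQGTLVTVSSASTKGPSVFPLAPSS"
)

print(is_consistent(example, input_start))
```

A failed check would suggest that the segments were taken from a different item or that the gapped fields were used by mistake.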

Performance

  • Sadie Antibody API provides high-throughput, GPU-accelerated antibody sequence numbering and annotation, suitable for large-scale immunoinformatics workflows.

  • Implements a Hidden Markov Model (HMM)-based approach for antibody numbering, significantly outperforming traditional BLAST-based annotation methods (such as IgBLAST) in terms of speed and scalability.

  • Typical runtime per antibody sequence is under 1 second, enabling rapid annotation of large antibody datasets.

  • Offers multiple numbering schemes (IMGT, Kabat, Chothia), with the Chothia scheme providing optimal balance between accuracy and computational efficiency.

  • Region definitions (IMGT, Kabat, Chothia, ABM, Contact, SCDR) are supported, allowing flexible annotation tailored to downstream analysis needs.

  • Compared to similar numbering and annotation algorithms (e.g., ANARCI), Sadie Antibody API achieves comparable or better accuracy with significantly improved computational efficiency due to GPU acceleration and optimized implementation.

  • Sadie Antibody API is particularly optimized for antibody engineering workflows, providing accurate identification of framework and CDR regions critical for antibody design, maturation, and optimization tasks.

  • Input type: amino acid sequences (single or batch); Output type: numbered antibody sequences with annotated framework and CDR regions, V/J gene assignments, and alignment metrics.

  • BioLM’s deployment of Sadie Antibody API leverages GPU acceleration and optimized software architecture, ensuring consistent high performance and scalability for large-scale antibody informatics pipelines.

Applications

  • Antibody sequence annotation for rapid identification of complementarity-determining regions (CDRs) and framework regions, enabling efficient antibody engineering and optimization by accurately mapping functional domains for affinity maturation or humanization workflows

  • Clustering antibody sequences based on CDR similarity to identify related antibody families, enabling streamlined selection of candidate antibodies for therapeutic development and optimization, though not optimal for datasets with highly divergent sequences

  • Standardized AIRR-compliant annotation output for antibody sequences, facilitating interoperability and data sharing between bioinformatics pipelines and immunoinformatics tools, essential for reproducible antibody discovery and characterization workflows

  • Antibody sequence numbering using common schemes (Chothia, Kabat, IMGT) to enable consistent residue-level comparisons across diverse antibody libraries, critical for accurate structural modeling, mutational analysis, and patent filings, but limited to standard antibody formats and not suitable for unconventional antibody-like scaffolds

  • Generation of structured, annotated antibody sequence objects (ReceptorChain) that encapsulate detailed annotations (e.g., germline assignments, CDR definitions, alignment scores), enabling efficient downstream computational analyses such as antibody repertoire profiling or machine learning-based antibody design, although not intended for direct structural prediction tasks
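
As a rough illustration of the CDR-based clustering mentioned above, the following sketch groups results whose CDR3 sequences have equal length and a minimum fraction of matching positions. It is a simple greedy scheme chosen for brevity, not the clustering method used by SADIE or BioLM; the function names and the 0.8 threshold are assumptions.

```python
def hamming_identity(a, b):
    """Fraction of matching positions; 0.0 when the lengths differ."""
    if len(a) != len(b):
        return 0.0
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_by_cdr3(results, min_identity=0.8):
    """Greedy clustering: each result joins the first cluster whose
    representative (its first member) has a sufficiently similar CDR3."""
    clusters = []
    for result in results:
        cdr3 = result["cdr3_aa_no_gaps"]
        for cluster in clusters:
            representative = cluster[0]["cdr3_aa_no_gaps"]
            if hamming_identity(cdr3, representative) >= min_identity:
                cluster.append(result)
                break
        else:
            # No existing cluster matched, so this result starts a new one.
            clusters.append([result])
    return clusters
```

Greedy assignment depends on input order, so a production pipeline would more likely use a dedicated clustering tool; the sketch only shows how the `cdr3_aa_no_gaps` field can drive the grouping.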

Limitations

  • Maximum Sequence Length: The maximum allowed sequence length is 2048 amino acids. Longer sequences must be truncated or split into smaller segments before submission.

  • Batch Size: The API supports a maximum batch size of 8 sequences per request. Larger datasets must be processed in multiple batches.

  • Supported Numbering Schemes: SADIE supports numbering schemes imgt, kabat, and chothia. Alternative numbering schemes are not supported.

  • Region Definitions: Region definitions are limited to imgt, kabat, chothia, abm, contact, and scdr. Custom or alternative region definitions cannot be used.

  • Chain Type Constraints: The API supports chains H, K, L, A, B, G, and D. Other chain types or non-standard antibody formats might not be optimally annotated.

  • Species and Germline Database: SADIE uses a predefined germline database primarily optimized for human and common model organisms. Custom species or unusual germline configurations may not be accurately annotated.
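
The batch-size and length limits above mean that larger datasets have to be split on the client. A minimal sketch follows, assuming a caller-supplied `send` function (hypothetical) that posts one request body and returns the parsed JSON response; injecting it keeps the example runnable without network access.

```python
MAX_ITEMS = 8               # maximum sequences per request
MAX_SEQUENCE_LENGTH = 2048  # maximum amino acids per sequence


def batches(sequences, size=MAX_ITEMS):
    """Yield consecutive slices of at most `size` sequences."""
    for start in range(0, len(sequences), size):
        yield sequences[start:start + size]


def annotate_all(sequences, params, send):
    """Annotate any number of sequences by sending one request per batch.

    `send` is a caller-supplied function: request body in, parsed JSON out.
    """
    too_long = [seq for seq in sequences if len(seq) > MAX_SEQUENCE_LENGTH]
    if too_long:
        raise ValueError(f"{len(too_long)} sequence(s) exceed {MAX_SEQUENCE_LENGTH} residues")
    results = []
    for batch in batches(sequences):
        body = {"params": params, "items": [{"sequence": seq} for seq in batch]}
        results.extend(send(body)["results"])
    return results
```

With the `requests` example shown earlier, `send` could be `lambda body: requests.post(url, headers=headers, json=body).json()`. Results are appended in request order, so they stay aligned with the input list.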

How We Use It

The Sadie Antibody algorithm enables BioLM to efficiently annotate, number, and cluster antibody sequences, streamlining antibody design and optimization workflows. By providing standardized antibody annotation consistent with AIRR guidelines, Sadie integrates seamlessly into our broader protein engineering pipelines, enhancing the accuracy and consistency of candidate selection. BioLM leverages Sadie Antibody to rapidly filter and rank antibody sequences based on precise CDR definitions, enabling accelerated antibody maturation cycles and reducing the time and cost associated with antibody discovery.

  • Integrates directly with BioLM’s predictive modeling and generative AI services, enabling end-to-end antibody optimization.

  • Accelerates research outcomes by quickly identifying lead candidates ready for synthesis and laboratory validation.
