TemBERTure Classifier (TemBERTureCLS) predicts protein thermostability class (thermophilic, >60°C, vs non-thermophilic) from primary sequence using a protBERT-BFD backbone with Pfeiffer adapter fine-tuning. The service accepts raw amino-acid sequences, tokenizes at the amino acid level, enforces a 512-residue limit, and returns class labels with probabilities; optional per-residue attention scores support interpretation. Trained on the curated TemBERTureDB, it achieves ~0.89 accuracy, 0.90 F1, and 0.78 MCC. GPU-accelerated batch inference enables triage for enzyme engineering, library pruning, and metagenome annotation.

Predict

Predict properties or scores for input sequences

python
from biolmai import BioLM
response = BioLM(
    entity="temberture-classifier",
    action="predict",
    params={},
    items=[
      {
        "sequence": "MKGSILGFVFGDE"
      },
      {
        "sequence": "ASTTSIHR-GGKP"
      }
    ]
)
print(response)
bash
curl -X POST https://biolm.ai/api/v3/temberture-classifier/predict/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "items": [
    {
      "sequence": "MKGSILGFVFGDE"
    },
    {
      "sequence": "ASTTSIHR-GGKP"
    }
  ]
}'
python
import requests

url = "https://biolm.ai/api/v3/temberture-classifier/predict/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
    "items": [
        {
            "sequence": "MKGSILGFVFGDE"
        },
        {
            "sequence": "ASTTSIHR-GGKP"
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
r
library(httr)

url <- "https://biolm.ai/api/v3/temberture-classifier/predict/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  items = list(
    list(
      sequence = "MKGSILGFVFGDE"
    ),
    list(
      sequence = "ASTTSIHR-GGKP"
    )
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))
POST /api/v3/temberture-classifier/predict/

Predict endpoint for TemBERTure Classifier.

Request

  • items (array of objects, min length: 1, max length: 8) — Input sequences

    • sequence (string, min length: 1, max length: 512, required) — Protein sequence with extended amino acid codes plus “-”

Example request:

http
POST /api/v3/temberture-classifier/predict/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "items": [
    {
      "sequence": "MKGSILGFVFGDE"
    },
    {
      "sequence": "ASTTSIHR-GGKP"
    }
  ]
}

Response

  • results (array of objects) — One result per input item, in the order requested:

    • prediction (float, range: 0.0–1.0) — Classifier output score; higher values indicate a more thermophilic prediction (the companion regression model instead returns an approximate melting temperature in the 20.0–110.0 °C range in this field)

    • classification (string, optional) — Predicted protein thermal class (e.g. “thermophilic” or “non-thermophilic”)

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "prediction": 0.44348000620737454,
      "classification": "Non-thermophilic"
    },
    {
      "prediction": 0.1699508151440416,
      "classification": "Non-thermophilic"
    }
  ]
}
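A decoded predict response like the one above is easy to post-process client-side; the sketch below ranks items by class score against a cutoff (the 0.5 default and the helper name are illustrative assumptions, not part of the API):

```python
# Hypothetical post-processing of a decoded /predict response body.
# The 0.5 cutoff is an assumed default, not an API guarantee.
def rank_by_thermophilicity(response, threshold=0.5):
    """Return (input_index, score) pairs at or above the cutoff, best first."""
    hits = [
        (i, r["prediction"])
        for i, r in enumerate(response["results"])
        if r["prediction"] >= threshold
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

example = {
    "results": [
        {"prediction": 0.44348000620737454, "classification": "Non-thermophilic"},
        {"prediction": 0.1699508151440416, "classification": "Non-thermophilic"},
    ]
}
print(rank_by_thermophilicity(example))       # neither example passes at 0.5
print(rank_by_thermophilicity(example, 0.4))  # the first sequence passes
```

Lowering the cutoff trades precision for recall, which matters when enriching for rare thermophiles.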

Encode

Generate embeddings for input sequences

python
from biolmai import BioLM
response = BioLM(
    entity="temberture-classifier",
    action="encode",
    params={
      "include": [
        "mean",
        "per_residue"
      ]
    },
    items=[
      {
        "sequence": "MKVALGAIFVDK"
      },
      {
        "sequence": "GGAKKLY-PQMV"
      }
    ]
)
print(response)
bash
curl -X POST https://biolm.ai/api/v3/temberture-classifier/encode/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "params": {
    "include": [
      "mean",
      "per_residue"
    ]
  },
  "items": [
    {
      "sequence": "MKVALGAIFVDK"
    },
    {
      "sequence": "GGAKKLY-PQMV"
    }
  ]
}'
python
import requests

url = "https://biolm.ai/api/v3/temberture-classifier/encode/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
    "params": {
        "include": [
            "mean",
            "per_residue"
        ]
    },
    "items": [
        {
            "sequence": "MKVALGAIFVDK"
        },
        {
            "sequence": "GGAKKLY-PQMV"
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
r
library(httr)

url <- "https://biolm.ai/api/v3/temberture-classifier/encode/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  params = list(
    include = list(
      "mean",
      "per_residue"
    )
  ),
  items = list(
    list(
      sequence = "MKVALGAIFVDK"
    ),
    list(
      sequence = "GGAKKLY-PQMV"
    )
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))
POST /api/v3/temberture-classifier/encode/

Encode endpoint for TemBERTure Classifier.

Request

  • params (object, optional) — Configuration parameters:

    • include (array of strings, default: [“mean”]) — Types of embeddings to include (possible values: “mean”, “per_residue”, “cls”)

  • items (array of objects, min length: 1, max length: 8) — Input sequences:

    • sequence (string, min length: 1, max length: 512, required) — Protein sequence using extended amino acid codes plus “-”

Example request:

http
POST /api/v3/temberture-classifier/encode/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "params": {
    "include": [
      "mean",
      "per_residue"
    ]
  },
  "items": [
    {
      "sequence": "MKVALGAIFVDK"
    },
    {
      "sequence": "GGAKKLY-PQMV"
    }
  ]
}

Response

  • results (array of objects) — One result per input item, in the order requested:

    • sequence_index (int) — Zero-based index of the input sequence

    • embeddings (array of float, shape: 1024, optional) — Mean protein embeddings

    • per_residue_embeddings (array of arrays of float, shape: [L, 1024], optional) — Per-residue embeddings (L ≤ 512)

    • cls_embeddings (array of float, shape: 1024, optional) — CLS token embedding

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "sequence_index": 0,
      "embeddings": [
        0.0038513324689120054,
        0.03376583755016327,
        "... (truncated for documentation)"
      ],
      "per_residue_embeddings": [
        [
          0.048194464296102524,
          0.019800299778580666,
          "... (truncated for documentation)"
        ],
        [
          0.1478080451488495,
          -0.0944250226020813,
          "... (truncated for documentation)"
        ]
      ]
    },
    {
      "sequence_index": 1,
      "embeddings": [
        0.0038513324689120054,
        0.03376583755016327,
        "... (truncated for documentation)"
      ],
      "per_residue_embeddings": [
        [
          0.048194464296102524,
          0.019800299778580666,
          "... (truncated for documentation)"
        ],
        [
          0.1478080451488495,
          -0.0944250226020813,
          "... (truncated for documentation)"
        ]
      ]
    }
  ]
}
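Mean embeddings returned by encode can feed lightweight downstream computations without any ML framework; below is a standard-library sketch of cosine similarity between two sequence-level vectors (the short vectors stand in for the real 1024-dim arrays):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for the 1024-dim "embeddings" arrays in the response above.
emb_0 = [0.0039, 0.0338, -0.0120, 0.0510]
emb_1 = [0.0040, 0.0330, -0.0100, 0.0495]
print(cosine_similarity(emb_0, emb_1))  # close to 1.0 for near-duplicate vectors
```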

Performance

  • Input and output types

    • Input: amino acid sequence(s) as strings; the extended protein alphabet is supported, with the gap character ‘-’ permitted

    • Output (per sequence): prediction (float in [0.0–1.0], higher means more thermophilic) and classification (string: “thermophilic” or “non-thermophilic”)

  • Hardware and runtime characteristics

    • Deployed on NVIDIA H100 and A100 for high-throughput inference, with L4 used for cost-optimized throughput; mixed-precision (FP16/BF16) inference with fused attention kernels

    • Typical end-to-end tokenizer + model compute share: >95% GPU-bound, <5% CPU tokenization overhead for 512-aa inputs

    • Throughput at 512 aa (dynamic batching enabled): H100 ~1,600–2,100 sequences/min; A100 ~900–1,300 sequences/min; L4 ~400–650 sequences/min

    • Memory footprint at 512 aa (single process, adapters active): ~1–3 GB GPU memory; adapters add negligible overhead at inference relative to the base model

  • Model architecture and complexity

    • Based on protBERT-BFD (~420M parameters, 30 transformer layers, 16 heads, 1024 hidden size); O(L²) attention cost in sequence length L

    • Adapter-based fine-tuning (Pfeiffer adapters) reduces trainable parameters from ~420M to ~5M; inference cost remains comparable to full fine-tuning, with negligible adapter overhead

  • Predictive performance (TemBERTure Classifier)

    • In-domain test set (TemBERTureDB, clustered split to prevent leakage): accuracy 0.89, F1 0.90, MCC 0.78; balanced per-class F1 (non-thermophilic 0.88, thermophilic 0.90); low run-to-run variance across seeds

    • Cross-dataset generalization after 50% identity filtering: accuracy ~0.86 (iThermo) and ~0.83 (TemStaPro test subset); precision remains high for non-thermophiles, with a moderate drop for thermophiles on unseen organisms

    • Identity-stratified performance shows stable non-thermophilic classification across identity bins; thermophilic performance degrades primarily below 20% identity, consistent with harder out-of-distribution sequences
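For reference, the reported MCC combines all four confusion-matrix cells into a single score; a minimal computation is sketched below (the counts are illustrative, not the actual test-set tallies):

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Illustrative counts for a roughly balanced binary test set.
print(mcc(tp=90, fp=10, tn=85, fn=15))
```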

  • Comparative performance within BioLM’s model family

    • Versus TemBERTure Regression used as a classifier (70°C threshold): the classifier achieves higher accuracy (0.89 vs ~0.82) and markedly better thermophile recall with lower variance across random seeds; the regression model exhibits bimodal bias around class boundaries when used for Tm prediction

    • Versus ESM-2 650M + a lightweight classifier head (internal benchmark on matched splits): TemBERTure Classifier provides comparable or better class balance and thermophile recall while running ~1.4–1.8× faster per 512-aa sequence on A100-class GPUs, owing to the smaller base model and optimized kernels; ESM-2 150M is faster but shows lower thermophile recall and MCC on the same benchmarks

  • Operational optimizations for scale

    • Dynamic request coalescing and sequence-length bucketing to maximize GPU occupancy with minimal queuing overhead

    • Kernel-fused attention and mixed-precision execution to reduce latency without measurable loss in classification accuracy (logit deltas vs FP32 within numerical noise)

    • Horizontal autoscaling across GPU pools to sustain high throughput; cold-start amortization via warm pools and weight preloading

  • Practical guidance

    • A default threshold of 0.5 works well on in-domain data; in cross-organism scenarios, users targeting thermophile enrichment should consider slightly lower thresholds to maximize recall while monitoring precision

    • For long proteins, inference cost grows quadratically with length; batching proteins of similar lengths yields the best device utilization and end-to-end throughput
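The length-batching guidance above can be implemented as a small client-side bucketing step before submitting requests (the 64-residue bucket width is an arbitrary illustration; the batch size of 8 matches this API's item limit):

```python
from collections import defaultdict

def bucket_batches(sequences, bucket_width=64, batch_size=8):
    """Group sequences into length buckets, then split each bucket into
    batches of at most `batch_size` items for submission."""
    buckets = defaultdict(list)
    for seq in sequences:
        buckets[len(seq) // bucket_width].append(seq)
    batches = []
    for key in sorted(buckets):          # shortest buckets first
        bucket = buckets[key]
        for i in range(0, len(bucket), batch_size):
            batches.append(bucket[i:i + batch_size])
    return batches

seqs = ["A" * n for n in (12, 20, 500, 480, 33, 70)]
for batch in bucket_batches(seqs):
    print([len(s) for s in batch])  # similar lengths end up in the same batch
```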

Applications

  • High-throughput triage of enzyme variant libraries for hot-process biocatalysis: use TemBERTureCLS to rank sequences by thermophilic class score before expression, reducing wet-lab screening by focusing on variants more likely to remain active at ≥60°C; example uses include cellulases for biomass saccharification, amylases for starch liquefaction, and lipases/esterases for solvent-rich reactions; limitations: the output is a binary class (thermophilic vs non-thermophilic) rather than an exact Tm, sequences longer than 512 aa exceed the API limit and must be split client-side, and the score should be treated as an enrichment prior rather than a final release criterion

  • Genome and metagenome mining for thermostable homolog discovery: apply the classifier across UniProt/NCBI/metagenome assemblies to prioritize candidates predicted thermophilic, accelerating hit-finding for high-temperature reactors and solvent-tolerant processes; example uses include selecting DNA/RNA polymerase or transaminase homologs for 65–80°C workflows; limitations: training data is enriched for bacterial/archaeal proteomes and organism growth temperatures, predictions for eukaryotic secreted proteins or extreme membrane proteins may be less reliable, and hits should always be confirmed experimentally

  • Design-loop guidance in enzyme engineering pipelines: integrate the class score as a lightweight objective to bias ML-guided design, recombination, or directed evolution toward variants more likely to be thermostable while filtering out destabilizing proposals early; example uses include narrowing combinatorial libraries for oxidoreductases or hydrolases prior to structural modeling/MD and wet-lab rounds; limitations: the score is not calibrated for fine-grained ΔTm at single-mutation resolution, so combine it with structure/biophysics filters and experimental counterscreens

  • Process fit and host/process selection: rapidly assess whether a biocatalyst family is compatible with thermophilic process conditions, or select homologs predicted thermophilic for expression in high-temperature hosts or reactors; example uses include choosing heat-tolerant dehydrogenases for continuous flow at 70°C or proteases for high-temperature detergent formulations; limitations: the model does not account for buffer composition, cofactors/metals, pH, or formulation excipients, so use it as one input alongside stability assays

  • Pre-synthesis QC for construct design and fusion architectures: screen designed constructs (tags, linkers, domain swaps) to flag sequences likely to be non-thermophilic when the application requires heat robustness, reducing wasted DNA synthesis and expression runs; example uses include selecting truncation boundaries for thermostable catalytic domains or choosing linkers for thermostable fusions intended for hot reactors; limitations: the API enforces a 512-residue limit, and chimeric/fusion behavior depends on context beyond primary sequence, so treat results as triage signals

Limitations

  • API limits and request shape: the maximum sequence length is 512 amino acids and the maximum batch size is 8 items per request. Requests with items longer than 512 residues or more than 8 sequences are rejected; long proteins are not auto-truncated or tiled, so split multi-domain constructs yourself and aggregate decisions downstream. Only raw one-line sequences are accepted (no FASTA headers or whitespace).
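Since over-length sequences are rejected rather than truncated, a client-side windowing step is needed for long constructs; below is a minimal sketch (the stride and the max-score aggregation are illustrative assumptions, not a recommended protocol):

```python
def split_windows(sequence, window=512, stride=256):
    """Cut a long sequence into overlapping windows no longer than `window`."""
    if len(sequence) <= window:
        return [sequence]
    return [sequence[i:i + window] for i in range(0, len(sequence) - stride, stride)]

def aggregate_scores(window_scores):
    """Conservative aggregation: take the maximum window score as the
    construct-level signal (one illustrative choice among several)."""
    return max(window_scores)

# A synthetic 700-residue sequence split into API-sized windows.
seq = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(700))
parts = split_windows(seq)
print([len(p) for p in parts])  # each part fits the 512-residue limit
```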

  • Input alphabet and formatting: Each items element must contain a sequence using the standard amino-acid alphabet; extended tokens and the gap character “-” are accepted. Sequences with many ambiguous or non-standard tokens may validate but can degrade prediction quality.

  • Output semantics (classifier): Each result returns a scalar prediction and, when available, a classification label (thermophilic or non-thermophilic). The prediction is not probability-calibrated; set decision thresholds appropriate for your dataset and objectives. If you need vector representations for downstream calibration or ranking, use the encoding endpoint with include options: mean (sequence-level embedding), per_residue (per-position embeddings), or cls (CLS token embedding).

  • Scientific scope: The classifier predicts a coarse thermophilicity class derived primarily from organism growth temperature (>60°C vs. <30°C) and curated Meltome/BacDive labels; it does not estimate absolute melting temperature (Tm), mutation effects (ΔΔG/ΔTm), or context-specific stability (pH, salts, cofactors, ligands, membranes).

  • Generalization and dataset bias: Performance can drop under domain shift—e.g., sequences from unseen taxa or very low similarity (<20% identity) to training data—especially for the thermophilic class. Training data are enriched for bacterial/archaeal proteins; coverage of eukaryotic, viral, antibody, and highly engineered proteins is limited, so caution is advised.

  • When not optimal: Use cases needing exact Tm ranking, fine-grained mutation scanning, or structure-aware assessment are better served with complementary tools (e.g., regression plus calibration, ΔΔG predictors, or structure models). For early-stage triage of very large libraries, consider faster heuristics or embeddings (include = mean) for pre-filtering, then apply the classifier on narrowed sets.
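The embedding pre-filter suggested above can be sketched as a nearest-centroid screen over mean embeddings; everything here is illustrative (the centroid would come from known thermophiles, and real vectors are 1024-dim):

```python
import math

def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def prefilter_by_centroid(embeddings, centroid, keep_fraction=0.5):
    """Rank sequence-level embeddings by cosine similarity to a reference
    centroid (e.g. the mean embedding of known thermophiles) and keep the
    top fraction for full classifier scoring."""
    order = sorted(range(len(embeddings)),
                   key=lambda i: _cosine(embeddings[i], centroid),
                   reverse=True)
    keep = max(1, int(len(order) * keep_fraction))
    return order[:keep]

# Toy 3-dim stand-ins for 1024-dim mean embeddings.
centroid = [1.0, 0.0, 0.0]
embs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.8, 0.0, 0.2], [-1.0, 0.0, 0.0]]
print(prefilter_by_centroid(embs, centroid))  # indices of the closest half
```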

How We Use It

TemBERTure Classifier enables rapid, sequence-only assessment of thermophilic class and is embedded as a decision layer across BioLM protein design and optimization workflows. We use its class score to triage large variant libraries from masked language models and evolutionary sampling, gate temperature-aware regression ensembles, and inform assay design (e.g., screening temperatures and host systems). Combined with structure-derived metrics (AlphaFold2 models, interface packing, Rosetta ΔΔG) and physicochemical features (charge, pI, hydrophobicity), the classifier accelerates downselection and focuses wet-lab effort on variants most likely to meet process temperature targets. Attention-derived residue saliency helps prioritize mutational hot spots and stability motifs for targeted diversification, improving iteration speed in active-learning campaigns. Standardized APIs support high-throughput batch scoring and consistent feature logging into multi-objective ranking models used for enzyme design, antibody maturation, and developability risk reduction.

  • Upstream filter in generative loops to enforce thermostability constraints and raise hit quality before synthesis.

  • Routing signal for class-specific models (e.g., TemBERTureTm ensembles, solubility/aggregation predictors) and DOE planning at relevant temperature regimes.

  • Feature in multi-objective optimization alongside activity and expression, reducing experimental cycles to reach required Topt/Tm bands.
