TemBERTure Classifier (TemBERTureCLS) predicts protein thermostability class (thermophilic >60°C vs non-thermophilic) from primary amino acid sequence using a protBERT-BFD backbone with Pfeiffer adapter fine-tuning. The API supports GPU-accelerated batch inference on up to 8 sequences of length ≤512 residues, returning a thermophilicity score in [0,1] and a discrete class label. Trained on the curated TemBERTureDB, the model reports ~0.89 accuracy, F1 0.90, MCC 0.78, enabling triage for enzyme engineering, library pruning, and metagenome annotation.
Predict
Predict properties or scores for input sequences
- POST /api/v3/temberture-classifier/predict/
Predict endpoint for TemBERTure Classifier.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
items (array of objects, min length: 1, max length: 8) — Input records:
sequence (string, min length: 1, max length: 512, required) — Protein sequence using extended amino acid codes plus “-”
Example request:
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
prediction (float) — Model output value; for this classifier, a thermophilicity score in [0.0, 1.0] (the melting-temperature output in °C applies only to the regression model, which this endpoint does not serve)
classification (string, optional) — Predicted protein thermal class label (e.g. “Thermophilic”, “Non-thermophilic”)
Example response:
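The request and response fields listed for this endpoint can be exercised with a short client sketch. This is a minimal illustration, not an official client: the host in `BASE_URL` is a placeholder, and `build_predict_request` and `parse_predict_response` are hypothetical helper names. Only the path, headers, limits, and field names come from the reference above, and the sample response body is made up to illustrate parsing.

```python
import json
import urllib.request

# Placeholder host (assumption); only the path below comes from this reference.
BASE_URL = "https://YOUR_API_HOST"
PREDICT_PATH = "/api/v3/temberture-classifier/predict/"

MAX_ITEMS = 8      # documented batch limit per request
MAX_LENGTH = 512   # documented per-sequence length limit


def build_predict_request(sequences, api_key, base_url=BASE_URL):
    """Validate inputs client-side and build (but do not send) the POST request."""
    if not 1 <= len(sequences) <= MAX_ITEMS:
        raise ValueError(f"expected 1-{MAX_ITEMS} sequences, got {len(sequences)}")
    for i, seq in enumerate(sequences):
        if not 1 <= len(seq) <= MAX_LENGTH:
            raise ValueError(
                f"sequence {i} has length {len(seq)}; allowed range is 1-{MAX_LENGTH}"
            )
    payload = {"items": [{"sequence": seq} for seq in sequences]}
    return urllib.request.Request(
        base_url + PREDICT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {api_key}",
        },
        method="POST",
    )


def parse_predict_response(body, sequences):
    """Pair each input with its result; results are returned in request order."""
    results = json.loads(body)["results"]
    if len(results) != len(sequences):
        raise ValueError("result count does not match input count")
    return [
        # `classification` is documented as optional, so use .get().
        (seq, res["prediction"], res.get("classification"))
        for seq, res in zip(sequences, results)
    ]


if __name__ == "__main__":
    seqs = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"]
    req = build_predict_request(seqs, api_key="YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req).read()
    # Made-up response body, used only to illustrate the parsing step:
    fake_body = '{"results": [{"prediction": 0.12, "classification": "Non-thermophilic"}]}'
    print(parse_predict_response(fake_body, seqs))
```

Validating the batch size and sequence length before sending avoids a round trip that would end in a 400 Bad Request.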
Encode
Generate embeddings for input sequences
- POST /api/v3/temberture-classifier/encode/
Encode endpoint for TemBERTure Classifier.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, optional) — Configuration parameters:
include (array of strings, default: [“mean”]) — Embedding types to include in the response (possible values: “mean”, “per_residue”, “cls”)
items (array of objects, min length: 1, max length: 8) — Input sequences:
sequence (string, min length: 1, max length: 512, required) — Protein sequence using extended amino acid codes plus “-”
Example request:
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
sequence_index (int) — Zero-based index of the input sequence in the request
embeddings (array of float, size: 1024, optional) — Mean protein embedding vector for the sequence
per_residue_embeddings (array of arrays of float, shape: [L, 1024], optional) — Per-residue embedding vectors for the sequence (L ≤ 512)
cls_embeddings (array of float, size: 1024, optional) — CLS token embedding vector for the sequence
Example response:
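The relationship between `params.include` and the response fields can be sketched as follows. The helper names (`build_encode_payload`, `extract_embeddings`, `cosine_similarity`) are hypothetical. The mapping from include values to field names follows the field list above. The cosine-similarity step is one assumed downstream use of the mean embeddings, not something the API computes.

```python
import math

# include value -> response field, as listed in the Response section above.
INCLUDE_TO_FIELD = {
    "mean": "embeddings",
    "per_residue": "per_residue_embeddings",
    "cls": "cls_embeddings",
}


def build_encode_payload(sequences, include=("mean",)):
    """Build the JSON body for the encode endpoint, checking documented limits."""
    unknown = set(include) - set(INCLUDE_TO_FIELD)
    if unknown:
        raise ValueError(f"unsupported include values: {sorted(unknown)}")
    if not 1 <= len(sequences) <= 8:
        raise ValueError("a request must carry between 1 and 8 sequences")
    if any(not 1 <= len(seq) <= 512 for seq in sequences):
        raise ValueError("each sequence must be 1-512 residues long")
    return {
        "params": {"include": list(include)},
        "items": [{"sequence": seq} for seq in sequences],
    }


def extract_embeddings(response, include=("mean",)):
    """Return {sequence_index: {include_name: vectors}} from a decoded response."""
    out = {}
    for res in response["results"]:
        out[res["sequence_index"]] = {
            name: res[INCLUDE_TO_FIELD[name]]
            for name in include
            if INCLUDE_TO_FIELD[name] in res  # embedding fields are optional
        }
    return out


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

With two sequences encoded using the default `include`, `cosine_similarity(emb[0]["mean"], emb[1]["mean"])` gives a simple similarity measure for clustering or retrieval.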
Performance
Hardware and runtime characteristics
- Deployed on NVIDIA H100, A100, and L4 GPUs for high-throughput mixed-precision (FP16/BF16) inference with fused attention kernels
- For 512-aa sequences with dynamic batching, typical throughput is: H100 ~1,600–2,100 sequences/min; A100 ~900–1,300 sequences/min; L4 ~400–650 sequences/min
- At 512 aa, a single process with adapters active uses ~1–3 GB GPU memory; adapter overhead at inference is negligible relative to the base protBERT-BFD model
Model architecture and computational cost
- Based on protBERT-BFD (~420M parameters, 30 transformer layers, 16 heads, 1024 hidden size) with standard O(L²) attention in sequence length L
- Adapter-based fine-tuning (Pfeiffer adapters) reduces train-time parameters to ~5M without materially changing inference latency compared to a fully fine-tuned 420M-parameter model
Predictive performance (TemBERTure Classifier)
- In-domain (TemBERTureDB clustered split): accuracy 0.89, F1 0.90, MCC 0.78 with balanced class F1 (non-thermophilic 0.88, thermophilic 0.90) and low seed-to-seed variance
- Cross-dataset after 50% identity filtering: accuracy ~0.86 on iThermo and ~0.83 on a TemStaPro-derived test subset; thermophile recall degrades mainly below 20% sequence identity, while non-thermophile performance is more stable
Comparative performance within BioLM’s model family
- Versus TemBERTure regression used as a classifier (70°C threshold): the classifier achieves higher accuracy (0.89 vs ~0.82), substantially better thermophile recall, and lower variance; the regression model shows bimodal predictions around class boundaries when interpreted as Tm
- Versus an internal ESM-2 650M + classifier head baseline on matched splits: TemBERTure Classifier offers comparable or better thermophile recall and overall class balance while running ~1.4–1.8× faster per 512-aa sequence on A100-class GPUs due to its smaller backbone and optimized kernels
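For planning purposes, the throughput ranges quoted above translate into rough job durations. The sketch below is back-of-the-envelope arithmetic only: `estimate_job` is a hypothetical helper, and it assumes the quoted server-side throughput is the bottleneck, ignoring network latency, queueing, and client-side overhead.

```python
import math

# Throughput ranges in sequences/min at 512 aa, as quoted in this section.
THROUGHPUT = {"H100": (1600, 2100), "A100": (900, 1300), "L4": (400, 650)}
BATCH_SIZE = 8  # documented maximum number of sequences per request


def estimate_job(n_sequences, gpu):
    """Return the request count and best/worst-case minutes for a scoring job."""
    low, high = THROUGHPUT[gpu]
    return {
        "requests": math.ceil(n_sequences / BATCH_SIZE),
        "minutes_best": n_sequences / high,
        "minutes_worst": n_sequences / low,
    }


if __name__ == "__main__":
    for gpu in THROUGHPUT:
        est = estimate_job(100_000, gpu)
        print(
            f"{gpu}: {est['requests']} requests, "
            f"{est['minutes_best']:.0f}-{est['minutes_worst']:.0f} min"
        )
```

For example, 100,000 sequences of 512 aa need 12,500 requests at 8 sequences each and, on A100-class throughput, roughly 77 to 111 minutes of model time.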
Applications
High-throughput triage of enzyme variant libraries for hot-process biocatalysis: use TemBERTureCLS via the predictor endpoint to rank sequences by thermophilic class score before expression, reducing wet-lab screening by focusing on variants more likely to remain active at ≥60°C; example uses include cellulases for biomass saccharification, amylases for starch liquefaction, and lipases/esterases for solvent-rich reactions; limitations: binary class (thermophilic vs non-thermophilic) rather than exact Tm, sequences must be ≤512 aa (longer inputs are rejected by the API), treat the score as an enrichment prior rather than a final release criterion
Genome and metagenome mining for thermostable homolog discovery: apply the classifier score across UniProt/NCBI/metagenome assemblies to prioritize candidates predicted thermophilic, accelerating hit-finding for high-temperature reactors and solvent-tolerant processes; example uses include selecting DNA/RNA polymerase or transaminase homologs for 65–80°C workflows; limitations: training data is enriched for bacterial/archaeal proteomes and organism growth temperatures, predictions for eukaryotic secreted proteins or extreme membrane proteins may be less reliable, always confirm experimentally
Design-loop guidance in enzyme engineering pipelines: integrate the TemBERTureCLS class score as a lightweight objective to bias ML-guided design, recombination, or directed evolution toward variants more likely to be thermophilic while filtering out destabilizing proposals early; example uses include narrowing combinatorial libraries for oxidoreductases or hydrolases prior to structural modeling/MD and wet-lab rounds; limitations: not calibrated for fine-grained ΔTm at single-mutation resolution, and the regression model (TemBERTureTm) is not currently exposed in this API, so use class scores together with structure/biophysics filters and experimental counterscreens
Process fit and host/process selection: rapidly assess whether a biocatalyst family is compatible with thermophilic process conditions, or select homologs predicted thermophilic for expression in high-temperature hosts or reactors; example uses include choosing heat-tolerant dehydrogenases for continuous flow at 70°C or proteases for high-temperature detergent formulations; limitations: the model does not account for buffer composition, cofactors/metals, pH, or formulation excipients, use as one input alongside stability assays
Pre-synthesis QC for construct design and fusion architectures: screen designed constructs (tags, linkers, domain swaps) with TemBERTureCLS to flag sequences likely to be non-thermophilic when the application requires heat robustness, reducing wasted DNA synthesis and expression runs; example uses include selecting truncation boundaries for thermostable catalytic domains or choosing linkers for thermostable fusions intended for hot reactors; limitations: sequences must be ≤512 residues, and chimeric/fusion behavior depends on context beyond primary sequence so treat results as triage signals rather than definitive stability assessments
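The triage pattern shared by these applications (score, filter, rank, then send the top candidates to the lab) can be sketched in a few lines. The names `Scored` and `triage` are hypothetical, the 0.5 threshold is an arbitrary placeholder rather than a recommended cutoff, and the example scores are made up.

```python
from typing import NamedTuple, Optional


class Scored(NamedTuple):
    name: str
    score: float  # the `prediction` value returned for this sequence


def triage(scored, threshold=0.5, top_k: Optional[int] = None):
    """Keep variants scoring at or above `threshold`, ranked by descending score."""
    kept = sorted(
        (s for s in scored if s.score >= threshold),
        key=lambda s: s.score,
        reverse=True,
    )
    return kept if top_k is None else kept[:top_k]


if __name__ == "__main__":
    # Made-up scores for illustration only.
    library = [
        Scored("variant_A", 0.91),
        Scored("variant_B", 0.34),
        Scored("variant_C", 0.77),
        Scored("variant_D", 0.58),
    ]
    for hit in triage(library, threshold=0.5, top_k=2):
        print(hit.name, hit.score)
```

Because the score is an enrichment prior rather than a release criterion, the threshold and `top_k` are best chosen from the screening budget and validated against experimental data.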
Limitations
Maximum Sequence Length is 512 amino acids and Batch Size is 8 per request. Any items entry with a sequence longer than 512 or any request with more than 8 sequences is rejected. Long proteins are not auto-truncated, tiled, or split; you must segment multi-domain constructs yourself and combine results downstream. Only raw one-line sequences are accepted (no FASTA headers, newlines, or other whitespace).
Input alphabet and formatting: Each items element must include a sequence composed of the standard amino-acid alphabet; extended tokens and “-” are accepted via the AAExtendedPlusExtra validator. Sequences containing many ambiguous or non-standard symbols may still pass validation but can substantially degrade prediction quality.
Output semantics (predictor): Each classifier call returns a scalar prediction and, when available, a classification label (thermophilic or non-thermophilic). The prediction is a model score, not a probability-calibrated value; you should set application-specific thresholds and, if needed, perform your own calibration or ranking.
Embeddings vs. classification: The encoder endpoint does not perform classification or Tm prediction. It returns sequence-level and/or position-level embeddings depending on params.include (mean, per_residue, cls). Use these vectors with your own downstream models if you need custom scoring, calibration, clustering, or retrieval beyond the built-in classifier.
Scientific scope and dataset bias: TemBERTureCLS predicts a coarse thermophilicity class primarily derived from organism growth temperature (>60°C vs. <30°C) and curated Meltome/BacDive labels. It does not provide absolute melting temperature (Tm), mutation effects (ΔΔG/ΔTm), or environment-specific stability (pH, buffer, ligands, membranes). Training data are enriched for bacterial and archaeal proteins; performance can degrade for eukaryotic, viral, antibody, orphan, or heavily engineered sequences, especially when sequence identity to training data is <20%.
When this API is not ideal: For tasks requiring accurate Tm regression, fine-grained mutational scanning, or structure-aware stability assessment, TemBERTure via this API is not sufficient (the published TemBERTureTm regression model is not exposed as a separate model_type). Use complementary stability regressors, ΔΔG tools, or structure models, optionally driven by embeddings from the encoder endpoint, and reserve the classifier for coarse thermophilic vs. non-thermophilic triage.
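Since the API rejects FASTA headers, whitespace, over-length sequences, and oversized batches, some client-side preparation is usually needed. The sketch below shows one possible approach under stated assumptions: `clean_sequence`, `split_windows`, and `make_batches` are hypothetical helpers, and fixed-size overlapping windows are a naive stand-in for the domain-aware segmentation mentioned above. How window-level scores should be combined is left open, because nothing in this reference establishes a validated method.

```python
def clean_sequence(text):
    """Drop FASTA header lines and all whitespace, returning one raw sequence."""
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    return "".join("".join(lines).split())


def split_windows(seq, size=512, overlap=64):
    """Split `seq` into windows of at most `size` residues sharing `overlap` residues.

    The overlap value is an arbitrary illustrative choice.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if len(seq) <= size:
        return [seq]
    step = size - overlap
    windows = []
    start = 0
    while True:
        windows.append(seq[start:start + size])
        if start + size >= len(seq):
            break
        start += step
    return windows


def make_batches(sequences, size=8):
    """Group sequences into request-sized batches (documented maximum: 8)."""
    return [sequences[i:i + size] for i in range(0, len(sequences), size)]


if __name__ == "__main__":
    fasta = ">example header\nMKTAYIAKQR QISFVKSHFS\nRQLEERLGLI\n"
    seq = clean_sequence(fasta)
    print(seq)
    long_seq = "A" * 1000
    print([len(w) for w in split_windows(long_seq)])
    print(len(make_batches(["A"] * 20)))
```

Where domain boundaries are known, cutting at those boundaries is likely to be more meaningful than fixed windows, since each piece then corresponds to a folded unit.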
How We Use It
TemBERTure Classifier enables rapid, sequence-only assessment of thermophilic class and is used as a decision layer across protein design and optimization workflows. Its class score (a model score, not a calibrated probability) helps triage variant libraries from generative models, gate temperature-aware regression ensembles, and guide assay design (for example, screening temperatures and host selection). Combined with structure-derived metrics (such as AlphaFold2 models, interface packing, Rosetta ΔΔG) and physicochemical features (charge, pI, hydrophobicity), the classifier accelerates downselection and focuses wet-lab effort on variants most likely to meet process temperature targets. Attention-derived residue saliency highlights mutational hot spots and stability motifs for targeted diversification, improving iteration speed in active-learning campaigns. Standardized, scalable APIs support high-throughput batch scoring and consistent feature logging into multi-objective ranking models used for enzyme design, antibody maturation, and broader developability risk reduction.
Upstream filter in generative loops to enforce thermostability constraints and raise hit quality before synthesis.
Routing signal for class-specific models (e.g. TemBERTureTm ensembles, solubility/aggregation predictors) and DOE planning at relevant temperature regimes.
Feature in multi-objective optimization alongside activity and expression, reducing experimental cycles to reach required Topt/Tm bands.
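To illustrate the multi-objective use above, the sketch below selects Pareto-optimal candidates across thermophilicity score, activity, and expression. This reference does not say how the objectives are actually combined, so the Pareto filter is a swapped-in generic technique. `pareto_front` is a hypothetical helper, and all values are made up.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return names of candidates that no other candidate dominates.

    `candidates` maps a name to a tuple of objectives, all to be maximized,
    e.g. (thermophilicity score, activity, expression).
    """
    return [
        name
        for name, objs in candidates.items()
        if not any(
            dominates(other, objs)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


if __name__ == "__main__":
    # Made-up objective values: (class score, relative activity, expression).
    variants = {
        "v1": (0.90, 0.50, 0.60),
        "v2": (0.80, 0.40, 0.50),  # worse than v1 on every objective
        "v3": (0.30, 0.90, 0.70),
    }
    print(pareto_front(variants))
```

A Pareto filter avoids choosing weights for incommensurable objectives. A weighted sum or a hard gate on the class score are equally plausible alternatives, depending on the campaign.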
References
Rodella, C., Lazaridi, S., & Lemmin, T. (2024). TemBERTure: advancing protein thermostability prediction with deep learning and attention mechanisms. Bioinformatics Advances, 4(1), vbae103.
