ESM-2 8M is a GPU-accelerated protein language model that generates sequence embeddings, masked-token predictions, attention maps, and residue–residue contact probabilities directly from amino acid sequences, without requiring multiple sequence alignments or templates. Trained with transformer-based masked language modeling on UniRef data, the 8M variant is the smallest and fastest member of the ESM-2 family, enabling high-throughput single-sequence inference. Typical applications include protein engineering, metagenomic annotation, and high-throughput structural biology workflows.
Predict¶
Perform masked amino acid prediction for input protein sequences with ESM-2 8M.
- POST /api/v3/esm2-8m/predict/¶
Predict endpoint for ESM-2 8M.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, optional) — Configuration parameters:
repr_layers (array of integers, default: [-1]) — Indices of representation layers to return embeddings from
include (array of strings, default: [“mean”]) — Types of outputs to include; allowed values: “mean”, “per_token”, “bos”, “contacts”, “logits”, “attentions”
items (array of objects, min: 1, max: 8) — Input sequences:
sequence (string, min length: 1, max length: 2048, required) — Protein sequence using standard amino acid codes, extended amino acid codes, or “-” character for gaps
Example request:
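The sketch below is a minimal request in Python using the requests library. The host name is an assumption (substitute your deployment's base URL); the path, headers, and field names follow the schema above.

```python
# Minimal predict request sketch (assumed host; path, headers, and fields
# follow the schema documented above).
import requests

BASE_URL = "https://biolm.ai"   # assumption -- replace with your deployment's host
API_KEY = "YOUR_API_KEY"

payload = {
    "params": {
        "repr_layers": [-1],    # final representation layer
        "include": ["logits"],  # masked-token logits
    },
    "items": [
        # One masked position; standard amino acid codes plus "<mask>"
        {"sequence": "MKTAYIAKQRQISFVK<mask>HFSRQLEERLGLIEVQ"},
    ],
}

response = requests.post(
    f"{BASE_URL}/api/v3/esm2-8m/predict/",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Token {API_KEY}",
    },
    json=payload,
)
response.raise_for_status()
prediction = response.json()
```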
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
sequence_index (int) — Index of the input sequence in the original request (zero-based)
embeddings (array of objects, optional) — Mean sequence embeddings per requested layer:
layer (int) — Layer index from ESM-2 model (range: -1 to total layers - 1)
embedding (array of floats, size: 320–1280) — Mean embedding vector for the sequence; size depends on model variant:
“8m”: 320 dimensions
“35m”: 480 dimensions
“150m”: 640 dimensions
“650m”: 1280 dimensions
bos_embeddings (array of objects, optional) — Beginning-of-sequence (BOS) embeddings per requested layer:
layer (int) — Layer index from ESM-2 model (range: -1 to total layers - 1)
embedding (array of floats, size: 320–1280) — BOS embedding vector; size depends on model variant:
“8m”: 320 dimensions
“35m”: 480 dimensions
“150m”: 640 dimensions
“650m”: 1280 dimensions
per_token_embeddings (array of objects, optional) — Per-residue token embeddings per requested layer:
layer (int) — Layer index from ESM-2 model (range: -1 to total layers - 1)
embeddings (array of arrays of floats, shape: [sequence_length, embedding_size]) — Embeddings for each residue token in sequence; embedding size depends on model variant:
“8m”: 320 dimensions
“35m”: 480 dimensions
“150m”: 640 dimensions
“650m”: 1280 dimensions
contacts (array of arrays of floats, optional, shape: [sequence_length, sequence_length]) — Predicted inter-residue contact probabilities (range: 0.0–1.0)
attentions (array of arrays of floats, optional, shape: [num_attention_heads, sequence_length, sequence_length]) — Attention weights from the model (range: 0.0–1.0); all listed model variants (“8m”, “35m”, “150m”, “650m”) use 20 attention heads
logits (array of arrays of floats, optional, shape: [sequence_length, vocab_size]) — Predicted logits per residue token for each vocabulary token (unbounded real values)
vocab_tokens (array of strings, optional, size: vocab_size) — Vocabulary tokens corresponding to logits; included only when logits are requested
results (array of objects) — For inputs containing masked positions, one result per input item, in the order requested:
logits (array of arrays of floats, shape: [num_masked_positions, vocab_size]) — Predicted logits for each masked position in input sequence (unbounded real values)
sequence_tokens (array of strings, size: sequence_length) — Tokens of the input sequence, including predicted tokens at masked positions
vocab_tokens (array of strings, size: vocab_size) — Vocabulary tokens corresponding to logits
Example response:
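A live payload should be consulted for the authoritative shape; under that caveat, the sketch below parses the masked-prediction fields documented above (logits, sequence_tokens, vocab_tokens), recovering the highest-scoring token at each masked position.

```python
# Parsing sketch for the predict response fields documented above.
# Assumes `prediction` is the decoded JSON from the request example.
import numpy as np

for item in prediction["results"]:
    logits = np.array(item["logits"])   # [num_masked_positions, vocab_size]
    vocab = item["vocab_tokens"]        # vocab_size token strings
    best_tokens = [vocab[i] for i in logits.argmax(axis=-1)]
    print("Predicted tokens at masked positions:", best_tokens)
    print("Completed sequence:", "".join(item["sequence_tokens"]))
```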
Encode¶
Generate embeddings for input protein sequences with ESM-2 8M.
- POST /api/v3/esm2-8m/encode/¶
Encode endpoint for ESM-2 8M.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, optional) — Configuration parameters:
repr_layers (array of integers, default: [-1]) — Indices of representation layers to output embeddings from
include (array of strings, default: [“mean”]) — Types of outputs to include; allowed values: “mean”, “per_token”, “bos”, “contacts”, “logits”, “attentions”
items (array of objects, min: 1, max: 8) — Input sequences:
sequence (string, min length: 1, max length: 2048, required) — Protein sequence using standard amino acid codes plus “-” character for gaps
Example request:
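As with the predict endpoint, the sketch below is illustrative: the host is an assumption, and the body follows the request schema above, asking for mean embeddings and contact maps.

```python
# Encode request sketch (assumed host; fields follow the schema above).
import requests

BASE_URL = "https://biolm.ai"   # assumption -- replace with your deployment's host
API_KEY = "YOUR_API_KEY"

payload = {
    "params": {
        "repr_layers": [-1],
        "include": ["mean", "contacts"],
    },
    "items": [
        {"sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"},
    ],
}

response = requests.post(
    f"{BASE_URL}/api/v3/esm2-8m/encode/",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Token {API_KEY}",
    },
    json=payload,
)
response.raise_for_status()
encoded = response.json()
```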
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
sequence_index (int) — Index of the input sequence in the original request (0-based)
embeddings (array of objects, optional) — Mean sequence embeddings per requested layer:
layer (int) — Layer index from ESM-2 model (-1 for final layer)
embedding (array of floats, size: 320, 480, 640, or 1280 depending on model size) — Mean embedding vector for the layer
bos_embeddings (array of objects, optional) — Beginning-of-sequence embeddings per requested layer:
layer (int) — Layer index from ESM-2 model (-1 for final layer)
embedding (array of floats, size: 320, 480, 640, or 1280 depending on model size) — Embedding vector for the beginning-of-sequence token
per_token_embeddings (array of objects, optional) — Per-token embeddings per requested layer:
layer (int) — Layer index from ESM-2 model (-1 for final layer)
embeddings (array of arrays of floats, shape: [sequence_length, embedding_size]) — Embedding vectors for each token in the sequence; embedding_size is 320, 480, 640, or 1280 depending on model size
contacts (array of arrays of floats, optional, shape: [sequence_length, sequence_length], range: 0.0–1.0) — Predicted inter-residue contact probabilities; symmetric matrix with values indicating probability of contact between residue pairs
attentions (array of arrays of floats, optional, shape: [sequence_length, sequence_length], range: 0.0–1.0) — Self-attention weights from the final layer, indicating attention between residue pairs
logits (array of arrays of floats, optional, shape: [sequence_length, vocab_size]) — Predicted logits for each token position; vocab_size is 33 (20 standard amino acids, plus non-standard residue codes and special tokens)
vocab_tokens (array of strings, optional, length: 33) — Vocabulary tokens corresponding to logits indices; included only when logits are requested
Example response:
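The sketch below reads the fields documented above from the encode response (mean embedding and contact map) and lists the most confident long-range contacts; the |i − j| ≥ 12 separation cutoff is an illustrative choice, not part of the API.

```python
# Parsing sketch for the encode response; assumes `encoded` is the decoded
# JSON from the request example above.
import numpy as np

result = encoded["results"][0]
mean_embedding = np.array(result["embeddings"][0]["embedding"])  # 320-d for "8m"
contacts = np.array(result["contacts"])                          # [L, L] in [0, 1]

# Rank long-range residue pairs by predicted contact probability.
L = contacts.shape[0]
pairs = [(i, j, contacts[i, j]) for i in range(L) for j in range(i + 12, L)]
for i, j, p in sorted(pairs, key=lambda t: -t[2])[:5]:
    print(f"residues {i + 1}-{j + 1}: contact probability {p:.2f}")
print("embedding dimension:", mean_embedding.shape[0])
```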
Performance¶
ESM-2 8M inference is optimized for GPU acceleration, typically deployed on NVIDIA T4 GPUs.
Typical inference latency is significantly lower than that of larger ESM-2 variants (35M, 150M, 650M); for example, the 8M model completes inference approximately 5–10x faster than the 150M variant on the same hardware.
Language-modeling accuracy of ESM-2 8M is lower than that of larger ESM-2 variants, reflected in higher validation perplexity:
Validation perplexity on UniRef50: 10.33 (8M) vs. 7.75 (150M) and 6.95 (650M).
Unsupervised contact prediction accuracy (long-range precision at L) is lower for ESM-2 8M (0.17) compared to ESM-2 150M (0.44) and ESM-2 650M (0.52).
ESM-2 8M offers faster inference but lower accuracy than BioLM’s larger ESM-2 models; it is suitable for rapid prototyping or tasks where speed is prioritized over precision.
ESM-2 8M is significantly faster than structural prediction models such as ESMFold or AlphaFold2, as it provides embeddings and sequence-level predictions without computationally expensive structure inference steps.
Input: Single amino acid sequence (up to 2048 residues); supports masked tokens (“<mask>”) for residue prediction tasks.
Output: Depending on request parameters, returns mean embeddings, per-token embeddings, predicted logits, attention maps, and residue-residue contact probabilities.
Applications¶
Rapid generation of structure-related predictions (residue–residue contact maps and sequence embeddings) directly from amino acid sequences, enabling biotech companies to quickly screen candidate proteins for stability and structural integrity without relying on time-consuming multiple sequence alignments (MSAs); ideal for high-throughput protein engineering pipelines, though accuracy may be lower for orphan proteins with no evolutionary homologs.
Initial structure-based filtering of designed protein libraries, allowing protein engineering teams to efficiently prioritize candidates by predicting structural feasibility and identifying potential misfolding early in the design cycle; particularly useful for synthetic biology companies working on novel proteins where experimental structural data is unavailable.
High-speed structural annotation of metagenomic protein sequences, providing biotech researchers with structural insights into novel proteins identified from environmental samples; valuable for bioprospecting efforts aimed at discovering new proteins or domains for industrial or therapeutic applications, although predictions for very large or complex proteins may be less reliable.
Identification of structural homologs for proteins lacking significant sequence similarity, enabling protein engineers to infer potential functions and guide experimental characterization of novel proteins; particularly useful for companies exploring novel protein scaffolds or domains, though functional predictions based solely on structural similarity should be experimentally validated.
Structure-based clustering and diversity analysis of large protein datasets, helping biotech companies select structurally diverse protein candidates to maximize coverage of functional space in enzyme or protein engineering projects; beneficial for directed evolution workflows, but limited in ability to accurately predict subtle functional differences among structurally similar proteins.
Limitations¶
Maximum Sequence Length: Input sequences are limited to 2048 amino acids; longer sequences must be truncated or split into smaller segments.
Batch Size: API requests support a maximum batch_size of 8 sequences per call; larger datasets must be processed in multiple batches (a batching sketch follows this list).
The 8m model variant provides lower accuracy for structure prediction and embeddings compared to larger ESM-2 models (e.g., 650m); consider larger models for more accurate predictions.
ESM-2 models rely solely on single-sequence inputs without multiple sequence alignments (MSAs), potentially reducing accuracy compared to MSA-based methods (e.g., AlphaFold2) for proteins with few evolutionary homologs.
Predictions for proteins with very low evolutionary depth or highly novel sequences (e.g., orphan proteins) may be less reliable; consider using MSA-based methods for such targets.
The ESM-2 API does not provide explicit confidence metrics (e.g., pLDDT scores) for structure predictions; users should independently assess prediction confidence, particularly for downstream applications requiring high reliability.
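Given the 2048-residue and 8-sequence-per-request limits above, client-side batching is straightforward. The helper below is a sketch, not part of the API; `post_batch` is a hypothetical wrapper standing in for the request code shown earlier.

```python
# Batching sketch for the limits above: at most 8 sequences per request,
# each at most 2048 residues.
MAX_BATCH = 8
MAX_LEN = 2048

def iter_batches(sequences, batch_size=MAX_BATCH):
    """Yield request-sized lists of items, skipping over-length sequences."""
    batch = []
    for seq in sequences:
        if len(seq) > MAX_LEN:
            continue  # or truncate/split upstream, if preferred
        batch.append({"sequence": seq})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# for items in iter_batches(my_sequences):
#     result = post_batch(items)  # hypothetical wrapper around the POST call
```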
How We Use It¶
The ESM-2 8M model enables rapid and cost-effective structure-related predictions (embeddings, contact maps, and masked-token likelihoods) directly from protein sequences, facilitating protein engineering and design workflows without reliance on external evolutionary databases or multiple sequence alignments. By integrating ESM-2 8M predictions into broader BioLM workflows, researchers can efficiently screen and prioritize protein variants based on predicted structural properties, accelerating design cycles and experimental validation. Practical outcomes include streamlined enzyme and antibody design, optimization, and maturation efforts, particularly when combined with downstream predictive models assessing biophysical properties, thermodynamic stability, or 3D structural metrics.
Enables efficient initial filtering of protein variants prior to advanced modeling or laboratory testing
Integrates seamlessly with downstream predictive algorithms for comprehensive protein evaluation
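As one concrete (and simplified) illustration of the downstream integration described above, mean embeddings returned by the encode endpoint can serve directly as features for a property model; the scikit-learn regressor and placeholder data here are assumptions, not BioLM components.

```python
# Sketch: use mean embeddings as features for a downstream property model.
# `embeddings` and `measured_values` stand in for encode-endpoint outputs
# and laboratory assay measurements, respectively.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

embeddings = np.random.rand(64, 320)   # placeholder for 64 encoded variants
measured_values = np.random.rand(64)   # placeholder assay measurements

model = Ridge(alpha=1.0)
scores = cross_val_score(model, embeddings, measured_values, cv=5, scoring="r2")
print("cross-validated R^2:", scores.mean())
```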
References¶
Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y., dos Santos Costa, A., Fazel-Zarandi, M., Sercu, T., Candido, S., & Rives, A. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637), 1123–1130.
