ESM-2 650M is a protein language model trained on evolutionary-scale sequence data, enabling generalization beyond natural protein families for de novo protein sequence generation and structure-conditioned design tasks. Trained on sequences alone, the model implicitly encodes structural patterns and has generated experimentally validated protein sequences (67% overall success rate). BioLM provides GPU-accelerated API access for high-throughput protein design, sequence optimization, and novel protein structure exploration workflows.

Predict

Predict masked amino acid tokens in the input sequences

python
from biolmai import BioLM
response = BioLM(
    entity="esm2-650m",
    action="predict",
    params={},
    items=[
      {
        "sequence": "MKT<mask>IALSYIFCLVFA"
      },
      {
        "sequence": "VLSP<mask>KAAW"
      }
    ]
)
print(response)
bash
curl -X POST https://biolm.ai/api/v3/esm2-650m/predict/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "items": [
    {
      "sequence": "MKT<mask>IALSYIFCLVFA"
    },
    {
      "sequence": "VLSP<mask>KAAW"
    }
  ]
}'
python
import requests

url = "https://biolm.ai/api/v3/esm2-650m/predict/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
    "items": [
        {
            "sequence": "MKT<mask>IALSYIFCLVFA"
        },
        {
            "sequence": "VLSP<mask>KAAW"
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
r
library(httr)

url <- "https://biolm.ai/api/v3/esm2-650m/predict/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  items = list(
    list(
      sequence = "MKT<mask>IALSYIFCLVFA"
    ),
    list(
      sequence = "VLSP<mask>KAAW"
    )
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))
POST /api/v3/esm2-650m/predict/

Predict endpoint for ESM-2 650M.

Request Headers:

  • Authorization (string, required) — Token YOUR_API_KEY

  • Content-Type (string, required) — application/json

Request

  • params (object, optional) — Configuration parameters:

    • repr_layers (array of integers, default: [-1]) — Layer indices for representation extraction

    • include (array of strings, default: [“mean”]) — Output types to include

      Allowed values:
      • mean

      • per_token

      • bos

      • contacts

      • logits

      • attentions

  • items (array of objects, min: 1, max: 8) — Input sequences with masked tokens:

    • sequence (string, min length: 1, max length: 2048, required) — Protein sequence using the extended amino acid alphabet plus the “-” character, containing one or more “<mask>” tokens

Example request:

http
POST /api/v3/esm2-650m/predict/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "items": [
    {
      "sequence": "MKT<mask>IALSYIFCLVFA"
    },
    {
      "sequence": "VLSP<mask>KAAW"
    }
  ]
}
Status Codes:

  • 200 OK — Successful request; one result is returned per input item

Response

  • results (array of objects) — One result per input item, in the order requested:

    • logits (array of arrays of floats, shape: [num_masked_positions, vocab_size]) — Predicted logits for masked positions in the sequence

    • sequence_tokens (array of strings, size: sequence_length) — Tokens of the input sequence, including “<mask>” tokens

    • vocab_tokens (array of strings, size: vocab_size) — Vocabulary tokens corresponding to logits indices

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "logits": [
        [
          0.26346755027770996,
          -1.0843636989593506,
          "... (truncated for documentation)"
        ],
        [
          0.22804699838161469,
          -0.11050359904766083,
          "... (truncated for documentation)"
        ],
        "... (truncated for documentation)"
      ],
      "sequence_tokens": [
        "M",
        "K",
        "... (truncated for documentation)"
      ],
      "vocab_tokens": [
        "L",
        "A",
        "... (truncated for documentation)"
      ]
    },
    {
      "logits": [
        [
          -0.4318189024925232,
          -0.31044864654541016,
          "... (truncated for documentation)"
        ],
        [
          3.3938562870025635,
          0.003626987338066101,
          "... (truncated for documentation)"
        ],
        "... (truncated for documentation)"
      ],
      "sequence_tokens": [
        "V",
        "L",
        "... (truncated for documentation)"
      ],
      "vocab_tokens": [
        "L",
        "A",
        "... (truncated for documentation)"
      ]
    }
  ]
}
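
The returned logits are unnormalized scores over the vocabulary for each masked position; applying a softmax across the vocabulary axis recovers residue probabilities. Below is a minimal sketch, assuming the predict response JSON above has been parsed into a dict named response (for the requests example, response = requests.post(...).json()); the helper top_residues and the top-3 cutoff are illustrative, not part of the API.

python
import math

def top_residues(result, k=3):
    """Return the k highest-probability vocabulary tokens per masked position."""
    vocab = result["vocab_tokens"]
    predictions = []
    for row in result["logits"]:  # one row of logits per <mask> in the input
        # Numerically stable softmax over the vocabulary axis
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        ranked = sorted(zip(vocab, (e / total for e in exps)),
                        key=lambda t: t[1], reverse=True)
        predictions.append(ranked[:k])
    return predictions

for i, result in enumerate(response["results"]):
    print(f"Sequence {i}:")
    for pos, ranked in enumerate(top_residues(result)):
        print(f"  mask {pos}: " + ", ".join(f"{tok} ({p:.2f})" for tok, p in ranked))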

Encode

Generate embeddings for input protein sequences

python
from biolmai import BioLM
response = BioLM(
    entity="esm2-650m",
    action="encode",
    params={
      "repr_layers": [
        -1,
        -2
      ],
      "include": [
        "mean",
        "per_token"
      ]
    },
    items=[
      {
        "sequence": "MKTIIALSYIFCLVFAD"
      },
      {
        "sequence": "VLSPADKTNVKAAW"
      }
    ]
)
print(response)
bash
curl -X POST https://biolm.ai/api/v3/esm2-650m/encode/ \
  -H "Authorization: Token YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "params": {
    "repr_layers": [
      -1,
      -2
    ],
    "include": [
      "mean",
      "per_token"
    ]
  },
  "items": [
    {
      "sequence": "MKTIIALSYIFCLVFAD"
    },
    {
      "sequence": "VLSPADKTNVKAAW"
    }
  ]
}'
python
import requests

url = "https://biolm.ai/api/v3/esm2-650m/encode/"
headers = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {
    "params": {
        "repr_layers": [
            -1,
            -2
        ],
        "include": [
            "mean",
            "per_token"
        ]
    },
    "items": [
        {
            "sequence": "MKTIIALSYIFCLVFAD"
        },
        {
            "sequence": "VLSPADKTNVKAAW"
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
r
library(httr)

url <- "https://biolm.ai/api/v3/esm2-650m/encode/"
headers <- c("Authorization" = "Token YOUR_API_KEY", "Content-Type" = "application/json")
body <- list(
  params = list(
    repr_layers = list(
      -1,
      -2
    ),
    include = list(
      "mean",
      "per_token"
    )
  ),
  items = list(
    list(
      sequence = "MKTIIALSYIFCLVFAD"
    ),
    list(
      sequence = "VLSPADKTNVKAAW"
    )
  )
)

res <- POST(url, add_headers(.headers = headers), body = body, encode = "json")
print(content(res))
POST /api/v3/esm2-650m/encode/

Encode endpoint for ESM-2 650M.

Request Headers:

  • Authorization (string, required) — Token YOUR_API_KEY

  • Content-Type (string, required) — application/json

Request

  • params (object, optional) — Configuration parameters:

    • repr_layers (array of integers, default: [-1]) — Representation layers to include from the model

    • include (array of strings, default: [“mean”]) — Output types to include:

      • “mean” — Mean embedding

      • “per_token” — Per-token embeddings

      • “bos” — Beginning-of-sequence embedding

      • “contacts” — Predicted inter-residue contact probabilities

      • “logits” — Predicted per-token logits

      • “attentions” — Self-attention weights

  • items (array of objects, min: 1, max: 8) — Input sequences:

    • sequence (string, min length: 1, max length: 2048, required) — Protein sequence using extended amino acid codes and “-” character

Example request:

http
POST /api/v3/esm2-650m/encode/ HTTP/1.1
Host: biolm.ai
Authorization: Token YOUR_API_KEY
Content-Type: application/json

{
  "params": {
    "repr_layers": [
      -1,
      -2
    ],
    "include": [
      "mean",
      "per_token"
    ]
  },
  "items": [
    {
      "sequence": "MKTIIALSYIFCLVFAD"
    },
    {
      "sequence": "VLSPADKTNVKAAW"
    }
  ]
}
Status Codes:

  • 200 OK — Successful request; one result is returned per input item

Response

  • results (array of objects) — One result per input item, in the order requested:

    • sequence_index (int) — Index of the input sequence in the request items list

    • embeddings (array of objects, optional) — Mean embeddings per requested layer:

      • layer (int) — Layer index from ESM2 model

      • embedding (array of floats, size: 1280) — Mean embedding vector for the sequence

    • bos_embeddings (array of objects, optional) — Beginning-of-sequence embeddings per requested layer:

      • layer (int) — Layer index from ESM2 model

      • embedding (array of floats, size: 1280) — Embedding vector for the beginning-of-sequence token

    • per_token_embeddings (array of objects, optional) — Per-token embeddings per requested layer:

      • layer (int) — Layer index from ESM2 model

      • embeddings (array of arrays of floats, shape: [sequence_length, 1280]) — Embedding vectors for each token position

    • contacts (array of arrays of floats, optional, shape: [sequence_length, sequence_length], range: 0.0-1.0) — Predicted inter-residue contact probabilities

    • attentions (array of arrays of floats, optional, shape: [sequence_length, sequence_length], range: 0.0-1.0) — Self-attention weights between tokens

    • logits (array of arrays of floats, optional, shape: [sequence_length, vocab_size]) — Predicted per-token logits for each vocabulary token

    • vocab_tokens (array of strings, optional, size: vocab_size) — Vocabulary tokens corresponding to logits indices

Example response:

http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "results": [
    {
      "embeddings": [
        {
          "layer": 32,
          "embedding": [
            -0.07151295244693756,
            8.46182632446289,
            "... (truncated for documentation)"
          ]
        },
        {
          "layer": 33,
          "embedding": [
            0.026107627898454666,
            0.22847698628902435,
            "... (truncated for documentation)"
          ]
        }
      ],
      "per_token_embeddings": [
        {
          "layer": 32,
          "embeddings": [
            [
              3.1025631427764893,
              8.30557632446289,
              "... (truncated for documentation)"
            ],
            [
              -12.64371395111084,
              18.03836441040039,
              "... (truncated for documentation)"
            ],
            "... (truncated for documentation)"
          ]
        },
        {
          "layer": 33,
          "embeddings": [
            [
              0.0552901066839695,
              0.2175598442554474,
              "... (truncated for documentation)"
            ],
            [
              -0.10134288668632507,
              0.3057705760002136,
              "... (truncated for documentation)"
            ],
            "... (truncated for documentation)"
          ]
        }
      ]
    },
    {
      "embeddings": [
        {
          "layer": 32,
          "embedding": [
            -4.4823503494262695,
            6.506104469299316,
            "... (truncated for documentation)"
          ]
        },
        {
          "layer": 33,
          "embedding": [
            -0.04350338131189346,
            0.07114817947149277,
            "... (truncated for documentation)"
          ]
        }
      ],
      "per_token_embeddings": [
        {
          "layer": 32,
          "embeddings": [
            [
              -2.563758373260498,
              -1.9410854578018188,
              "... (truncated for documentation)"
            ],
            [
              11.576005935668945,
              0.6234054565429688,
              "... (truncated for documentation)"
            ],
            "... (truncated for documentation)"
          ]
        },
        {
          "layer": 33,
          "embeddings": [
            [
              -0.10343319922685623,
              0.03292176127433777,
              "... (truncated for documentation)"
            ],
            [
              0.11083731055259705,
              0.031194249168038368,
              "... (truncated for documentation)"
            ],
            "... (truncated for documentation)"
          ]
        }
      ]
    }
  ]
}
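
Mean embeddings provide a fixed-length, 1280-dimensional representation per sequence that can be compared directly, for example when clustering or deduplicating designs. The sketch below computes the cosine similarity between the two inputs from their final-layer (layer 33) mean embeddings; it assumes the encode response JSON above has been parsed into a dict named response, and numpy is a client-side convenience rather than an API requirement.

python
import numpy as np

def mean_embedding(result, layer=33):
    """Extract the mean embedding vector for a given layer from one result."""
    for entry in result["embeddings"]:
        if entry["layer"] == layer:
            return np.asarray(entry["embedding"])
    raise KeyError(f"layer {layer} not found in result")

vec_a = mean_embedding(response["results"][0])
vec_b = mean_embedding(response["results"][1])

# Cosine similarity between the two 1280-dimensional sequence representations
cosine = float(vec_a @ vec_b / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))
print(f"cosine similarity: {cosine:.3f}")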

Performance

  • ESM-2 650M inference is accelerated on NVIDIA Tesla T4 GPUs, providing efficient processing for embedding and prediction tasks.

  • Typical inference latency is approximately 1-3 seconds per batch, depending on sequence length and requested embedding types.

  • Predictive accuracy and representation quality substantially exceed those of the smaller ESM-2 variants (8M, 35M, 150M), improving precision in downstream tasks such as structure prediction, protein engineering, and functional annotation.

  • Compared to the smaller ESM-2 150M model, the 650M variant achieves approximately 15-20% higher accuracy in predicting inter-residue contacts and functional annotations, at the cost of increased inference latency (approximately 1.5x slower).

  • ESM-2 650M provides robust embeddings for structure-informed workflows; its contact predictions do not match dedicated structure predictors such as ESMFold or AlphaFold2 in accuracy, but they are significantly faster and more cost-effective to compute.

  • Embedding outputs (mean, per-token, BOS), logits, attentions, and predicted inter-residue contacts are available, with minimal additional latency when requesting multiple output types simultaneously.

  • Contact map predictions from ESM-2 650M reach a precision@L of ~0.57 against experimentally determined protein structures, significantly outperforming smaller ESM-2 models (~0.49 for 150M, ~0.35 for 35M) and providing practical utility for rapid structure-informed filtering and ranking of designed sequences (a minimal precision@L sketch follows this list).

  • Embedding dimensionality is 1280 per residue, providing rich representations that enhance downstream predictive performance compared to lower-dimensional models.

  • ESM-2 650M embeddings consistently outperform smaller ESM variants in downstream tasks such as zero-shot functional prediction, variant effect prediction, and sequence clustering, making it the recommended choice for high-performance sequence representation tasks.
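
For context, precision@L is the fraction of the top-L scoring residue pairs (L = sequence length) that are true contacts in the experimental structure. The sketch below is a minimal implementation, assuming a symmetric contact-probability matrix such as the contacts output of the encode endpoint and a boolean true-contact matrix derived from a solved structure; the minimum sequence separation of 6 is a common evaluation convention, not an API parameter.

python
import numpy as np

def precision_at_L(pred, truth, min_sep=6):
    """Fraction of the top-L predicted contacts that are true contacts.

    pred:  (L, L) symmetric matrix of contact probabilities
    truth: (L, L) boolean matrix of true contacts
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    n = pred.shape[0]
    # Candidate pairs above the diagonal with sufficient sequence separation
    pairs = [(i, j) for i in range(n) for j in range(i + min_sep, n)]
    # Rank pairs by predicted probability and keep the top n (= top L)
    ranked = sorted(pairs, key=lambda ij: pred[ij], reverse=True)[:n]
    hits = sum(bool(truth[ij]) for ij in ranked)
    return hits / n

Contact maps are obtained by adding "contacts" to the include list on the encode endpoint.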

Applications

  • Generation of novel protein sequences for therapeutic protein engineering, enabling the design of stable, soluble, and functionally optimized proteins by leveraging ESM-2 650M’s learned sequence-to-structure relationships; useful for companies engineering cytokines or growth factors, but not optimal for antibody or nanobody design due to lack of domain-specific training.

  • Fixed-backbone protein design for creating stable protein scaffolds, allowing biotech companies to reliably engineer proteins with desired structural motifs and binding pockets; valuable for developing novel binding proteins or biosensors, though not recommended for highly flexible or intrinsically disordered protein targets.

  • Unconstrained protein generation for exploring novel protein folds and topologies, enabling biotech researchers to identify unique structural motifs and scaffolds beyond natural protein families; beneficial for creating innovative protein therapeutics or biomaterials, but less suitable for applications requiring strict structural constraints or known functional domains.

  • Sequence-based filtering and ranking of candidate protein designs, providing biotech teams with a rapid computational assessment of protein stability and solubility prior to experimental validation; useful for prioritizing protein variants in high-throughput protein engineering pipelines, though not a replacement for rigorous lab-based biophysical characterization.

  • Identification and engineering of structural motifs such as helix capping or hydrogen-bond networks, enabling precise control over protein stability and folding properties; valuable for designing robust protein scaffolds or fusion proteins, but not optimal for predicting functional effects of mutations without additional experimental validation.

Limitations

  • Maximum Sequence Length: Input sequences must not exceed 2048 amino acids; longer sequences will need to be truncated or split into smaller segments.

  • Batch Size: The API supports a maximum batch size of 8 sequences per request; larger datasets must be processed in multiple batches (see the batching sketch after this list).

  • The ESM-2 650M model is trained solely on natural protein sequences; while it generalizes well to many de novo proteins, performance may degrade for highly unnatural or extensively engineered sequences.

  • ESM-2 models do not explicitly model three-dimensional protein structures; structural predictions are inferred indirectly via attention maps (contacts) and may not achieve the accuracy of dedicated structure-prediction models (e.g., AlphaFold2, ESMFold).

  • The model architecture does not inherently enforce structural constraints, meaning generated sequences or embeddings might not always correspond to stable or experimentally viable proteins.

  • For tasks requiring explicit sequence embeddings (e.g., clustering, visualization), ensure the include parameter is set appropriately (mean, per_token, or bos); note that embeddings from causal language models (CausalLMs) are not available in this API.
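
Because each request accepts at most 8 items, larger datasets must be chunked client-side. The sketch below follows the requests pattern shown earlier; retry and error handling are omitted, and encode_all is an illustrative helper rather than part of the biolmai client.

python
import requests

URL = "https://biolm.ai/api/v3/esm2-650m/encode/"
HEADERS = {
    "Authorization": "Token YOUR_API_KEY",
    "Content-Type": "application/json",
}
MAX_BATCH = 8  # documented per-request item limit

def encode_all(sequences, params=None):
    """Encode any number of sequences by batching requests of MAX_BATCH items."""
    results = []
    for start in range(0, len(sequences), MAX_BATCH):
        batch = sequences[start:start + MAX_BATCH]
        payload = {"items": [{"sequence": s} for s in batch]}
        if params:
            payload["params"] = params
        resp = requests.post(URL, headers=HEADERS, json=payload)
        resp.raise_for_status()
        results.extend(resp.json()["results"])
    return results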

How We Use It

BioLM integrates ESM-2 650M into protein design workflows to accelerate generation and optimization of novel protein sequences, enabling researchers to rapidly explore sequence space beyond naturally occurring proteins. Through scalable API access, users can seamlessly incorporate ESM-2 650M into predictive modeling and generative design pipelines, supporting tasks such as fixed-backbone sequence optimization and unconstrained de novo protein generation. This integration facilitates iterative cycles of design, filtering, and ranking in multi-round protein engineering projects, and complements predictive tools such as AlphaFold, structure-derived metrics, and biophysical property analyses.

  • Accelerates discovery of novel, experimentally viable protein designs

  • Integrates effectively with predictive modeling and structure-based optimization tools
