ProGen2 OAS is a 764M-parameter antibody-focused autoregressive protein language model trained on 554M redundancy-reduced sequences from the Observed Antibody Space (OAS) database. This API variant generates heavy-chain variable-region sequences conditioned on N-terminal germline-like VH framework contexts, using configurable temperature and nucleus (top-p) sampling and user-defined length limits up to 512 residues. It also returns per-sequence log-likelihood summaries (sum and mean) for zero-shot style fitness ranking and antibody library design workflows.
Generate¶
Generate up to 3 antibody variable heavy chain sequences conditioned on an OAS-style VH germline framework context.
- POST /api/v3/progen2-oas/generate/¶
Generate endpoint for ProGen2 OAS.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, required) — Generation parameters:
temperature (float, range: 0.0-8.0, default: 0.8) — Sampling temperature
top_p (float, range: 0.0-1.0, default: 0.9) — Nucleus sampling probability
num_samples (int, range: 1-3, default: 1) — Number of sequences generated per input item
max_length (int, range: 12-512, default: 128) — Maximum length of each generated sequence in tokens
items (array of objects, min: 1, max: 1) — Input items:
context (string, min length: 1, max length: 512, required) — Amino acid context sequence with unambiguous residue codes
Example request:
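A minimal sketch of the request body in Python, matching the parameter schema above. The `context` string is an illustrative human VH framework N-terminus, the host is a placeholder, and `YOUR_API_KEY` must be replaced with a real token:

```python
import json

# Illustrative request body for POST /api/v3/progen2-oas/generate/.
# The context below is an example human VH framework N-terminus,
# chosen for illustration only; the host and API key are placeholders.
payload = {
    "params": {
        "temperature": 0.8,  # 0.0-8.0
        "top_p": 0.9,        # 0.0-1.0
        "num_samples": 3,    # 1-3
        "max_length": 128,   # 12-512
    },
    "items": [
        {"context": "EVQLVESGGGLVQPGGSLRLSCAAS"},  # 1-512 unambiguous residues
    ],
}

headers = {
    "Content-Type": "application/json",
    "Authorization": "Token YOUR_API_KEY",
}

body = json.dumps(payload)
# e.g. requests.post("https://<host>/api/v3/progen2-oas/generate/",
#                    headers=headers, data=body)
```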
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of arrays) — One result per input item, in the order requested:
[i] (array of objects, length: num_samples) — Generated samples for the i-th input item:
sequence (string, length: 1–512) — Generated amino acid sequence (unambiguous amino acid alphabet)
ll_sum (float) — Sum of token log-likelihoods over sequence (natural logarithm units)
ll_mean (float) — Mean token log-likelihood over sequence (natural logarithm units)
Example response:
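An illustrative response shape in Python, matching the schema above; all sequence and score values are placeholders, not real model output:

```python
# Illustrative response for a single input item with num_samples=1.
# Values are placeholders, not real model output.
response = {
    "results": [  # one inner list per input item, in request order
        [         # num_samples generated samples for item 0
            {
                "sequence": "EVQLVESGGGLVQPGGSLRLSCAASGFTFS",
                "ll_sum": -24.6,   # summed per-token log-likelihood (nats)
                "ll_mean": -0.82,  # ll_sum averaged over scored tokens
            },
        ],
    ],
}

sample = response["results"][0][0]
# Assuming one scored token per residue, ll_mean ~= ll_sum / len(sequence):
n_tokens = round(sample["ll_sum"] / sample["ll_mean"])
```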
Performance¶
Model variants and parameter scales:
- progen2-oas: antibody-focused model aligned with the ProGen2-medium architecture (~764M parameters), trained on redundancy-reduced OAS; optimized for antibody-like generations and likelihood-based library scoring
- progen2-medium: ~764M-parameter universal protein model; strong trade-off between throughput and zero-shot fitness ranking on narrow mutational landscapes
- progen2-large and progen2-bfd90: ~2.7B-parameter universal models; lower perplexity on natural sequences and improved performance on some wide/low-homology fitness landscapes, with progen2-bfd90 generally strongest on out-of-distribution benchmarks
Hardware execution and decoding optimizations:
- progen2-oas runs on 2 vCPUs with 8 GB RAM; progen2-medium uses a single 8 GB T4-class GPU; progen2-large and progen2-bfd90 use a single 16 GB T4-class GPU, matching their memory footprints
- All variants use batched autoregressive decoding with cached key/value attention states and fused attention/MLP kernels, keeping per-token generation cost close to linear in sequence length for continuations up to 128 residues
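The effect of cached key/value decoding can be illustrated with a toy flop count (single attention head; the dimension is arbitrary and the constants are schematic, not measured): with a KV cache, step t attends one new query against t stored keys, so per-token attention cost grows linearly with prefix length instead of quadratically.

```python
def attention_flops_per_token(prefix_len: int, d_model: int, cached: bool) -> int:
    # With a KV cache, generating token t computes attention of the single
    # new query against t cached keys/values: O(t * d).
    # Without it, the full t x t attention is recomputed: O(t^2 * d).
    return prefix_len * d_model if cached else prefix_len**2 * d_model

d = 64  # toy head dimension
cost_cached = sum(attention_flops_per_token(t, d, cached=True) for t in range(1, 129))
cost_naive = sum(attention_flops_per_token(t, d, cached=False) for t in range(1, 129))
```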
Comparative speed and cost vs other BioLM sequence models:
- Compared with encoder-only scorers such as ESM-2 650M/3B, ProGen2 has higher per-token compute due to strict autoregression, but for ≤128-residue continuations the end-to-end latency is similar because it avoids repeated masked scoring of full sequences
- Compared with larger long-context models such as Evo 1.5 8k Base and Evo 2 1B Base, ProGen2 is typically faster per generated residue and cheaper for large variant-library scoring, thanks to its smaller parameter count, smaller memory footprint, and short (≤512-residue) contexts
Zero‑shot fitness and design behavior across sizes:
- On narrow DMS landscapes, ~764M-parameter models (progen2-medium) achieve the best average Spearman correlations (~0.50), matching or exceeding some larger encoder baselines
- On wide or low-homology landscapes (AAV, GFP, CM, GB1), larger variants (progen2-large, progen2-bfd90) better recover high-fitness tail variants, with clear gains on epistatic GB1 top-variant discovery
- For antibody-related tasks, progen2-oas improves developability-relevant statistics (aggregation, solubility) for generated libraries, while universal ProGen2 models generally achieve higher rank-ordering accuracy for global antibody properties such as expression quality and melting temperature
Applications¶
Generation of diverse, human-like antibody VH sequence libraries for discovery campaigns using PROGEN2-OAS, enabling pharma and biotech teams to go beyond naïve or immunized animal repertoires while still matching natural OAS-like sequence statistics; particularly valuable for building heavy-chain variable-domain libraries with realistic CDR length patterns and framework usage, but not a drop-in replacement for target-specific panning and downstream functional screening
In silico exploration of developability-relevant VH sequence neighborhoods (e.g., aggregation propensity, solubility, stability proxies) by generating batches of PROGEN2-OAS VH variants around a lead framework/CDR context and ranking them with external developability predictors, allowing therapeutic teams to de-risk liabilities (such as hydrophobic CDR patches) before cell-line development; useful for improving expression and solubility, but not guaranteed to preserve antigen binding without experimental validation
Zero-shot prioritization of VH variants in affinity maturation or library-mining campaigns by using PROGEN2-OAS log-likelihood scores (ll_sum or ll_mean) to rank candidate CDR and framework mutations, helping teams triage large mutational sets down to tractable panels for wet-lab screening; particularly useful when DMS data are limited, though the model does not see antigen context and therefore cannot replace structure-based or binding-assay-driven design
Rapid generation of species- and isotype-adapted VH variants (e.g., humanization-like workflows) by conditioning PROGEN2-OAS on tailored N-terminal framework prompts reflecting species and chain type, enabling CROs and biotech companies to steer variable domains toward human-like repertoires while maintaining key CDR motifs; effective for moving sequences toward human-like space, but final liabilities and immunogenicity still require orthogonal in silico and experimental assessment
Construction of realistic synthetic VH benchmarking sets for internal analytics and ML tooling by sampling large, non-redundant repertoires from PROGEN2-OAS that mirror OAS diversity, giving bioinformatics and data science teams high-quality test beds for annotation, clustering, paratope prediction, and sequence-analytics pipelines without relying solely on proprietary or patient-derived datasets
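The zero-shot triage workflow described above reduces to a simple rank-and-cut over model scores. A minimal sketch, assuming samples shaped like the response schema (the `ll_mean` values and truncated sequences here are placeholders, not real outputs):

```python
# Hypothetical scored variants; in practice these come from pooled
# generate-endpoint responses (ll_mean values are placeholders).
scored = [
    {"sequence": "variant_A", "ll_mean": -0.91},
    {"sequence": "variant_B", "ll_mean": -0.67},
    {"sequence": "variant_C", "ll_mean": -1.24},
]

# Higher (less negative) mean log-likelihood = more natural under the model.
# Rank candidates and keep a tractable panel for wet-lab screening.
panel = sorted(scored, key=lambda s: s["ll_mean"], reverse=True)[:2]
```

Note that `ll_mean` only supports relative ranking within one model and parameter set; it is not a calibrated affinity or fitness prediction.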
Limitations¶
- Maximum sequence length: Input context must be between 1 and 512 amino acids (min_length=1, max_length=512). Longer sequences are rejected; truncate or window sequences client-side before calling the generate endpoint. Generated sequence outputs are also capped by max_length (ge=12, le=512), so you cannot autoregress beyond position 512 in a single request.
- Batch size and sampling limits: Each generate request can include at most 1 item in items (max_items=1). Each item can request at most 3 generated sequences via num_samples (ge=1, le=3). Large libraries must be created via many (possibly parallel) API calls; this is not a "millions of sequences per call" service.
- Generation controls and scores: temperature (0.0–8.0) and top_p (0.0–1.0) only control the diversity of amino acids in the output sequence. They do not enforce binding, stability, developability, or other biophysical constraints. ll_sum and ll_mean are log-likelihood scores under the ProGen2-OAS language model and are useful only for relative ranking within the same model_type="oas" and identical params, not as calibrated fitness, affinity, or liability predictions.
- Antibody- and data-specific bias: model_type="oas" is trained solely on OAS antibody variable fragments and reproduces that distribution, including artifacts such as frequent N-terminal truncations in the training data. It is not a general-purpose protein generator and is unsuitable for non-antibody proteins, non-Ig folds, or highly engineered scaffolds far outside immune-repertoire space.
- Use cases where ProGen2 OAS is not optimal: This API exposes a single-chain, sequence-only causal language model. It does not return embeddings, structures, paired heavy–light designs, or multi-chain complexes. It is not the right choice when you need structure prediction, sequence embeddings for clustering or visualization, conditioning on antigen or backbone structure, or strict control over global properties (e.g., solubility, aggregation) without downstream filters or models.
- Scientific and zero-shot limitations: Although ProGen2 variants can correlate with fitness on some benchmarks, ll_mean from the oas model does not reliably predict experimental fitness, affinity, or developability, especially for out-of-distribution antibodies, aggressive mutational scans, or antigen-specific optimization. For tasks such as final candidate selection, epistatic landscape exploration, or joint structure–sequence design, plan to combine this API with specialized models and experimental screening.
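Given the one-item, three-sample cap per request, building a larger library means planning many calls. A small sketch of the arithmetic (the function name is ours, not part of the API):

```python
import math

def plan_calls(library_size: int, num_samples: int = 3) -> int:
    # Each request carries exactly one context item and at most 3 samples,
    # so a library of N sequences needs ceil(N / num_samples) requests.
    return math.ceil(library_size / num_samples)

# e.g. a 1,000-sequence library needs 334 requests at num_samples=3;
# issue them in parallel, within your rate limits, and pool the results.
n_calls = plan_calls(1000)
```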
How We Use It¶
ProGen2 OAS enables rapid in silico exploration of antibody VH sequence space as a generative engine within closed-loop discovery and optimization campaigns. Standardized APIs generate up to three VH variants per framework context, which teams then route through structure prediction (e.g., IgFold, AlphaFold-based models), developability scoring (aggregation, solubility, charge, liabilities), and zero-shot fitness predictors to filter, rank, and select sequences for synthesis. By wiring ProGen2 OAS into assay data pipelines, LIMS, and sequence analytics tools, experimental data from each round can be fed back into campaign-specific models to reduce experimental burden and systematically move toward antibodies with improved binding, stability, and manufacturability profiles.
- Integrated with other BioLM generative and predictive models to co-optimize sequence novelty, fitness, and developability for antibody leads.
- Used in multi-round, lab-in-the-loop campaigns where batched ProGen2 OAS generations are programmatically scored, triaged, and advanced to synthesis through standardized API-driven pipelines.
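One round of such a loop can be sketched as generate, score, triage. The scoring function below is a hypothetical stand-in for a real external developability predictor (here just a crude hydrophobicity heuristic, purely for illustration), and the candidate sequences are placeholders:

```python
# Hypothetical round of a lab-in-the-loop campaign: score generated
# candidates with an external predictor (stubbed here) and advance the top-k.
def external_developability_score(seq: str) -> float:
    # Stand-in for a real predictor (aggregation, solubility, etc.);
    # here: fraction of non-hydrophobic residues, purely illustrative.
    hydrophobic = set("AVILMFWY")
    return sum(aa not in hydrophobic for aa in seq) / len(seq)

candidates = ["EVQLVESGGGLVQPGGS", "EVQLWWWWVESGGGLVQ", "EVQLVESGGKLVQPGGS"]
ranked = sorted(candidates, key=external_developability_score, reverse=True)
advance_to_synthesis = ranked[:2]
```

In a real campaign the triaged panel would be synthesized, assayed, and the results fed back into campaign-specific models, as described above.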
References¶
Nijkamp, E., Ruffolo, J., Weinstein, E. N., Naik, N., & Madani, A. (2023). ProGen2: Exploring the Boundaries of Protein Language Models. arXiv:2206.13517. https://doi.org/10.48550/arXiv.2206.13517
