Predict Protein Contact Maps with ESM-2¶
Suppose you need a fast, accurate contact map for visualization or as input to an ML model. BioLM lets you predict contact map data in seconds through a secure API, without provisioning GPUs or installing software and dependencies.
# Import the BioLM SDK (time is used to measure the request below)
import time
from biolmai import BioLM

Define Endpoint Params¶
# Let's use the sequence from the paper
# https://www.biorxiv.org/content/10.1101/2021.07.09.450648v2.full.pdf
SEQ = "ASKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKRHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"
params = {
    "include": [
        "mean",
        "contacts",
        "logits",
        "attentions",
    ]
}
print("Sequence length: {}".format(len(SEQ)))

Make API Request¶
Now send a secure REST API request to the BioLM API, which runs the prediction on a GPU.
start = time.time()
result = BioLM(entity="esm2-650m", action="encode", type="sequence", items=[SEQ], params=params)
end = time.time()
print(f"ESM2 contact map generation took {end - start:.4f} seconds.")

The result contains the following keys:
- contacts, a len(seq) x len(seq) matrix
- logits from the final hidden state, a vector of length len(seq)
- attentions, or the attention map, which is len(seq) x n_layers
- mean_representations, the protein embedding, a vector of length 1280
- sequence_index, the position of the sequence in the order it was POSTed
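To check these shapes against a real response, a small helper can walk the nested lists and report their dimensions. This is only a minimal sketch: `nested_shape` and `summarize` are hypothetical helpers, not part of the BioLM SDK, and `mock_result` is a stand-in whose layout is assumed from the key list above rather than taken from an actual response.

```python
def nested_shape(value):
    """Return the dimensions of a regular nested list, e.g. [[1, 2], [3, 4]] gives (2, 2)."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if len(value) == 0:
            break
        value = value[0]  # assumes every row has the same length as the first
    return tuple(shape)


def summarize(result):
    """Map each key of a response dictionary to the shape of its value."""
    return {key: nested_shape(val) for key, val in result.items()}


# Mock response for a 3-residue sequence (illustrative values only).
mock_result = {
    "contacts": [[0.9, 0.1, 0.0], [0.1, 0.9, 0.2], [0.0, 0.2, 0.9]],
    "mean_representations": [0.0] * 1280,
    "sequence_index": 0,
}

for key, shape in summarize(mock_result).items():
    print(f"{key}: {shape}")
```

Scalars such as the sequence index report an empty shape, which makes it easy to tell per-sequence metadata apart from per-residue arrays.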
contact_map = result['contacts']
# Straight from the model, each dimension would be len(SEQ) + 2 because of the
# start/end tokens, but the endpoint strips those for us
nrow = len(contact_map)
ncol = len(contact_map[0])
print(f'({nrow}, {ncol})')

from matplotlib import pyplot as plt

plt.figure(figsize=(9, 9))
plt.xlabel('Residue')
plt.ylabel('Residue')
plt.title('Contact Map of Example Protein Using ESM2')
plt.imshow(contact_map, cmap='viridis', interpolation='nearest')
plt.show()
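Beyond the heatmap, the same matrix can be reduced to a ranked list of likely contacts for downstream analysis. The sketch below assumes the values are contact probabilities between 0 and 1. `top_contacts` is a hypothetical helper, and its defaults (a minimum sequence separation of 6 and a threshold of 0.5) are illustrative choices, not values prescribed by BioLM or ESM-2.

```python
def top_contacts(contact_map, min_separation=6, threshold=0.5):
    """List (i, j, probability) for residue pairs predicted to be in contact.

    Only the upper triangle is scanned, since a contact map is symmetric.
    Pairs closer than `min_separation` positions along the chain are skipped
    because neighbouring residues are trivially close in space.
    """
    pairs = []
    n = len(contact_map)
    for i in range(n):
        for j in range(i + min_separation, n):
            p = contact_map[i][j]
            if p >= threshold:
                pairs.append((i, j, p))
    # Highest-confidence contacts first
    pairs.sort(key=lambda t: t[2], reverse=True)
    return pairs


# Toy 8 x 8 symmetric map (illustrative values only).
n = 8
toy = [[0.0] * n for _ in range(n)]
toy[0][7] = toy[7][0] = 0.92
toy[1][7] = toy[7][1] = 0.61
toy[0][1] = toy[1][0] = 0.99  # adjacent residues, filtered by min_separation
toy[0][6] = toy[6][0] = 0.30  # below the threshold

for i, j, p in top_contacts(toy):
    print(f"residues {i} and {j}: {p:.2f}")
```

With a real response, passing `contact_map` in place of `toy` should work the same way, provided it is a square list of lists as printed earlier.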