What technology area does this patent fall under?

Primary CPC classification G16B40/20. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 17, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Methods and systems for improved major histocompatibility complex (MHC)-peptide binding prediction of neoepitopes using a recurrent neural network encoder and attention weighting

US11557375B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11557375-B2
Application number	US-201917059157-A
Country	US
Kind code	B2
Filing date	Aug 14, 2019
Priority date	Aug 20, 2018
Publication date	Jan 17, 2023
Grant date	Jan 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are provided for predicting MHC-peptide binding affinity. A plurality of training peptide sequences is obtained, and a neural network model is trained to predict MHC-peptide binding affinity using the training peptide sequences. An encoder of the neural network model comprising an RNN is configured to process an input training peptide sequence to generate a fixed-dimension encoding output by applying a final hidden state of the RNN at intermediate state outputs of the RNN to generate attention weighted outputs, and linearly combining the attention weighted outputs. A fully connected layer following the encoder is configured to process the fixed-dimension encoding output to generate an MHC-peptide binding affinity prediction output. A computing device is configured to use the trained neural network to predict MHC-peptide binding affinity for a test peptide sequence.
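The encoder described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the patented implementation: the plain tanh RNN cell, the hidden size of 8, the random untrained weights, the softmax normalization, the sigmoid output, and all function names are assumptions chosen for illustration. The example peptides are only sample strings of different lengths. What the sketch shows is the shape of the computation: each intermediate hidden state is scored against the final hidden state, the scores become attention weights, and the weighted states are combined linearly into an encoding whose size does not depend on peptide length.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HIDDEN = 8  # hypothetical hidden size, not taken from the patent


def one_hot(residue):
    return [1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_states(peptide, w_in, w_rec):
    """Run a plain tanh RNN and return the hidden state after every residue."""
    h = [0.0] * HIDDEN
    states = []
    for residue in peptide:
        x_part = matvec(w_in, one_hot(residue))
        h_part = matvec(w_rec, h)
        h = [math.tanh(a + b) for a, b in zip(x_part, h_part)]
        states.append(h)
    return states


def attention_encode(states):
    """Score each state against the final state, normalize, combine linearly."""
    final = states[-1]
    scores = [dot(final, h) for h in states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    encoding = [sum(w * h[i] for w, h in zip(weights, states))
                for i in range(HIDDEN)]
    return encoding, weights


def predict(peptide, params):
    """Encoder followed by one fully connected unit with a sigmoid output."""
    states = rnn_states(peptide, params["w_in"], params["w_rec"])
    encoding, weights = attention_encode(states)
    logit = dot(params["w_out"], encoding) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit)), encoding, weights


rng = random.Random(0)
params = {  # untrained, randomly initialized parameters
    "w_in": random_matrix(HIDDEN, len(AMINO_ACIDS), rng),
    "w_rec": random_matrix(HIDDEN, HIDDEN, rng),
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)],
    "b_out": 0.0,
}

for peptide in ("SIINFEKL", "GILGFVFTL", "YLEPGPVTAV"):
    score, encoding, weights = predict(peptide, params)
    # encoding length stays at HIDDEN; there is one attention weight per residue
    print(peptide, len(encoding), len(weights), round(score, 3))
```

Because the weights are untrained, the printed scores carry no biological meaning. The point is only that peptides of 8, 9, and 10 residues all yield an encoding of the same dimension, which is what lets a fixed-size fully connected layer follow the encoder.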

First claim

Opening claim text (preview).

We claim:

1. A computing system-implemented method of predicting major histocompatibility complex (MHC)-peptide binding affinity, the method comprising: obtaining a plurality of training peptide sequences of variable length; training, by one or more computing devices, a recurrent neural network (RNN) model comprising at least one fully connected layer to predict MHC-peptide binding affinity with respect to an MHC allele sequence, wherein training the RNN model comprises, for each training peptide sequence of the plurality of training peptide sequences of variable length, iteratively: inputting the training peptide sequence into the RNN model; generating a fixed-dimension encoding output by processing the training peptide sequence, wherein the processing comprises applying a final hidden state of the RNN model at intermediate states of the RNN model and attention weighting to one or more positions of the training peptide sequence; generating an MHC-peptide binding affinity prediction output between the training peptide sequence and the MHC allele sequence by processing the fixed-dimension encoding output using the at least one fully connected layer of the RNN model; determining a loss factor by comparing the attention weighting to a known MHC-peptide binding affinity value corresponding to the training peptide sequence; and updating at least one parameter of a set of parameters of the RNN model based on the loss factor; inputting a test peptide sequence into the trained RNN model; and generating, by the trained RNN model, an MHC-peptide binding affinity prediction output for the test peptide sequence with respect to the MHC allele sequence.

2. The method of claim 1, wherein applying the final hidden state at an intermediate state of the RNN model comprises taking a dot product, a weighted product, or other function, of the final hidden state and the intermediate state.

3. The method of claim 1, further comprising applying weights learned through the training of the RNN model to the final hidden state prior to applying the final hidden state at intermediate states of the RNN model.

4. The method of claim 1, further comprising concatenating the final hidden state with a final hidden state of an encoder of a second neural network model prior to applying the final hidden state at intermediate states of the RNN model.

5. The method of claim 4, wherein the second neural network model is configured based on the set of parameters of the trained RNN model to predict MHC-peptide binding affinity for an MHC allele input.

6. The method of claim 1, wherein the fixed-dimension encoding output generated by the RNN model comprises one or more positions each corresponding to an amino acid position of a training peptide sequence inputted into the RNN model.

7. The method of claim 6, wherein each of the one or more positions of the fixed-dimension encoding output is a single value.

8. The method of claim 1, wherein the RNN model comprises one of a Long Short Term Memory (LSTM) RNN and Gated Recurrent Unit (GRU) RNN or variant thereof.

9. The method of claim 1, wherein the RNN model comprises a bidirectional RNN model.

10. The method of claim 9, wherein the fixed-dimension encoding output is generated by concatenating outputs of the bidirectional RNN model.

11. The method of claim 1, wherein the plurality of training peptide sequences of variable length comprises two or more sequence lengths.

12. The method of claim 1, wherein the plurality of training peptide sequences is one of one-hot, BLOSUM, PAM, or learned embedding encoded.

13. The method of claim 1, wherein each training peptide sequence of the plurality of training peptide sequences is between 6-20 amino acids in length.

14. The method of claim 1, wherein each training peptide sequence of the plurality of training peptide sequences is between 10-30 amino acids in length.

15. The method of claim 1, wherein each training peptide sequence of the plurality of training peptide sequences is a positive MHC-peptide binding example.

16. The method of claim 1, wherein the test peptide sequence is between 6-20 amino acids in length.

17. The method of claim 1, wherein the test peptide sequence is between 10-30 amino acids in length.

18. The method of claim 1, wherein the test peptide sequence has a sequence length different from a sequence length of at least one of the plurality of training peptide sequences.

19. The method of claim 1, wherein the test peptide sequence is one of one-hot, BLOSUM, PAM, or learned embedding encoded.

20. The method of claim 1, wherein generating MHC-peptide binding affinity prediction output for the test peptide sequence comprises generating a single prediction value.

21. The method of claim 20, wherein the single prediction value relates to a likelihood of activating a T-cell response to a tumor.

22. The method of claim 1, wherein the at least one fully connected layer comprises two fully connected layers.

23. The method of claim 1, wherein the at least one fully connected layer comprises one of a deep convolutional neural network, a residual neural network, a densely connected convolutional neural network, a fully convolutional neural network, or an RNN.

24. The method of claim 1, wherein generating, by the trained RNN model, the MHC-peptide binding affinity prediction output for the test peptide sequence comprises: generating a fixed-dimension encoding output by processing the test peptide sequence, wherein the processing comprises applying a final hidden state of the RNN model at intermediate states of the RNN model and attention weighting to one or more positions of the test peptide sequence, and generating the MHC-peptide binding affinity prediction output by processing the fixed-dimension encoding output using the at least one fully connected layer of the trained RNN model.

25. A computer program product embedded in a non-transitory computer-readable medium comprising instructions executable by a computer processor for predicting major histocompatibility complex (MHC)-peptide binding affinity, which, when executed by a processor, cause the processor to perform one or more steps comprising: obtaining a plurality of training peptide sequences of variable length; training a recurrent neural network (RNN) model comprising at least one fully connected layer to predict MHC-peptide binding affinity with respect to an MHC allele sequence, wherein training the RNN model comprises, for each training peptide sequence of the plurality of training peptide sequences of variable length, iteratively: inputting the training peptide sequence into the RNN model; generating a fixed-dimension encoding output by processing the training peptide sequence, wherein the processing comprises applying a final hidden state of the RNN model at intermediate states of the RNN model and attention weighting to one or more positions of the training peptide sequence; generating an MHC-peptide binding affinity prediction output between the training peptide sequence and the MHC allele sequence by processing the fixed-dimension encoding output using the at least one fully connected layer of the RNN model; determining a loss factor by comparing the attention weighting to a known MHC-peptide binding affinity value corresponding to the training peptide sequence; and updating at least one parameter of a set of parameters of the RNN model based on the loss factor;
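Claims 2 and 3 name alternative ways of applying the final hidden state at the intermediate states. The sketch below is one possible reading, not the patent's own code: a plain dot product, and a "weighted" variant in which a matrix is applied to the final hidden state before the dot product. The function names, the toy hidden states, the softmax normalization, and the matrix values (hand-picked stand-ins for weights that would normally be learned) are all assumptions for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_scores(final_state, states):
    """Dot-product option: score_t = final . h_t"""
    return [dot(final_state, h) for h in states]


def weighted_scores(final_state, states, weight_matrix):
    """Weighted option: transform the final state first, score_t = (W final) . h_t"""
    query = matvec(weight_matrix, final_state)
    return [dot(query, h) for h in states]


# Toy hidden states for a 3-residue peptide with hidden size 2 (made-up values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
final_state = states[-1]

identity = [[1.0, 0.0], [0.0, 1.0]]      # reduces to the plain dot product
first_dim_only = [[2.0, 0.0], [0.0, 0.0]]  # hypothetical stand-in for learned weights

print(dot_scores(final_state, states))                       # [1.0, 1.0, 2.0]
print(weighted_scores(final_state, states, identity))        # [1.0, 1.0, 2.0]
print(weighted_scores(final_state, states, first_dim_only))  # [2.0, 0.0, 2.0]
print(softmax(weighted_scores(final_state, states, first_dim_only)))
```

With the identity matrix the weighted variant reproduces the dot-product scores, which makes the relationship between the two options easy to check. A different matrix shifts attention toward positions whose hidden states align with the transformed final state.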

Assignees

NantOmics, LLC

Inventors

Classifications

CPC code	CPC title
G16B40/20 (Primary)	Supervised data analysis
G06N3/08	Learning methods
G16B15/30	Drug targeting using structural data; Docking or binding prediction
G16B40/00 (Primary)	ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
G16B5/20	Probabilistic models

Patent family

Related publications grouped by family.

View patent family 69643983

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11557375B2 cover?: Techniques are provided for predicting MHC-peptide binding affinity. A plurality of training peptide sequences is obtained, and a neural network model is trained to predict MHC-peptide binding affinity using the training peptide sequences. An encoder of the neural network model comprising an RNN is configured to process an input training peptide sequence to generate a fixed-dimension encoding o…
Who is the assignee on this patent?: NantOmics, LLC