US12112475B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12112475-B2 |
| Application number | US-202217659914-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 20, 2022 |
| Priority date | Nov 11, 2021 |
| Publication date | Oct 8, 2024 |
| Grant date | Oct 8, 2024 |
Official abstract text for this publication.
Provided is a method and system for predicting tumor mutation burden (TMB) in triple negative breast cancer (TNBC) based on nuclear scores and histopathological whole slide images (WSIs). The method includes the following steps: first, screening the histopathological WSIs of TNBC; calculating a TMB value of each patient according to gene mutation of each patient with TNBC, and dividing the TMB values into two groups with high and low TMB according to a set threshold; dividing the histopathological WSIs of TNBC into patches of a set size; screening a certain number of patches with high nuclear scores according to a nuclear score function; then building a convolutional neural network (CNN) classification model, and stochastically initializing parameters in the CNN classification model; and finally, putting the screened patches into the built CNN classification model for training, so as to automatically predict high or low TMB with the histopathological WSIs of TNBC.
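The labelling step in the abstract (a TMB value per patient, then a split into high and low groups at a set threshold) can be sketched as follows. This is a minimal illustration, not the patentees' implementation: it reads the TMB value as the count of nonsynonymous somatic mutations divided by the coding-region length in megabases, and it uses the cohort median as the threshold, as claims 2 and 3 recite. The function names, patient IDs, mutation counts, and the 30 Mb coding-region length are all made-up assumptions.

```python
from statistics import median


def tmb_per_mb(nonsynonymous_count, coding_region_bp):
    """Nonsynonymous somatic mutations per megabase of protein coding region."""
    if coding_region_bp <= 0:
        raise ValueError("coding_region_bp must be positive")
    return nonsynonymous_count / (coding_region_bp / 1_000_000)


def label_by_median(tmb_values):
    """Assign TMB-H to patients strictly above the cohort median M, else TMB-L."""
    m = median(tmb_values.values())
    return {pid: ("TMB-H" if v > m else "TMB-L") for pid, v in tmb_values.items()}


# Hypothetical cohort: patient ID -> number of nonsynonymous somatic mutations.
counts = {"P1": 30, "P2": 60, "P3": 120, "P4": 45}
CODING_BP = 30_000_000  # assumed coding-region length, for illustration only

tmb = {pid: tmb_per_mb(n, CODING_BP) for pid, n in counts.items()}
labels = label_by_median(tmb)
print(labels)
```

Every patch cut from a patient's slide would then inherit that patient's label, which is what makes the supervision slide-level rather than patch-level.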
Opening claim text (preview).
What is claimed is:

1. A method for predicting tumor mutation burden (TMB) in triple-negative breast cancer (TNBC) based on nuclear scores and histopathological whole slide images (WSIs), comprising the following steps: S1: screening the histopathological WSIs of TNBC from histopathological images of breast cancer; S2: calculating a TMB value of each patient according to gene mutation of each patient with TNBC, and dividing the TMB values into two groups with high and low TMB according to a set threshold, denoted as TMB-H group and TMB-L group respectively, as a label corresponding to the WSI of each patient; S3: dividing the WSIs into patches of a set size and performing preprocessing; S4: screening patches with the nuclear scores meeting a threshold from the preprocessed patches according to a nuclear score function; S5: building a convolutional neural network (CNN) classification model, and stochastically initializing parameters in the CNN classification model; S6: standardizing color of the patches with the nuclear scores meeting the threshold, and inputting the patches after color standardization and corresponding labels into the CNN classification model to train a TMB classifier, wherein each patch belongs to the corresponding WSI, and the label corresponding to the patch is the label of the WSI corresponding to the patch; S7: predicting the TMB in the TNBC using the trained TMB classifier.

2. The method for predicting TMB in TNBC according to claim 1, wherein a process of calculating a TMB value of each patient according to gene mutation of each patient with TNBC in step S2 comprises: dividing tumors with nonsynonymous mutations in a somatic protein coding region of the patient by a total length of the protein coding region to obtain the TMB value of each patient, in mutations/Mb, to characterize a density of distribution of nonsynonymous mutations in the protein coding region.

3. The method for predicting TMB in TNBC according to claim 1, wherein when the TMB values are divided into two groups with high and low TMB in step S2, a median division method is used, and the threshold is recorded as M, and when the TMB value of the patient is greater than M, the patient is in the TMB-H group, otherwise, the patient is in the TMB-L group.

4. The method for predicting TMB in TNBC according to claim 1, wherein step S3 comprises: first, selecting the number of layers of WSI, and saving the images of the set size successively based on this layer, so as to cut the image into patches; second, removing blank and irregular patches from the cut patches, wherein a method for removing the blank patches is: calculating a pixel mean of each patch, and when the pixel mean of the patch is less than the set threshold, retaining the patch, otherwise discarding the patch; a method for removing the irregular patches is: calculating whether each patch has a length and width equal to a set patch size, and if the length and width are equal to the set patch size, retaining the patch, otherwise discarding the patch.

5. The method for predicting TMB in TNBC according to claim 1, wherein step S4 comprises: S4.1: converting an RGB image to Holistically-Nested Edge Detection (HED) space and extracting a value of a Hue (H) channel; S4.2: generating a preliminary mask and a mask for cleaning with the value of the H channel respectively, wherein the preliminary mask is obtained through multi-level image threshold division on the H channel, and the mask for cleaning is obtained by multi-level image threshold division and morphological transformation operations on the H channel; S4.3: subtracting the preliminary mask from the mask for cleaning to obtain a nucleus mask; S4.4: calculating a nuclear ratio $N_t$ of each patch, wherein the nuclear ratio is a ratio of the number of non-zero pixels in the mask of the nucleus to the total number of pixels in the mask; S4.5: generating a mask of a tissue area; S4.6: calculating a tissue ratio $T_t$, wherein the tissue ratio is a ratio of the number of non-zero pixels in the mask of the tissue area to the total number of pixels in the entire mask; S4.7: calculating the nuclear score $s_t$ of each patch through the nuclear score function based on the nuclear ratio $N_t$ and the tissue ratio $T_t$ of each patch; S4.8: sorting the obtained nuclear scores, and screening the patches with the nuclear scores meeting the threshold.

6. The method for predicting TMB in TNBC according to claim 5, wherein the nuclear score function in step S4.7 is:

$$s_t = N_t \cdot \tanh(T_t), \qquad 0 \le s_t < 1$$

wherein $s_t$ represents the nuclear score of the t-th patch, $N_t$ represents the nuclear ratio on patch t, $T_t$ represents the tissue ratio on patch t, and patch t denotes the t-th patch.

7. The method for predicting TMB in TNBC according to claim 1, wherein the CNN classification model in step S5 uses a CNN having an eighteen-layer architecture, as a feature extraction module, and modifies output of the last fully connected layer to 2.

8. The method for predicting TMB in TNBC according to claim 1, wherein in step S6, an optimal value of the model is found according to the loss function and the gradient descent method during training, the cross-entropy loss function is used as the loss function, and the adaptive momentum estimation algorithm (Adam) is used as the gradient descent method.

9. A system for predicting tumor mutation burden (TMB) in triple-negative breast cancer (TNBC) based on nuclear scores and histopathological whole slide images (WSIs), comprising a processor and a memory storing program codes, wherein the processor performs the stored program codes to: screen the histopathological WSIs of TNBC from histopathological images of breast cancer; calculate a TMB value of each patient according to gene mutation of each patient with TNBC, and divide the TMB values into two groups with high and low TMB according to a set threshold, denoted as TMB-H group and TMB-L group respectively, as a label corresponding to each WSI; cut the WSIs into patches of a set size and perform preprocessing, and screen patches with the nuclear scores meeting a threshold from the preprocessed patches according to a nuclear score function; build a CNN classification model, stochastically initialize parameters in the CNN classification model, and standardize color of the patches with the nuclear scores meeting the threshold, and input the patches after color standardization and corresponding labels into the CNN classification model to train a TMB classifier, wherein each patch belongs to the corresponding WSI, and the label corresponding to the patch is the label of the WSI corresponding to the patch; predict the TMB in the TNBC using the trained TMB classifier.

10. The system for predicting TMB in TNBC according to claim 9, wherein the processor further performs the stored program codes to generate a visual report of prediction results and the corresponding WSI.
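The patch clean-up recited in claim 4 (discard blank patches by pixel mean, discard patches whose height or width differs from the set size) can be sketched in plain Python. This is only an illustration under stated assumptions: patches are modelled as nested lists of grayscale values in 0..255, and the names `pixel_mean`, `is_regular`, `filter_patches`, the 4-pixel patch size, and the threshold of 220 are hypothetical. The claims do not give a threshold value.

```python
def pixel_mean(patch):
    """Mean intensity over all pixels of a patch given as a list of rows."""
    total = sum(sum(row) for row in patch)
    count = sum(len(row) for row in patch)
    return total / count


def is_regular(patch, size):
    """True when the patch is exactly size x size (edge tiles are often smaller)."""
    return len(patch) == size and all(len(row) == size for row in patch)


def filter_patches(patches, size, blank_threshold):
    """Keep regular patches whose pixel mean is below the blank threshold."""
    kept = []
    for patch in patches:
        if not is_regular(patch, size):
            continue  # irregular patch: wrong height or width
        if pixel_mean(patch) < blank_threshold:
            kept.append(patch)  # darker than background, so likely tissue
    return kept


# Toy patches: stained tissue, near-white background, and a truncated edge tile.
tissue = [[120] * 4 for _ in range(4)]
blank = [[250] * 4 for _ in range(4)]
edge = [[120] * 3 for _ in range(4)]

kept = filter_patches([tissue, blank, edge], size=4, blank_threshold=220)
print(len(kept))
```

Checking the size before the mean means a degenerate empty tile is rejected without ever being averaged.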
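The nuclear score of claims 5 and 6 is simple enough to compute directly once the nucleus and tissue masks exist. The sketch below assumes the masks are already available as nested lists of 0/1 values (mask generation in steps S4.1 to S4.3 and S4.5 is not shown), and it reads "meeting the threshold" as "greater than or equal to the threshold", which is an assumption. All function names and the toy masks are hypothetical.

```python
import math


def nonzero_ratio(mask):
    """Fraction of non-zero pixels in a mask given as a list of rows."""
    total = sum(len(row) for row in mask)
    nonzero = sum(1 for row in mask for value in row if value != 0)
    return nonzero / total


def nuclear_score(nucleus_mask, tissue_mask):
    """s_t = N_t * tanh(T_t), with N_t and T_t the nuclear and tissue ratios."""
    n_t = nonzero_ratio(nucleus_mask)
    t_t = nonzero_ratio(tissue_mask)
    return n_t * math.tanh(t_t)


def select_patches(scores, threshold):
    """Sort patch IDs by score, highest first, and keep those at or above threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [patch_id for patch_id, score in ranked if score >= threshold]


# Toy 2x2 masks for two hypothetical patches "a" and "b".
scores = {
    "a": nuclear_score([[1, 1], [0, 0]], [[1, 1], [1, 1]]),  # N=0.5,  T=1.0
    "b": nuclear_score([[1, 0], [0, 0]], [[1, 1], [0, 0]]),  # N=0.25, T=0.5
}
selected = select_patches(scores, threshold=0.2)
print(selected)
```

Because both ratios lie in [0, 1] and tanh(1) is about 0.76, the score can never reach 1, which is consistent with the bound stated in claim 6.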
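For readers unfamiliar with the training objective named in claim 8, the standard textbook forms of the two-class cross-entropy loss and the Adam update are written out below. The claims do not state batch size, learning rate, or the Adam coefficients, so the symbols $B$, $\eta$, $\beta_1$, $\beta_2$, and $\epsilon$ are generic placeholders rather than values taken from the patent. The step index is written $k$ to avoid a clash with the patch index $t$ used in claim 6.

```latex
% Softmax over the two output logits z_{i,0}, z_{i,1} of patch i
p_{i,c} = \frac{\exp(z_{i,c})}{\exp(z_{i,0}) + \exp(z_{i,1})}, \qquad c \in \{0, 1\}

% Cross-entropy over a batch of B patches with one-hot labels y_{i,c}
\mathcal{L} = -\frac{1}{B} \sum_{i=1}^{B} \sum_{c=0}^{1} y_{i,c} \log p_{i,c}

% Adam update at step k, with gradient g_k = \nabla_\theta \mathcal{L}(\theta_{k-1})
m_k = \beta_1 m_{k-1} + (1 - \beta_1)\, g_k
v_k = \beta_2 v_{k-1} + (1 - \beta_2)\, g_k^{2}
\hat{m}_k = \frac{m_k}{1 - \beta_1^{k}}, \qquad \hat{v}_k = \frac{v_k}{1 - \beta_2^{k}}
\theta_k = \theta_{k-1} - \eta\, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \epsilon}
```

Here the one-hot label $y_{i,c}$ of each patch is the TMB-H or TMB-L label of the slide it was cut from, as steps S2 and S6 describe.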
Tumor; Lesion · CPC title
Mammography; Breast · CPC title
Cell structures in vitro; Tissue sections in vitro · CPC title
Artificial neural networks [ANN] · CPC title
Training; Learning · CPC title