Automated labeling of images to train machine learning

US11322256B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11322256-B2
Application number	US-201816205224-A
Country	US
Kind code	B2
Filing date	Nov 30, 2018
Priority date	Nov 30, 2018
Publication date	May 3, 2022
Grant date	May 3, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, computer system, and a computer program product for automatic labeling to train a machine learning algorithm is provided. The present invention may include labeling a medical image with at least one finding from a corresponding medical report. The present invention may include determining a localization information from the labeled medical image. The present invention may include training the machine learning algorithm with the determined localization information. The present invention may include detecting at least one candidate in a test medical image. The present invention may include generating a discrepancy list between the at least one detected candidate in the test medical image and at least one human-reported finding in a corresponding test medical report. The present invention may include, in response to determining that the generated discrepancy list is above a threshold, retraining the trained machine learning algorithm until the generated discrepancy list is below the threshold.
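The evaluate-and-retrain loop in the last two sentences of the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how a discrepancy list is compared to a threshold, so the sketch assumes the comparison is on the total number of mismatched findings, and it stops as soon as that count is no longer above the threshold. `Finding`, `discrepancy_list`, `retrain_until_acceptable`, the toy detector, and the `max_rounds` safety cap are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Finding:
    kind: str       # e.g. "nodule"
    location: str   # e.g. "left upper lobe"


# Hypothetical detector interface: image id -> detected candidates.
Detector = Callable[[str], List[Finding]]
TestSet = List[Tuple[str, List[Finding]]]  # (image id, human-reported findings)


def discrepancy_list(detected: List[Finding], reported: List[Finding]) -> List[Finding]:
    """Findings that appear in exactly one of the two sources."""
    mismatched = set(detected) ^ set(reported)
    return sorted(mismatched, key=lambda f: (f.kind, f.location))


def count_discrepancies(detector: Detector, test_set: TestSet) -> int:
    return sum(len(discrepancy_list(detector(image_id), reported))
               for image_id, reported in test_set)


def retrain_until_acceptable(detector: Detector,
                             retrain: Callable[[Detector], Detector],
                             test_set: TestSet,
                             threshold: int,
                             max_rounds: int = 10) -> Tuple[Detector, int]:
    """Retrain while the discrepancy count is above the threshold (assumed metric)."""
    rounds = 0
    while count_discrepancies(detector, test_set) > threshold and rounds < max_rounds:
        detector = retrain(detector)
        rounds += 1
    return detector, rounds


# Toy data and a toy "model" that only detects the finding kinds it knows about.
nodule = Finding("nodule", "left upper lobe")
effusion = Finding("effusion", "right lower lobe")
test_set = [("img-1", [nodule]), ("img-2", [effusion])]


def make_detector(known_kinds):
    lookup = dict(test_set)
    return lambda image_id: [f for f in lookup[image_id] if f.kind in known_kinds]


def toy_retrain(_old_detector):
    # Stand-in for a real training step: the new detector knows both kinds.
    return make_detector({"nodule", "effusion"})


final_detector, rounds = retrain_until_acceptable(
    make_detector(set()), toy_retrain, test_set, threshold=0)
print(rounds, count_discrepancies(final_detector, test_set))
```

In the toy run the initial detector finds nothing, so both reported findings are discrepancies; one retraining round brings the count to zero and the loop stops.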

First claim

Opening claim text (preview).

What is claimed is:

1. A method for automatic labeling to train a machine learning algorithm, the method comprising: detecting at least one first finding in a medical report and at least one first candidate in a corresponding medical image; interpreting a geometric description of an anatomical location of the detected at least one first finding in the medical report; identifying, using an association algorithm, at least one true finding from the detected at least one first candidate in the corresponding medical image based on the interpreted geometric description of the anatomical location of the detected at least one finding in the medical report; locating, in a sub-region of the corresponding medical image, the detected at least one first candidate based on the interpreted geometric description of the anatomical location of the detected at least one first finding in the medical report; electronically marking the located at least one first candidate in the sub-region of the corresponding medical image; generating a ground truth label by labeling, in a natural language, the electronically marked at least one first candidate in the sub-region of the corresponding medical image with the detected at least one first finding in the medical report; training the machine learning algorithm with the generated ground truth label; detecting, using the trained machine learning algorithm, at least one second candidate in a test medical image, wherein the at least one detected second candidate in the test medical image is associated with predicting at least one second finding from a corresponding test medical report; generating, using the trained machine learning algorithm, a discrepancy list between the at least one detected second candidate in the test medical image and the at least one second finding in the corresponding test medical report; and in response to determining that the generated discrepancy list is above a threshold, retraining the trained machine learning algorithm until the generated discrepancy list is below the threshold.

2. The method of claim 1, further comprising: determining at least one true finding from the detected at least one first candidate in the corresponding medical image, by association with the detected at least one first finding from the medical report.

3. The method of claim 1, further comprising: determining a ground truth in the detected at least one first finding from the medical report; and generating the ground truth label for the identified at least one true finding in the corresponding medical image based on the determined ground truth from the medical report.

4. The method of claim 3, wherein generating the ground truth label for the identified at least one true finding in the corresponding medical image further comprises: electronically marking, using a labeling component, at least one pixel indicating a contour of the identified at least one true finding in the corresponding medical image.

5. The method of claim 3, wherein generating the ground truth label for the identified at least one true finding in the corresponding medical image further comprises: providing an electronic label associated with at least one medical report data, wherein the at least one medical report data is selected from the group consisting of: at least one radiology report data and at least one pathology report data.

6. A computer system for automatic labeling to train a machine learning algorithm, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage media, and program instructions stored on at least one of the one or more computer-readable tangible storage media for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: detecting at least one first finding in a medical report and at least one first candidate in a corresponding medical image; interpreting a geometric description of an anatomical location of the detected at least one first finding in the medical report; identifying, using an association algorithm, at least one true finding from the detected at least one first candidate in the corresponding medical image based on the interpreted geometric description of the anatomical location of the detected at least one finding in the medical report; locating, in a sub-region of the corresponding medical image, the detected at least one first candidate based on the interpreted geometric description of the anatomical location of the detected at least one first finding in the medical report; electronically marking the located at least one first candidate in the sub-region of the corresponding medical image; generating a ground truth label by labeling, in a natural language, the electronically marked at least one first candidate in the sub-region of the corresponding medical image with the detected at least one first finding in the medical report; training the machine learning algorithm with the generated ground truth label; detecting, using the trained machine learning algorithm, at least one second candidate in a test medical image, wherein the at least one detected second candidate in the test medical image is associated with predicting at least one second finding from a corresponding test medical report; generating, using the trained machine learning algorithm, a discrepancy list between the at least one detected second candidate in the test medical image and the at least one second finding in the corresponding test medical report; and in response to determining that the generated discrepancy list is above a threshold, retraining the trained machine learning algorithm until the generated discrepancy list is below the threshold.

7. The computer system of claim 6, further comprising: determining at least one true finding from the detected at least one first candidate in the corresponding medical image, by association with the detected at least one first finding from the medical report.

8. The computer system of claim 6, further comprising: determining a ground truth in the detected at least one first finding from the medical report; and generating the ground truth label for the identified at least one true finding in the corresponding medical image based on the determined ground truth from the medical report.

9. The computer system of claim 8, wherein generating the ground truth label for the identified at least one true finding in the corresponding medical image further comprises: electronically marking, using a labeling component, at least one pixel indicating a contour of the identified at least one true finding in the corresponding medical image.

10. The computer system of claim 8, wherein generating the ground truth label for the identified at least one true finding in the corresponding medical image further comprises: providing an electronic label associated with at least one medical report data, wherein the at least one medical report data is selected from the group consisting of: at least one radiology report data and at least one pathology report data.

11. A computer program product for automatic labeling to train a machine learning algorithm, comprising: one or more computer-readable tangible storage media and program instructions stored on at least one of the one or more computer-readable tangible storage media, the program instructions executable by a processor to cause the processor to perform a method comprising: detecting at least one first finding in a medical report and at least one first candidate in a corresponding medical image; interpr
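
The labeling steps recited in claim 1 (interpreting a location description, associating a report finding with an image candidate in the matching sub-region, and attaching a natural-language label to the marked candidate) can be pictured with a small sketch. Everything here is a simplifying assumption for illustration only: the claim does not specify the association algorithm, so the sketch uses a plain "candidate center falls inside the named region" test, and the `REGIONS` table uses image quadrants rather than real anatomy. `ReportFinding`, `Label`, `associate`, and `make_labels` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in image coordinates normalized to [0, 1].
Box = Tuple[float, float, float, float]

# Hypothetical mapping from a location phrase to an image sub-region.
# A real system would need an anatomical model; quadrants are a toy stand-in.
REGIONS = {
    "upper left": (0.0, 0.0, 0.5, 0.5),
    "upper right": (0.5, 0.0, 1.0, 0.5),
    "lower left": (0.0, 0.5, 0.5, 1.0),
    "lower right": (0.5, 0.5, 1.0, 1.0),
}


@dataclass
class ReportFinding:
    text: str       # finding as written in the report, e.g. "8 mm nodule"
    location: str   # location phrase extracted from the report


@dataclass
class Label:
    box: Box        # the marked candidate
    text: str       # natural-language label attached to it


def center(box: Box) -> Tuple[float, float]:
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def contains(region: Box, point: Tuple[float, float]) -> bool:
    x0, y0, x1, y1 = region
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1


def associate(finding: ReportFinding, candidates: List[Box]) -> Optional[Box]:
    """Pick the first candidate whose center lies in the region the report names."""
    region = REGIONS.get(finding.location)
    if region is None:
        return None  # location phrase not understood: no label is produced
    for candidate in candidates:
        if contains(region, center(candidate)):
            return candidate
    return None


def make_labels(findings: List[ReportFinding], candidates: List[Box]) -> List[Label]:
    labels = []
    for finding in findings:
        box = associate(finding, candidates)
        if box is not None:
            labels.append(Label(box, f"{finding.text} in the {finding.location} region"))
    return labels


# Two candidates from a (hypothetical) image detector, one finding from a report.
candidates = [(0.10, 0.15, 0.20, 0.25), (0.70, 0.60, 0.85, 0.80)]
findings = [ReportFinding("8 mm nodule", "upper left")]
labels = make_labels(findings, candidates)
print(labels)
```

Only the first candidate is labeled, because its center lies in the quadrant the report names; the second candidate has no matching report finding and stays unlabeled.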

Assignees

Inventors

Classifications

G06F40/205
Parsing · CPC title
G06F40/30 · Primary
Semantic analysis · CPC title
G16H15/00
ICT specially adapted for medical reports, e.g. generation or transmission thereof · CPC title
G06F40/284
Lexical analysis, e.g. tokenisation or collocates · CPC title
G16H30/40
for processing medical images, e.g. editing · CPC title

Patent family

Related publications grouped by family.

View patent family 70850373

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11322256B2 cover?: A method, computer system, and a computer program product for automatic labeling to train a machine learning algorithm is provided. The present invention may include labeling a medical image with at least one finding from a corresponding medical report. The present invention may include determining a localization information from the labeled medical image. The present invention may include trai…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06F40/30. Mapped technology areas include Physics.
When was this patent published?: Publication date May 3, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Automated extraction of structured labels from medical text using deep convolutional networks and use thereof to train a computer vision model

A system and method for automated labeling and annotating unstructured medical datasets

Overlay Of Findings On Image Data

Knowledge-based automatic image segmentation
