Image Processing Model Training Method and Apparatus
US-2022366254-A1 · Nov 17, 2022 · US
US12536779B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12536779-B2 |
| Application number | US-202318358274-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 25, 2023 |
| Priority date | Jul 25, 2023 |
| Publication date | Jan 27, 2026 |
| Grant date | Jan 27, 2026 |
Official abstract text for this publication.
A method, computer system, and program product facilitate identification of error image labels in training data. The method comprises: evenly dividing a training dataset into N subsets, where the training dataset includes M data items each comprising a pair of image and its original image label; training a prediction model to label images by respectively using each of the N subsets as training data to generate N respective trained prediction models; respectively using each of the N trained prediction models trained by using one of the N subsets as training data to label the images in other N−1 subsets of the N subsets to generate N−1 prediction labels for each of the M images in the training dataset. For each image in the M data items, whether the original image label of the image is a potential error image label is based on the N−1 prediction labels of the image.
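The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: each "image" is a short feature tuple, the "prediction model" is a toy nearest-centroid classifier, the even division is a round-robin deal, and an item is flagged when any of its N−1 prediction labels differs from the original label. All function names (`split_evenly`, `train`, `predict`, `cross_label`, `flag_potential_errors`) and the sample data are hypothetical.

```python
from collections import defaultdict


def split_evenly(dataset, n):
    """Round-robin split into n subsets; sizes differ by at most one item."""
    subsets = [[] for _ in range(n)]
    for index, (image, label) in enumerate(dataset):
        subsets[index % n].append((index, image, label))
    return subsets


def train(subset):
    """Toy stand-in for model training: one mean feature vector per label."""
    sums, counts = {}, defaultdict(int)
    for _, image, label in subset:
        if label not in sums:
            sums[label] = [0.0] * len(image)
        for d, value in enumerate(image):
            sums[label][d] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(model, image):
    """Return the label whose centroid is closest to the image features."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(image, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def cross_label(dataset, n):
    """Train one model per subset, then label the items of the other n-1 subsets.

    Returns {item index: list of n-1 prediction labels}. Assumes 1 < n <= M.
    """
    subsets = split_evenly(dataset, n)
    models = [train(subset) for subset in subsets]
    predictions = defaultdict(list)
    for k, model in enumerate(models):
        for j, subset in enumerate(subsets):
            if j == k:
                continue  # a model never labels its own training subset
            for index, image, _ in subset:
                predictions[index].append(predict(model, image))
    return dict(predictions)


def flag_potential_errors(dataset, n):
    """Assumed rule: flag an item if any prediction differs from its label."""
    predictions = cross_label(dataset, n)
    return sorted(i for i, (_, label) in enumerate(dataset)
                  if any(p != label for p in predictions[i]))


# Hypothetical data: two well-separated clusters, item 10 deliberately mislabeled.
DATASET = [
    ((0.0, 0.0), "cat"), ((10.0, 10.0), "dog"), ((1.0, 0.0), "cat"),
    ((9.0, 10.0), "dog"), ((0.0, 1.0), "cat"), ((10.0, 9.0), "dog"),
    ((1.0, 1.0), "cat"), ((9.0, 9.0), "dog"), ((0.5, 0.5), "cat"),
    ((9.5, 9.5), "dog"), ((0.2, 0.8), "dog"), ((9.8, 9.2), "dog"),
]

print(flag_potential_errors(DATASET, 3))  # [10]
```

With N = 3 and M = 12, each subset holds four items and every item receives two prediction labels. The mislabeled item skews the model trained on its own subset, but that model never labels it; the two models that do label it were trained on clean subsets, which is why the disagreement surfaces.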
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising, using one or more processing units: evenly dividing a training dataset into N subsets, wherein the training dataset includes M data items, each comprising a pair comprising an image and its original image label, wherein N is an integer greater than 1 and M is an integer greater than 0; training a prediction model to label images by respectively using each of the N subsets as training data to generate N trained prediction models; respectively using each of the N trained prediction models trained by using one of the N subsets as training data to label the images in other N−1 subsets of the N subsets to generate N−1 prediction labels for each of the M images in the training dataset; and for each image in the M data items, determining whether the original image label of the image is a potential error image label is based on the N−1 prediction labels of the image.
2. The computer-implemented method of claim 1, wherein: the data items in the training dataset are categorized according to the original image labels in the data items; and the category distribution of data items in each of the N subsets follows the category distribution of data items in the training dataset.
3. The computer-implemented method of claim 1, further comprising: adjusting the sizes of the subsets to improve prediction accuracy of the trained prediction models.
4. The computer-implemented method of claim 3, further comprising: selecting the sizes of the subsets such that the variance of the prediction accuracy of the trained prediction models is minimized.
5. The computer-implemented method of claim 1, wherein determining whether the original image label of the image is a potential error image label based on the N−1 prediction labels of the image comprises: in response to the original image label of the image being not consistent with one of the N−1 prediction labels of the image, determining the original image label to be a potential error image label.
6. The computer-implemented method of claim 1, wherein determining whether the original image label of the image is a potential error image label based on the N−1 prediction labels of the image comprises: in response to one of the N−1 prediction labels being not consistent with another of the N−1 prediction labels, determining the original image label to be a potential error image label.
7. A computer system comprising: one or more computer processors; one or more computer readable media; and program instructions, stored on the one or more computer readable media for execution by at least one of the one or more processors, wherein the program instructions are configured to performing the following operations: evenly dividing a training dataset into N subsets, wherein the training dataset includes M data items each comprising a pair comprising an image and its original image label, wherein N is an integer greater than 1 and M is an integer greater than 0; training a prediction model to label images by respectively using each of the N subsets as training data to generate N respective trained prediction models; respectively using each of the N trained prediction models trained by using one of the N subsets as training data to label the images in other N−1 subsets of the N subsets to generate N−1 prediction labels for each of the M images in the training dataset; and for each image in the M data items, whether the original image label of the image is a potential error image label is based on the N−1 prediction labels of the image.
8. The computer system of the claim 7, wherein: the data items in the training dataset are categorized according to original image labels in the data items; and the category distribution of data items in each of the N subsets follows the category distribution of data items in the training dataset.
9. The computer system of the claim 7, wherein the operations further comprise: adjusting the sizes of the subsets to improve prediction accuracy of the trained prediction models.
10. The computer system of the claim 7, wherein the operations further comprise: selecting the sizes of the subsets such that the variance of the prediction accuracy of the trained prediction models is minimized.
11. The computer system of the claim 7, wherein determining whether the original image label of the image is a potential error image label based on the N−1 prediction labels of the image comprises: in response to the original image label of the image being not consistent with one of the N−1 prediction labels of the image, determining the original image label to be a potential error image label.
12. The computer system of the claim 7, wherein determining whether the original image label of the image is a potential error image label based on the N−1 prediction labels of the image comprises: in response to one of the N−1 prediction labels being not consistent with another of the N−1 prediction labels, determining the original image label to be a potential error image label.
13. A computer program product comprising: one or more computer readable storage media; and program instructions, stored on the one or more computer readable storage media for execution by at least one of the one or more processors, wherein the program instructions are configured to performing the following operations: evenly dividing a training dataset into N subsets, wherein the training dataset includes M data items each comprising a pair comprising an image and its original image label, wherein N is an integer greater than 1 and M is an integer greater than 0; training a prediction model to label images by respectively using each of the N subsets as training data to generate N respective trained prediction models; respectively using each of the N trained prediction models trained by using one of the N subsets as training data to label the images in other N−1 subsets of the N subsets to generate N−1 prediction labels for each of the M images in the training dataset; and for each image in the M data items, determining whether the original image label of the image is a potential error image label based on the N−1 prediction labels of the image.
14. The computer program product of the claim 13, wherein: the data items in the training dataset are categorized according to original image labels in the data items; and the category distribution of data items in each of the N subsets follows the category distribution of data items in the training dataset.
15. The computer program product of the claim 13, wherein the operations further comprise: adjusting the sizes of the subsets to improve prediction accuracy of the trained prediction models.
16. The computer program product of the claim 15, wherein the operations further comprise: selecting the sizes of the subsets may be chosen such that the variance of the prediction accuracy of the trained prediction models is minimized.
17. The computer program product of the claim 13, wherein determining whether the original image label of the image is a potential error image label based on the N−1 prediction labels of the image comprises: in response to the original image label of the image being not consistent with one of the N−1 prediction labels of the image, determining the original image label to be a potential error image label.
18. The computer program product of the claim 13, wherein determining whether the original image label of the image is
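Three of the dependent claims describe concrete choices that are easy to illustrate: a split whose per-subset category distribution follows that of the whole dataset (claims 2, 8, 14), and two alternative tests on the N−1 prediction labels (claims 5 and 6 and their system and program-product counterparts). The Python sketch below is one possible reading of that language, not the patentee's implementation; the function names `stratified_split`, `disagrees_with_original`, and `predictions_disagree` are hypothetical, and dealing items round-robin per category is an assumed way to make the distributions match.

```python
from collections import defaultdict


def stratified_split(labels, n):
    """Deal item indices to n subsets category by category.

    A single running counter is shared across categories, so overall subset
    sizes differ by at most one and each category is spread as evenly as
    possible over the subsets.
    """
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    subsets = [[] for _ in range(n)]
    slot = 0
    for indices in by_label.values():
        for index in indices:
            subsets[slot % n].append(index)
            slot += 1
    return subsets


def disagrees_with_original(original, predictions):
    """Claim 5 style test: some prediction is inconsistent with the original label."""
    return any(p != original for p in predictions)


def predictions_disagree(predictions):
    """Claim 6 style test: some prediction is inconsistent with another prediction."""
    return len(set(predictions)) > 1


# Hypothetical labels: six "cat" items followed by three "dog" items.
labels = ["cat"] * 6 + ["dog"] * 3
print(stratified_split(labels, 3))  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

The two tests are not interchangeable. If every model predicts "dog" for an image originally labeled "cat", the claim 5 style test flags it while the claim 6 style test does not, because the predictions agree with each other. The claim 6 style test instead catches images on which the models split, regardless of the original label.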
Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title