Generating training data for machine learning
US-2023282012-A1 · Sep 7, 2023 · US
US12469262B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12469262-B2 |
| Application number | US-202418633170-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 11, 2024 |
| Priority date | Apr 27, 2023 |
| Publication date | Nov 11, 2025 |
| Grant date | Nov 11, 2025 |
Official abstract text for this publication.
In some embodiments, a method sends information for a sample of content, a first question, and a second question for output on an interface. The first question receives, from a subject, a first response for a sample level rating for an artifact that is perceived to be visible in the sample and the second question receives, from the subject, a second response for regions in the sample that are perceived to contain the artifact. The method receives the first response for the sample level rating and the second response for regions that are perceived to contain the artifact. First responses are combined from multiple subjects to generate an opinion score for the sample and second responses are combined to generate region scores for regions. The method generates training data from the opinion score and the region scores to train a process to perform an action based on the artifacts.
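The combining step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it uses the mean of the sample-level ratings (as in claim 5) and the percentage of subjects who selected each region (as in claims 6 and 7). The function name `aggregate_responses`, the integer region identifiers `0..num_regions-1`, and the rating values in the example are assumptions made for this sketch.

```python
from collections import Counter
from statistics import mean


def aggregate_responses(responses, num_regions):
    """Combine per-subject responses into an opinion score and region scores.

    responses: list of (rating, selected_region_ids) tuples, one per subject.
    Region ids are assumed to be integers in range(num_regions).
    """
    if not responses:
        raise ValueError("at least one response is required")

    # Opinion score: mean of the sample-level ratings across subjects.
    opinion_score = mean(rating for rating, _ in responses)

    # Region score: percentage of subjects who selected each region.
    counts = Counter()
    for _, selected in responses:
        counts.update(set(selected))  # count a subject once per region
    n_subjects = len(responses)
    region_scores = [100.0 * counts[r] / n_subjects for r in range(num_regions)]
    return opinion_score, region_scores


# Hypothetical responses from four subjects for a sample with four regions.
responses = [
    (4, [0, 1]),
    (2, [1]),
    (3, []),
    (3, [1, 3]),
]
opinion, regions = aggregate_responses(responses, num_regions=4)
print(opinion, regions)
```

The list of region scores is the kind of data a heat map (claim 8) would display, one value per region.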
Opening claim text (preview).
What is claimed is:
1. A method comprising: sending information for a sample of content, a first question, and a second question for output on an interface, wherein the first question is configured to receive, from a subject, a first response for a sample level rating for an artifact that is perceived to be visible in the sample of content and the second question is configured to receive, from the subject, a second response for one or more regions in a plurality of regions in the sample of content that are perceived to contain the artifact; receiving the first response for the sample level rating and the second response for one or more regions that are perceived to contain the artifact; combining first responses for the first question from multiple subjects to generate an opinion score for the sample of content and combining second responses for the second question from the multiple subjects to generate region scores for regions in the plurality of regions; and generating training data from the opinion score and the region scores to train a process to perform an action based on the artifacts in one or more regions in the sample of content, wherein generating training data comprises: determining a cropped patch of a portion of the sample of content; determining regions that are included in the cropped patch; and determining a patch score based on region scores for the regions that are included in the cropped patch.
2. The method of claim 1, further comprising: determining a dataset for a subjective assessment to be performed by the subject; and retrieving the sample of content from the dataset.
3. The method of claim 2, further comprising: sending multiple samples of content from the dataset for output on the interface; receiving first responses and second responses for respective samples of content in the multiple samples of content; and generating training data for the samples of content using the respective first responses and second responses.
4. The method of claim 1, wherein: the first response comprises a value from a range for the sample level rating, and the second response comprises identifiers for the one or more regions that are selected.
5. The method of claim 1, wherein combining the first responses from multiple subjects comprises: generating a mean of the sample level ratings from the first responses.
6. The method of claim 1, wherein combining the second responses from multiple subjects comprises: generating a value based on a number of subjects that select respective regions.
7. The method of claim 6, wherein the value is a percentage of subjects that selected respective regions.
8. The method of claim 1, wherein the region scores are represented in a heat map that displays the respective region scores in association with the regions.
9. The method of claim 1, wherein generating training data comprises: associating the region scores for regions in the sample of content with the opinion score.
10. The method of claim 1, further comprising: using the patch score to train the process.
11. The method of claim 1, wherein determining the patch score comprises: using a weighted average of the region scores for the regions that are included in the cropped patch based on a proportion of area associated with each of the regions in the cropped patch.
12. The method of claim 1, wherein determining the patch score comprises: selecting one of the region scores to be the patch score.
13. The method of claim 1, wherein determining the patch score comprises: determining a proportion of area for a region that has a higher region score, wherein the higher region score indicates more artifacts were perceived to be visible; when the proportion of area meets a threshold, selecting the patch score based a region score for the region that has the higher region score, and when the proportion of area does not meet the threshold, selecting the patch score based on a region score for a region with a lower region score.
14. The method of claim 1, wherein determining the patch score comprises: determining a proportion of area for a plurality of regions; when a proportion of area for a first region that has a highest region score meets a threshold, selecting the patch score based the region score for the region that has the highest region score, wherein the higher region score indicates more artifacts were perceived to be visible, when the proportion of area for two regions that have a highest region score meets the threshold, selecting the patch score based on region scores for two regions, when the proportion of area for three regions that have a highest region score meets the threshold, selecting the patch score based on region scores for three regions, and when the proportion of area for three regions that have the highest region score does not meet the threshold, selecting the patch score based on a region score for a region with a lowest region score, wherein the lowest region score indicates less artifacts were perceived to be visible.
15. A non-transitory computer-readable storage medium having stored thereon computer executable instructions, which when executed by a computing device, cause the computing device to be operable for: sending information for a sample of content, a first question, and a second question for output on an interface, wherein the first question is configured to receive, from a subject, a first response for a sample level rating for an artifact that is perceived to be visible in the sample of content and the second question is configured to receive, from the subject, a second response for one or more regions in a plurality of regions in the sample of content that are perceived to contain the artifact; receiving the first response for the sample level rating and the second response for one or more regions that are perceived to contain the artifact; combining first responses for the first question from multiple subjects to generate an opinion score for the sample of content and combining second responses for the second question from the multiple subjects to generate region scores for regions in the plurality of regions; and generating training data from the opinion score and the region scores to train a process to perform an action based on the artifacts in one or more regions in the sample of content, wherein generating training data comprises: determining a cropped patch of a portion of the sample of content; determining regions that are included in the cropped patch, and determining a patch score based on region scores for the regions that are included in the cropped patch.
16. A method comprising: outputting a sample of content, a first question, and a second question on an interface; receiving, from a subject, a first response to the first question for a sample level rating for an artifact that is perceived to be visible in the sample of content that is output on the interface; receiving, from the subject, a second response to the second question for one or more regions in a plurality of regions in the sample of content that are perceived to contain the artifact; and sending the first response and the second response to a server system, wherein first responses from multiple subjects are combined to generate an opinion score for the sample of content from the first responses and second responses from the multiple subjects are combined to generate region scores for regions in the plurality of regions, and training data is generated from the opinion score and the
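The patch-score step recited in claims 1, 11, and 13 can be illustrated with a short sketch. It is not the patented implementation and rests on several assumptions the claims do not state: regions form a uniform `rows x cols` grid over the sample, the cropped patch is an axis-aligned rectangle `(x0, y0, x1, y1)` in pixels, region ids are numbered row by row, and the default threshold of 0.5 is arbitrary. All function names are hypothetical. The threshold rule is a simplified reading of claim 13: use the highest region score if that region covers enough of the patch, otherwise fall back to the lowest region score in the patch.

```python
def region_proportions(crop, width, height, rows, cols):
    """Map region id -> fraction of the crop's area that falls in that region.

    Assumes a uniform rows x cols grid of regions over a width x height sample.
    """
    x0, y0, x1, y1 = crop
    crop_area = (x1 - x0) * (y1 - y0)
    if crop_area <= 0:
        raise ValueError("crop must have positive area")
    cell_w, cell_h = width / cols, height / rows
    proportions = {}
    for r in range(rows):
        for c in range(cols):
            # Overlap of the crop with grid cell (r, c) along each axis.
            ox = min(x1, (c + 1) * cell_w) - max(x0, c * cell_w)
            oy = min(y1, (r + 1) * cell_h) - max(y0, r * cell_h)
            if ox > 0 and oy > 0:
                proportions[r * cols + c] = ox * oy / crop_area
    return proportions


def weighted_patch_score(proportions, region_scores):
    """Area-weighted average of region scores (in the spirit of claim 11)."""
    return sum(p * region_scores[rid] for rid, p in proportions.items())


def threshold_patch_score(proportions, region_scores, threshold=0.5):
    """Simplified threshold rule (in the spirit of claim 13)."""
    highest = max(proportions, key=lambda rid: region_scores[rid])
    if proportions[highest] >= threshold:
        return region_scores[highest]
    lowest = min(proportions, key=lambda rid: region_scores[rid])
    return region_scores[lowest]


# Hypothetical 200 x 100 sample split into two side-by-side regions.
scores = [20.0, 80.0]
props = region_proportions((75, 0, 175, 100), width=200, height=100, rows=1, cols=2)
weighted = weighted_patch_score(props, scores)
selected = threshold_patch_score(props, scores)
print(props, weighted, selected)
```

In this example the crop lies three quarters inside the higher-scoring region, so the weighted average sits between the two region scores while the threshold rule returns the higher score outright. Which behaviour suits a given training setup is a design choice the claims leave open.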
Image quality inspection · CPC title
Training; Learning · CPC title
Video; Image sequence · CPC title
Inspection of images, e.g. flaw detection · CPC title
Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title