Synthetic-to-realistic image conversion using generative adversarial network (GAN) or other machine learning model
US-2024428568-A1 · Dec 26, 2024 · US
US2024249506A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024249506-A1 |
| Application number | US-202318158950-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 24, 2023 |
| Priority date | Jan 24, 2023 |
| Publication date | Jul 25, 2024 |
| Grant date | — |
Official abstract text for this publication.
In some embodiments, apparatuses and methods are provided herein useful to labeling objects in captured images. In some embodiments, there is provided a system for labeling objects in images captured at a product storage facility including a control circuit and a user interface. The control circuit is configured to select a set of unprocessed images; receive a selected configuration based on data resulting from iteratively processing the set of unprocessed images; cluster each unprocessed image into a corresponding group based on the selected configuration; select a plurality of clustered images from each of the plurality of groups; and output the plurality of clustered images from each group. The user interface is configured to: display each clustered image; and receive a user input labeling one or more objects shown in each clustered image resulting in a labeled dataset used to train a machine learning model.
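The clustering and per-group selection steps in the abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a feature vector (for example, by a feature extraction layer of a pretrained model) and uses plain k-means as one possible type of clustering. The publication does not name a specific algorithm, and the names `kmeans`, `sample_per_group`, and the toy image IDs and vectors are all hypothetical.

```python
import random
from collections import defaultdict


def _dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iters=20, seed=0):
    """Plain k-means (an assumed clustering type); returns a group index per vector."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    assignments = [0] * len(features)
    for _ in range(iters):
        # Assign every vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: _dist2(f, centroids[c])) for f in features
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, a in zip(features, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def sample_per_group(image_ids, assignments, per_group, seed=0):
    """Pick up to `per_group` images from every cluster for manual labeling."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for image_id, group in zip(image_ids, assignments):
        groups[group].append(image_id)
    return {
        g: rng.sample(ids, min(per_group, len(ids)))
        for g, ids in sorted(groups.items())
    }


# Hypothetical feature vectors standing in for pretrained-model embeddings.
image_ids = ["img_a", "img_b", "img_c", "img_d", "img_e", "img_f"]
features = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
    [9.8, 10.0], [10.1, 9.9], [10.0, 10.2],
]

assignments = kmeans(features, k=2)
picked = sample_per_group(image_ids, assignments, per_group=2)
print(picked)
```

Sampling from every group, rather than from the whole pool, is what keeps the images shown to the labeling user varied: each cluster contributes examples even if it is small.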
Opening claim text (preview).
What is claimed is:
1. A system for labeling objects in images captured at a product storage facility, the system comprising: a control circuit configured to: select a set of unprocessed images from a plurality of unprocessed images of objects captured at the product storage facility; receive a selected configuration based on data resulting from iteratively processing the set of unprocessed images based on at least one of a pretrained model, a feature extraction layer of the pretrained model, and a type of clustering; cluster each unprocessed image of the plurality of unprocessed images into a corresponding group of a plurality of groups based on the selected configuration; select a plurality of clustered images from each of the plurality of groups; and output the plurality of clustered images from each group; and a user interface operable on an electronic device and configured to: display each of the plurality of clustered images; and receive a user input labeling one or more objects shown in each of the plurality of clustered images resulting in a labeled dataset comprising a set of labeled images, wherein the control circuit is further configured to train a machine learning model based on the labeled dataset.
2. The system of claim 1, wherein the control circuit is further configured to subsequently select a next plurality of clustered images from each of the plurality of groups, wherein the user interface is further configured to: display each of the next plurality of clustered images; and receive a next user input labeling one or more objects shown in each of the next plurality of clustered images resulting in next labeled dataset comprising a next set of labeled images; and wherein the control circuit is further configured to train the trained machine learning model based on the next labeled dataset until a threshold number of labeled datasets have been used to train the trained machine learning model.
3. The system of claim 2, wherein the control circuit is further configured to: select a second plurality of clustered images from each of the plurality of groups; and automatically label using the trained machine learning model one or more objects shown in each of the second plurality of clustered images resulting in automatically labeled set of images, wherein the user interface is further configured to: display each image of the automatically labeled set of images; and receive a second user input relabeling mislabeled objects of the one or more objects shown in each of the second plurality of clustered images resulting in a correctly labeled set of images; and wherein the control circuit is further configured to train the trained machine learning model based on the correctly labeled set of images.
4. The system of claim 1, further comprising: one or more image capture devices configured to capture the plurality of unprocessed images of objects at the product storage facility; and a database configured to store the plurality of unprocessed images.
5. The system of claim 4, wherein at least one of the one or more image capture devices is coupled to a motorized robotic unit.
6. The system of claim 1, wherein the objects comprise items for sale and price tags.
7. The system of claim 1, wherein the user interface comprises a graphical user interface used by a user to associate each of the objects shown in each of the plurality of clustered images to a corresponding product.
8. The system of claim 1, wherein the control circuit is configured to: process the plurality of unprocessed images by being configured to: detect objects within the plurality of unprocessed images; enclose each detected object inside a bounding box; and classify each detected object as being potentially associated with a plurality of corresponding candidate product identifiers.
9. The system of claim 8, wherein the control circuit is further configured to: output at least one detected object to the user interface with the plurality of corresponding candidate product identifiers; receive a second user input via the user interface indicating a correct product identifier selected from the plurality of corresponding candidate product identifiers to associate with the at least one detected object; and train the trained machine learning model with a processed image including the at least one detected object associated with the correct product identifier.
10. The system of claim 8, further comprising: a database configured to store: a plurality of processed images, wherein each processed image shows at least one object inside the bounding box indicating the at least one object has been detected in the processed image; text associated with corresponding product identifiers; and a plurality of stored product images associated with the corresponding product identifiers, wherein the control circuit in classifying each detected object as potentially associated with the plurality of corresponding candidate product identifiers is further configured to: compare, using the trained machine learning model, detected text in a bounded object of a processed image with the text associated with the corresponding product identifiers to determine a first set of matches, wherein each match of the first set of matches is associated with a first corresponding probability value and a first respective product identifier of the match; compare, using the trained machine learning model, one or more detected visual images of the bounded object with the plurality of stored product images to determine a second set of matches, wherein each match of the second set of matches is associated with a second corresponding probability value and a second respective product identifier of the match; and determine, using the trained machine learning model, a third set of matches, wherein the third set of matches are those matches in the first set of matches and the second set of matches that are associated with probability values that are greater than a threshold value.
11. The system of claim 10, wherein the control circuit is further configured to: determine that not a single probability value in the first set of matches and the second set of matches is greater than the threshold value; receive a third user input via the user interface, the third user input comprising one or more words associated with the bounded object of the processed image; search product identifiers associated with the bounded object using the third user input; output the product identifiers associated with the bounded object to the user interface; receive a fourth user input via the user interface associating a correct product identifier selected from the product identifiers associated with the bounded object; and train the trained machine learning model with a processed image including the bounded object associated with the correct product identifier.
12. The system of claim 8, wherein the control circuit uses another trained machine learning model to detect the objects and enclose each detected object inside the bounding box, and wherein the other trained machine learning model is distinct from the machine learning model.
13. The system of claim 8, wherein the plurality of unprocessed images are images that have not gone through objection detection or object classification by the control circuit.
14. A method for labeling objects in images captured at a product storage facility, the method comprising: selecting, by a control circuit, a set of unprocessed images from a plurality of unprocessed images of objects capt
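The threshold logic recited in claims 10 and 11 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Match` record, the function names, the product identifiers, and the probability values are all hypothetical, and the two input lists stand in for the text-based and image-based match sets that a trained model would produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    product_id: str     # hypothetical candidate product identifier
    probability: float  # probability value attached to the match
    source: str         # "text" or "image"


def third_set(text_matches, image_matches, threshold):
    """Keep matches from both sets whose probability is strictly above the threshold."""
    combined = list(text_matches) + list(image_matches)
    return [m for m in combined if m.probability > threshold]


def needs_manual_search(text_matches, image_matches, threshold):
    """True when no match clears the threshold, so the user is asked for search words."""
    return not third_set(text_matches, image_matches, threshold)


# Hypothetical match sets for one bounded object.
text_matches = [Match("sku-1", 0.92, "text"), Match("sku-2", 0.40, "text")]
image_matches = [Match("sku-1", 0.85, "image"), Match("sku-3", 0.80, "image")]

candidates = third_set(text_matches, image_matches, threshold=0.8)
print([(m.product_id, m.source) for m in candidates])
print(needs_manual_search(text_matches, image_matches, threshold=0.95))
```

The comparison is strict because the claim language says "greater than a threshold value", so a match sitting exactly at the threshold is excluded.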
using classification, e.g. of video objects · CPC title
using neural networks · CPC title
Validation; Performance evaluation · CPC title
Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title