Using machine learning to filter Monte Carlo noise from images
US-2016321523-A1 · Nov 3, 2016 · US
US9594983B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9594983-B2 |
| Application number | US-201414449821-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 1, 2014 |
| Priority date | Aug 2, 2013 |
| Publication date | Mar 14, 2017 |
| Grant date | Mar 14, 2017 |
Official abstract text for this publication.
A sequence of images depicting an object is captured, e.g., by a camera at a point-of-sale terminal in a retail store. The object is identified, such as by a barcode or watermark that is detected from one or more of the images. Once the object's identity is known, such information is used in training a classifier (e.g., a machine learning system) to recognize the object from others of the captured images, including images that may be degraded by blur, inferior lighting, etc. In another arrangement, such degraded images are processed to identify feature points useful in fingerprint-based identification of the object. Feature points extracted from such degraded imagery aid in fingerprint-based recognition of objects under real life circumstances, as contrasted with feature points extracted from pristine imagery (e.g., digital files containing label artwork for such objects). A great variety of other features and arrangements—some involving designing classifiers so as to combat classifier copying—are also detailed.
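The labeling idea in the abstract can be illustrated with a minimal sketch: when at least one frame in a captured sequence yields an identifier, that identifier is attached to the frames that could not be decoded, producing training pairs for a classifier. The function `label_frames`, the stand-in decoder `fake_decode`, and the dictionary-based frame representation are all hypothetical names chosen for illustration; none of them comes from the patent text.

```python
from typing import Callable, Dict, List, Optional, Tuple

Frame = Dict[str, object]  # hypothetical frame record, used only for this sketch


def label_frames(
    frames: List[Frame],
    decode: Callable[[Frame], Optional[str]],
) -> List[Tuple[Frame, str]]:
    """Pair every undecodable frame with the identifier found in its sequence.

    Returns an empty list when no frame decodes, or when the decoded
    identifiers disagree (an assumption of this sketch: such sequences
    are skipped rather than guessed at).
    """
    decoded = [decode(frame) for frame in frames]
    known = {ident for ident in decoded if ident is not None}
    if len(known) != 1:
        return []
    ident = next(iter(known))
    # Only frames that failed to decode become training examples here.
    return [(f, ident) for f, d in zip(frames, decoded) if d is None]


def fake_decode(frame: Frame) -> Optional[str]:
    """Stand-in for a barcode or watermark decoder (hypothetical)."""
    value = frame.get("barcode")
    return value if isinstance(value, str) else None


# A three-frame sequence in which only the middle frame is decodable.
sequence = [
    {"id": 0, "barcode": None},              # e.g. blurred
    {"id": 1, "barcode": "00012345678905"},  # illustrative identifier
    {"id": 2, "barcode": None},              # e.g. poorly lit
]
training_pairs = label_frames(sequence, fake_decode)
labeled_ids = [(frame["id"], ident) for frame, ident in training_pairs]
```

In this sketch, frames 0 and 2 inherit the identifier decoded from frame 1, which mirrors the abstract's use of a known identity to train recognition on degraded images.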
Opening claim text (preview).
The invention claimed is:
1. A method comprising the acts: receiving first and second image data corresponding to first and second image frames both captured by a first camera of a point of sale system in a retail store, the point of sale system being able to recognize an item from the first image frame but not being able to recognize an item from the second image; recognizing the item from the first image frame, and determining a GTIN (Global Trade Item Number) identifier corresponding thereto; and using backpropagation, adjusting weights in a multi-layer neural network that includes plural convolutional layers and max-pooling layers, topped by plural classification layers, the adjusted weights causing the network to respond to presentation of the second image data at its input by outputting the GTIN determined from the first image data.
2. The method of claim 1 in which said item is depicted differently in said first and second image frames due to different viewing angle, focus and/or lighting.
3. The method of claim 1 wherein said first and second image frames were captured under red colored illumination.
4. The method of claim 1 wherein the second image frame comprises a degraded depiction of said item, by reason of blur, motion artifacts, specular reflection or inferior lighting, wherein the multi-layer neural network weights are adjusted to recognize said item based on degraded depictions thereof.
5. The method of claim 1 that includes: decoding data from a barcode depicted in the first image frame to determine the GTIN therefrom.
6. The method of claim 5 in which the second image frame comprises a degraded depiction of said item, by reason of being impaired by blur, motion artifacts, specular reflection or inferior lighting, wherein the multi-layer neural network weights are adjusted to recognize said item based on degraded depictions thereof.
7. The method of claim 1 in which said item is depicted differently in said first and second image frames due to different viewing angles.
8. The method of claim 1 in which said item is depicted differently in said first and second image frames due to different focus.
9. The method of claim 1 in which said item is depicted differently in said first and second image frames due to different lighting.
10. The method of claim 1 wherein the second image frame comprises a degraded depiction of said item, by reason of blur, wherein the multi-layer neural network weights are adjusted to recognize said item based on blurred depictions thereof.
11. The method of claim 1 wherein the second image frame comprises a degraded depiction of said item, by reason of motion artifacts, wherein the multi-layer neural network weights are adjusted to recognize said item based on depictions thereof including motion artifacts.
12. The method of claim 1 wherein the second image frame comprises a degraded depiction of said item, by reason of specular reflection, wherein the multi-layer neural network weights are adjusted to recognize said item based on depictions thereof including specular reflection.
13. The method of claim 1 wherein the second image frame comprises a degraded depiction of said item, by reason of inferior lighting, wherein the multi-layer neural network weights are adjusted to recognize said item based on depictions thereof captured with inferior lighting.
14. An apparatus comprising: a retail point-of-sale terminal including a first camera; a recognition module coupled to receive image data from the first camera, said recognition module comprising means for processing a first frame of image data to produce corresponding GTIN (Global Trade Item Number) identification data corresponding to a retail item depicted therein, but being unable to process a second frame of image data to produce corresponding GTIN identification information corresponding to a retail item depicted therein and a multi-layer neural network having inputs coupled to an output of the camera system, and coupled to an output of the digital watermark decoder or barcode decoder, the multi-layer neural network including plural convolutional layers and max-pooling layers, topped by plural classification layers, operation of the network being configured by weighting data that causes the network to respond to presentation of the second frame of image data at its input by outputting the GTIN decoded from the first frame of image data.
15. The method of claim 1: in which the first image frame, but not the second image frame, includes a decodable machine-readable symbology that encodes a plural-bit payload including a GTIN identifier, by which the point of sale system is able to recognize the item in the first image frame; and in which the method includes decoding the machine-readable symbology from the first image frame to produce the GTIN identifier.
16. The apparatus of claim 14 in which the recognition module comprises a barcode decoder.
17. The apparatus of claim 14 in which the recognition module comprises a watermark decoder.
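The weight-adjustment step recited in claim 1 can be illustrated with a small sketch of backpropagation. To stay short and dependency-free, the sketch below swaps in a tiny fully connected network (one tanh hidden layer and a softmax output) for the claimed network of convolutional, max-pooling and classification layers, so it shows only the training principle, not the claimed architecture. The class `TinyNet`, the four-value "frames", the learning rate and the GTIN strings are all hypothetical values chosen for illustration.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TinyNet:
    """One tanh hidden layer plus softmax output (stand-in, not the claimed CNN)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = softmax([sum(w * v for w, v in zip(row, h)) + b
                     for row, b in zip(self.w2, self.b2)])
        return h, p

    def train_step(self, x, target, lr=0.2):
        """One backpropagation update; returns the cross-entropy loss before the update."""
        h, p = self.forward(x)
        # Gradient of cross-entropy with respect to the output logits.
        dz2 = [pi - (1.0 if k == target else 0.0) for k, pi in enumerate(p)]
        # Propagate the error back through w2 (before it is updated) and tanh.
        dh = [sum(dz2[k] * self.w2[k][j] for k in range(len(dz2))) for j in range(len(h))]
        dz1 = [dh[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for k, g in enumerate(dz2):
            for j in range(len(h)):
                self.w2[k][j] -= lr * g * h[j]
            self.b2[k] -= lr * g
        for j, g in enumerate(dz1):
            for i in range(len(x)):
                self.w1[j][i] -= lr * g * x[i]
            self.b1[j] -= lr * g
        return -math.log(p[target])

    def predict(self, x):
        _, p = self.forward(x)
        return max(range(len(p)), key=p.__getitem__)


# Hypothetical data: each "second frame" is a 4-value feature vector, labeled
# with the index of the GTIN that was determined from its companion first frame.
GTINS = ["00012345678905", "00036000291452"]
second_frames = [
    ([0.9, 0.1, 0.8, 0.2], 0),
    ([0.8, 0.2, 0.7, 0.1], 0),
    ([0.1, 0.9, 0.2, 0.8], 1),
    ([0.2, 0.7, 0.1, 0.9], 1),
]

net = TinyNet(n_in=4, n_hidden=3, n_out=len(GTINS))
losses = []
for epoch in range(300):
    losses.append(sum(net.train_step(x, t) for x, t in second_frames))

predicted = [GTINS[net.predict(x)] for x, _ in second_frames]
```

After training, presenting a second-frame vector at the input yields the GTIN associated with its first frame, which is the behavior the adjusted weights are meant to produce in the claim. A realistic version would operate on pixel data and use a deep learning framework with convolutional and pooling layers.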
Incorporation of unlabelled data, e.g. multiple instance learning [MIL] · CPC title
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
based on distances to training or reference patterns · CPC title
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title