Item put and take detection using image recognition

US10133933B1 · US · B1

Patent metadata
Field | Value
Publication number | US-10133933-B1
Application number | US-201815907112-A
Country | US
Kind code | B1
Filing date | Feb 27, 2018
Priority date | Aug 7, 2017
Publication date | Nov 20, 2018
Grant date | Nov 20, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and techniques are provided for tracking puts and takes of inventory items by subjects in an area of real space. A plurality of cameras with overlapping fields of view produce respective sequences of images of corresponding fields of view in the real space. A processing system is coupled to the system. In one embodiment, the processing system comprises image recognition engines receiving corresponding sequences of images from the plurality of cameras. The image recognition engines process the images in the corresponding sequences to identify subjects represented in the images and generate classifications of the identified subjects. The system processes the classifications of identified subjects for sets of images in the sequences of images to detect takes and puts of inventory items on shelves by identified subjects.
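
The abstract describes processing per-image classifications across a sequence to detect takes and puts. The following is a minimal illustrative sketch of that idea, not the patented implementation. The `HandClassification` fields, the `detect_events` helper, and the rule "the holding state flips while the hand is near a shelf" are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HandClassification:
    """Hypothetical per-frame classification for one identified subject."""
    frame: int
    holding_item: bool   # whether the subject appears to hold an item
    near_shelf: bool     # hand location relative to a shelf


def detect_events(seq: List[HandClassification]) -> List[Tuple[int, str]]:
    """Flag a 'take' or 'put' when the holding state flips near a shelf.

    This is an assumed toy rule, not the method described in the patent.
    """
    events = []
    for prev, cur in zip(seq, seq[1:]):
        if not (prev.near_shelf or cur.near_shelf):
            continue  # ignore state changes away from shelves
        if not prev.holding_item and cur.holding_item:
            events.append((cur.frame, "take"))
        elif prev.holding_item and not cur.holding_item:
            events.append((cur.frame, "put"))
    return events


frames = [
    HandClassification(0, False, False),
    HandClassification(1, False, True),
    HandClassification(2, True, True),    # item appears in hand at shelf
    HandClassification(3, True, False),
    HandClassification(4, True, True),
    HandClassification(5, False, True),   # item leaves hand at shelf
]
events = detect_events(frames)
print(events)
```

A real system would likely smooth noisy classifications over several frames before deciding; the single-frame comparison here only keeps the example short.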

First claim

Opening claim text (preview).

What is claimed is:

1. A system for tracking puts and takes of inventory items by subjects in an area of real space, comprising: a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; a processing system coupled to the plurality of cameras, the processing system including a plurality of image recognition engines, receiving corresponding sequences of images from the plurality of cameras, image recognition engines in the plurality of image recognition engines processing the images in the corresponding sequences to identify subjects represented in the images; and logic to process sets of images in the sequences of images that include the identified subjects to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects, wherein the logic to process sets of images includes: for identified subjects, logic to process images to generate classifications of the images of the identified subjects, the classifications including whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location a hand of the identified subject relative to the identified subject.

2. The system of claim 1, wherein the second nearness classification indicates a location of a hand of the identified subject relative to a body of the identified subject, and the generated classifications include a third nearness classification indicating a location of a hand of the identified subject relative to a basket associated with an identified subject.

3. The system of claim 1, including logic to perform time sequence analysis over the classifications of images to detect said takes and said puts by the identified subjects.

4. The system of claim 1, wherein the logic to process sets of images includes: for identified subjects, logic to identify bounding boxes of data representing hands in images in the sets of images of the identified subjects, and to process data in the bounding boxes to generate classifications of data within the bounding boxes for the identified subjects.

5. The system of claim 3, wherein the classifications include whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location a hand of the identified subject relative to a body of the identified subject, a third nearness classification indicating a location a hand of the identified subject relative to a basket associated with an identified subject, and an identifier of a likely inventory item.

6. The system of claim 3, including logic to perform time sequence analysis over the classifications of data within the bounding boxes in the sets of images to detect said takes and said puts by the identified subjects.

7. The system of claim 1, wherein the logic to process sets of images comprises convolutional neural networks.

8. The system of claim 1, wherein cameras in the plurality of cameras are configured to generate synchronized sequences of images.

9. The system of claim 1, wherein the plurality of cameras comprise cameras disposed over and having fields of view encompassing respective parts of the area in real space.

10. The system of claim 1, including logic responsive to the detected takes and puts, to generate log data structures including a list of inventory items for identified subjects.

11. A method for tracking puts and takes of inventory items by subjects in an area of real space, the method including: using a plurality of cameras to produce respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; receiving corresponding sequences of images from the plurality of cameras, processing the images in the corresponding sequences using image recognition engines in a plurality of image recognition engines and identifying subjects represented in the images wherein the plurality of image recognition engines are part of a processing system coupled to the plurality of cameras; and processing sets of images in the sequences of images that include the identified subjects to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects, wherein the processing sets of images includes: for identified subjects, generating classifications of the images of the identified subjects, the classifications including whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location of a hand of the identified subject relative to a body of the identified subject, a third nearness classification indicating a location a hand of the identified subject relative to a basket associated with an identified subject, and an identifier of a likely inventory item.

12. The method of claim 11, including performing time sequence analysis over the classifications of images to detect said takes and said puts by the identified subjects.

13. The method of claim 11, wherein the processing sets of images includes: for identified subjects, identifying bounding boxes of data representing hands in images in the sets of images of the identified subjects, and processing data in the bounding boxes to generate classifications of data within the bounding boxes for the identified subjects.

14. The method of claim 13, including performing time sequence analysis over the classifications of data within the bounding boxes in the sets of images to detect said takes and said puts by the identified subjects.

15. The method of claim 11, including circular buffers coupled to cameras in the plurality of cameras to store sets of images in the sequences of images from the plurality of cameras.

16. The method of claim 11, including processing sets of images using convolutional neural networks.

17. The method of claim 11, wherein cameras in the plurality of cameras are configured to generate synchronized sequences of images.

18. The method of claim 11, wherein the plurality of cameras comprise cameras disposed over and having fields of view encompassing respective parts of the area in real space.

19. The method of claim 11, including responsive to the detected takes and puts, generating a log data structure including a list of inventory items for each identified subject.

20. A system for tracking puts and takes of inventory items by subjects in an area of real space, comprising: a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; a processing system coupled to the plurality of cameras, the processing system including: first image recognition engines, receiving the sequences of images from the plurality of cameras, which process images to generate first data sets that
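
Two data structures named in the claims, the circular buffers that store sets of images and the log data structure listing inventory items per identified subject, can be pictured with a short sketch. This is an illustration under stated assumptions, not the claimed implementation: the buffer capacity, the `apply_event` helper, and the identifiers `subject-1` and `sku-42` are hypothetical, and integer frame ids stand in for image data.

```python
from collections import Counter, defaultdict, deque

# Circular buffer: keeps only the most recent frames from one camera.
# The capacity of 3 is an arbitrary value chosen for this example.
buffer = deque(maxlen=3)
for frame_id in range(5):
    buffer.append(frame_id)   # the oldest entry is dropped once full

# Log data structure: per-subject record of inventory items (as counts).
log = defaultdict(Counter)


def apply_event(subject_id, item_id, kind):
    """Update the hypothetical log for a detected 'take' or 'put'."""
    if kind == "take":
        log[subject_id][item_id] += 1
    elif kind == "put" and log[subject_id][item_id] > 0:
        log[subject_id][item_id] -= 1


apply_event("subject-1", "sku-42", "take")
apply_event("subject-1", "sku-42", "take")
apply_event("subject-1", "sku-42", "put")

print(list(buffer))
print(dict(log["subject-1"]))
```

A bounded `deque` is used here because it discards the oldest frame automatically, which matches the usual behaviour of a circular buffer without any index arithmetic.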

Assignees

Standard Cognition, Standard Cognition Corp

Inventors

Classifications

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title

  • H04N17/002 · Primary

    for television cameras · CPC title

  • Control of cameras or camera modules · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Standard Cognition, Standard Cognition Corp
What technology area does this patent fall under?
Primary CPC classification H04N17/002. Mapped technology areas include Electricity.
When was this patent published?
Publication date Nov 20, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).