Automated recalibration of sensors for autonomous checkout

US12079769B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12079769-B2
Application number  US-202217741110-A
Country             US
Kind code           B2
Filing date         May 10, 2022
Priority date       Jun 26, 2020
Publication date    Sep 3, 2024
Grant date          Sep 3, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Automated techniques provide for recalibrating cameras in a real space in which puts and takes of items are tracked. The method includes first processing one or more selected images selected from a plurality of sequences of images received from a plurality of cameras calibrated using a set of calibration images that were used to calibrate the cameras previously. The first processing includes a process step to match one or more features from the selected images with features extracted from the set of calibration images using a trained neural network classifier. The features correspond to points located at displays or structures that remain substantially immobile. Camera calibrations can be updated when transform information between features matched meets or exceeds a threshold.
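
The matching and threshold steps summarized in the abstract can be illustrated with a minimal sketch. It is not the patented method: the patent matches features with a trained neural network classifier, while this sketch substitutes a plain nearest-neighbor descriptor match and uses the mean pixel shift as a stand-in for the transformation information. All function names, the descriptor format, and the `max_distance` and `threshold_px` values are assumptions made for illustration.

```python
import math

def match_features(current, reference, max_distance=0.5):
    """Pair each current descriptor with its nearest reference descriptor.

    current / reference: lists of (point, descriptor), where point is an
    (x, y) pixel location and descriptor is a tuple of floats.
    Returns a list of (reference_point, current_point) pairs.
    Hypothetical stand-in for the trained classifier named in the abstract.
    """
    matches = []
    for cur_pt, cur_desc in current:
        best_pt, best_dist = None, float("inf")
        for ref_pt, ref_desc in reference:
            d = math.dist(cur_desc, ref_desc)
            if d < best_dist:
                best_pt, best_dist = ref_pt, d
        # Keep the pair only if the descriptors are close enough.
        if best_pt is not None and best_dist <= max_distance:
            matches.append((best_pt, cur_pt))
    return matches

def mean_shift(matches):
    """Average pixel displacement from reference points to current points."""
    n = len(matches)
    dx = sum(cur[0] - ref[0] for ref, cur in matches) / n
    dy = sum(cur[1] - ref[1] for ref, cur in matches) / n
    return dx, dy

def needs_recalibration(matches, threshold_px=2.0):
    """True when the mean shift meets or exceeds the (assumed) threshold."""
    if not matches:
        return False
    dx, dy = mean_shift(matches)
    return math.hypot(dx, dy) >= threshold_px

# Features on an immobile structure, as seen at calibration time and now.
reference = [((100.0, 50.0), (0.1, 0.9)), ((300.0, 80.0), (0.8, 0.2))]
current = [((103.0, 54.0), (0.1, 0.9)), ((303.0, 84.0), (0.8, 0.2))]

matches = match_features(current, reference)
print(mean_shift(matches), needs_recalibration(matches))
```

In this toy input both features moved by 3 pixels in x and 4 pixels in y, so the mean shift is 5 pixels and the check reports that the camera has drifted past the assumed 2-pixel threshold.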

First claim

Opening claim text (preview).

What is claimed is:

1. A method for recalibrating cameras in a real space for tracking puts and takes of items by subjects, the method including: first processing one or more images selected from a plurality of sequences of images received from a plurality of cameras, in which selected images in the plurality of sequences of images have respective fields of view in the real space, to: match one or more features corresponding to points located at displays or relatively immobile structures extracted from the selected images using a trained neural network classifier with features from a set of calibration images; obtain based upon features as matched, transformation information between the selected images and the set of calibration images; and update calibration of a camera with the transformation information whenever the transformation information for the camera meets or exceeds a first threshold.

2. The method of claim 1, wherein the trained neural network classifier has been trained using a synthetic shapes dataset created by a second neural network.

3. The method of claim 2, wherein the second neural network has been trained using a plurality of synthetic shapes having no ambiguity in interest point locations, wherein the synthetic shapes comprise three-dimensional models created automatically, and a plurality of viewpoints generated for the three-dimensional models for matching features; and wherein three-dimensional models are finetuned by data collected from like real space environments having matching features annotated between different images captured from different viewpoints.

4. The method of claim 1, wherein feature descriptors corresponding to points located at displays or structures that remain substantially immobile are extracted using a scale invariant feature transform.

5. The method of claim 1, further including second processing sequences of images of the plurality of sequences of images, to track puts and takes of items by subjects within respective fields of view in the real space; and wherein first processing and second processing occur substantially contemporaneously, thereby enabling cameras to be calibrated without clearing subjects from the real space or interrupting tracking puts and takes of items by subjects.

6. The method of claim 5, wherein second processing at least one sequence of images of the plurality of sequences of images to track a take or put event, further includes, detecting the take or put event using a trained neural network.

7. The method of claim 6, wherein second processing to track puts and takes of items by subjects includes tracking inventory caches involved in an exchange that move over time having locations in three dimensions.

8. The method of claim 7, wherein locations of the inventory caches include locations corresponding to hands of identified subjects, and wherein processing the plurality of sequences of images includes using an image recognition engine to detect an inventory item in hands of a subject identified in the exchange as detected.

9. The method of claim 5, wherein second processing at least one sequence of images of the plurality of sequences of images to track a take or put event, further including, detecting the take or put event using a trained random forest.

10. The method of claim 1, further including storing the transformation information and images used to calibrate the cameras in a database.

11. The method of claim 1, wherein the transformation information is determined relative to an origin point that is selected as a reference point for calibration.

12. The method of claim 1, wherein updating calibration of a camera with the transformation information further includes updating calibration of a camera with the transformation information whenever the transformation information obtained for the camera meets or exceeds a second threshold of at least a 1 centimeter change in camera translation value.

13. The method of claim 1, wherein updating calibration of a camera with the transformation information further includes updating calibration of a camera with the transformation information whenever the transformation information obtained for the camera meets or exceeds a third threshold of at least a 1 degree change in camera rotation value.

14. A system including one or more processors and memory accessible by the processors, the memory loaded with computer instructions recalibrating cameras in a real space for tracking puts and takes of items by subjects between inventory caches which can act as at least one of sources and sinks of inventory items in exchanges of inventory items, which computer instructions, when executed on the processors, implement actions comprising: first processing one or more images selected from a plurality of sequences of images received from a plurality of cameras, in which selected images in the plurality of sequences of images have respective fields of view in the real space, to: match one or more features corresponding to points located at displays or relatively immobile structures extracted from the selected images using a trained neural network classifier with features from a set of calibration images; obtain based upon features as matched, transformation information between the selected images and the set of calibration images; and update calibration of a camera with the transformation information whenever the transformation information for the camera meets or exceeds a threshold.

15. The system of claim 14, wherein the trained neural network classifier has been trained using a synthetic shapes dataset created by a second neural network.

16. The system of claim 15, wherein the second neural network has been trained using a plurality of synthetic shapes having no ambiguity in interest point locations, wherein the synthetic shapes comprise three-dimensional models created automatically, and a plurality of viewpoints generated for the three-dimensional models for matching features; and wherein three-dimensional models are finetuned by data collected from like real space environments having matching features annotated between different images captured from different viewpoints.

17. The system of claim 14, wherein feature descriptors corresponding to points located at displays or structures that remain substantially immobile are extracted using a scale invariant feature transform.

18. The system of claim 14, further including second processing sequences of images of the plurality of sequences of images, to track puts and takes of items by subjects within respective fields of view in the real space; and wherein first processing and second processing occur substantially contemporaneously, thereby enabling cameras to be calibrated without clearing subjects from the real space or interrupting tracking puts and takes of items by subjects.

19. The system of claim 18, wherein second processing at least one sequence of images of the plurality of sequences of images to track a take or put event, further includes, detecting the take or put event using a trained neural network.

20. The system of claim 19, wherein second processing to track puts and takes of items by subjects includes tracking inventory caches involved in an exchange which move over time having locations in three dimensions.

21. The system of claim 20, wherein locations of the inventory caches include locations corresponding to hands of identified subjects, and wherein processing the plurality of sequences of images includ
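
Claims 12 and 13 name concrete thresholds: at least a 1 centimeter change in camera translation and at least a 1 degree change in camera rotation. The sketch below shows one way such a check could look, assuming the transformation information is available as a 3x3 rotation matrix and a translation vector in meters. That representation, the function names, and the choice to combine the two thresholds with a logical OR are assumptions for illustration; the claims state each threshold separately and do not prescribe this form.

```python
import math

TRANSLATION_THRESHOLD_M = 0.01  # 1 centimeter, the value named in claim 12
ROTATION_THRESHOLD_DEG = 1.0    # 1 degree, the value named in claim 13

def rotation_angle_deg(R):
    """Rotation angle of a 3x3 rotation matrix (nested lists), via its trace."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def translation_norm_m(t):
    """Euclidean length of a translation vector given in meters."""
    return math.sqrt(sum(c * c for c in t))

def should_update(R_delta, t_delta):
    """True when either change meets or exceeds its threshold (assumed OR)."""
    return (translation_norm_m(t_delta) >= TRANSLATION_THRESHOLD_M
            or rotation_angle_deg(R_delta) >= ROTATION_THRESHOLD_DEG)

def rot_z(deg):
    """Helper: rotation about the z axis by `deg` degrees."""
    a = math.radians(deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]

# A camera that moved 5 mm and turned 0.5 degrees stays below both thresholds;
# one that turned 2 degrees, or moved 2 cm, triggers an update.
print(should_update(rot_z(0.5), (0.005, 0.0, 0.0)))
print(should_update(rot_z(2.0), (0.0, 0.0, 0.0)))
print(should_update(rot_z(0.0), (0.02, 0.0, 0.0)))
```

The angle is recovered from the trace of the rotation matrix, which works for any rotation axis, so the same check applies to a camera that tilts as well as one that pans.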

Assignees

Standard Cognition Corp

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet]

  • Supervised learning

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning

  • using neural networks

  • Image or video pattern matching; Proximity measures in feature spaces

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12079769B2 cover?
Automated techniques provide for recalibrating cameras in a real space in which puts and takes of items are tracked. The method includes first processing one or more selected images selected from a plurality of sequences of images received from a plurality of cameras calibrated using a set of calibration images that were used to calibrate the cameras previously. The first processing includes a …
Who is the assignee on this patent?
Standard Cognition Corp
What technology area does this patent fall under?
Primary CPC classification G06Q10/087. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 3, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).