Multi-video annotation

US12367673B1 · US · B1

Patent metadata
Field               Value
Publication number  US-12367673-B1
Application number  US-202217867448-A
Country             US
Kind code           B1
Filing date         Jul 18, 2022
Priority date       Mar 30, 2017
Publication date    Jul 22, 2025
Grant date          Jul 22, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Multiple video files that are captured by calibrated imaging devices may be annotated based on a single annotation of an image frame of one of the video files. An operator may enter an annotation to an image frame via a user interface, and the annotation may be replicated from the image frame to other image frames that were captured at the same time and are included in other video files. Annotations may be updated by the operator and/or tracked in subsequent image frames. Predicted locations of the annotations in subsequent image frames within each of the video files may be determined, e.g., by a tracker, and a confidence level associated with any of the annotations may be calculated. Where the confidence level falls below a predetermined threshold, the operator may be prompted to delete or update the annotation, or the annotation may be deleted.
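
The confidence check described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `TrackedAnnotation` fields, the `Action` names, the 0.5 default threshold, and the `auto_delete` switch are all assumptions introduced here for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Action(Enum):
    """Hypothetical outcomes for an annotation predicted by a tracker."""
    KEEP = "keep"
    PROMPT_OPERATOR = "prompt_operator"
    DELETE = "delete"


@dataclass(frozen=True)
class TrackedAnnotation:
    """Hypothetical record for one annotation in one frame of one video file."""
    video_id: str                            # video file the frame belongs to
    frame_index: int                         # frame position within that video
    label: str                               # what the annotation identifies
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max), pixels
    confidence: float                        # tracker confidence, assumed in [0, 1]


def review_action(annotation: TrackedAnnotation,
                  threshold: float = 0.5,
                  auto_delete: bool = False) -> Action:
    """Decide what to do with a tracked annotation.

    An annotation whose confidence is at or above the threshold is kept.
    Below the threshold, the operator is prompted to delete or update it,
    or it is deleted outright when `auto_delete` is set.
    """
    if annotation.confidence >= threshold:
        return Action.KEEP
    return Action.DELETE if auto_delete else Action.PROMPT_OPERATOR


if __name__ == "__main__":
    weak = TrackedAnnotation("camera_a.mp4", 120, "item", (10.0, 20.0, 60.0, 90.0), 0.2)
    print(review_action(weak))                    # operator is prompted
    print(review_action(weak, auto_delete=True))  # annotation is dropped
```

Keeping the decision in one small function makes it easy to run the same rule over every synchronized frame in every video file, whichever tracker produced the confidence values.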

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a first camera including at least a portion of a scene within a first field of view, wherein the portion of the scene includes at least one object; a second camera including at least the portion of the scene within a second field of view, wherein the first camera and the second camera are calibrated with respect to one another; and a computer system in communication with the first camera and the second camera, wherein the first field of view and the second field of view overlap at least in part, and wherein the computer system comprises a computer display, at least one data store and at least one computer processor configured to at least: identify a first image captured by the first camera at a first time; determine a detection of a first item within a first portion of the first image; calculate a first confidence score for the detection of the first item within the first portion of the first image; determine that the first confidence score exceeds a predetermined threshold; in response to determining that the first confidence score exceeds the predetermined threshold, generate a first annotation of the first portion of the first image based at least in part on the detection of the first item within the first portion of the first image; identify a second image captured by the second camera at the first time; determine a detection of the first item within a second portion of the second image, wherein the second portion of the second image corresponds to the first portion of the first image; calculate a second confidence score for the detection of the first item within the second portion of the second image; determine that the second confidence score does not exceed the predetermined threshold; in response to determining that the second confidence score does not exceed the predetermined threshold, generate a second annotation of the second portion of the second image based at least in part on the first annotation; and store information regarding the second annotation in association with the second image in at least one data store, wherein the information regarding the second annotation identifies the first item and a location associated with the second portion of the second image.

2. The system of claim 1, wherein the at least one computer processor is further configured to at least: cause a display of at least the first image in a user interface on at least one computer display; and receive a designation of the first portion of the first image via the user interface or the at least one computer display, wherein the designation of the first portion of the first image is a gesture with at least one of the user interface or a portion of the at least one computer display by a human operator, and wherein the gesture identifies a set of pixels within the first portion of the first image.

3. A system comprising: a first camera including at least a portion of a scene within a first field of view, wherein the portion of the scene includes at least one object; a second camera including at least the portion of the scene within a second field of view, wherein the first camera and the second camera are calibrated with respect to one another; and a computer system in communication with the first camera and the second camera, wherein the first field of view and the second field of view overlap at least in part, and wherein the computer system comprises a computer display, at least one data store, and at least one computer processor configured to at least: identify a first image captured by the first camera at a first time; detect a first item within a first portion of the first image: generate a first annotation of the first portion of the first image; identify a second image captured by the second camera at the first time; identify a second portion of the second image, wherein the second portion of the second image corresponds to the first portion of the first image; define a bounding region based at least in part on: a first ray extending from a position of a first image sensor of the first camera in three-dimensional space at approximately the first time through a location corresponding to a first portion of the first annotation within an image plane of the first camera at approximately the first time; and a second ray from the position of the first image sensor of the first camera in three-dimensional space at approximately the first time through a location corresponding to a second portion of the first annotation within the image plane of the first camera at approximately the first time; project the bounding region into an image plane of the second camera at approximately the first time; generate a second annotation of the second portion of the second image based at least in part on the first annotation and the bounding region projected into the image plane of the second camera at approximately the first time; and store information regarding the second annotation in association with the second image in at least one data store, wherein the information regarding the second annotation identifies the first item and a location associated with the second portion of the second image.

4. A computer-implemented method comprising: receiving a designation of a first portion of a first image captured by one of a first camera or a second camera at a first time, wherein each of the first camera and the second camera is calibrated with respect to one another; defining a first annotation of the first image based at least in part on the designation of the first portion of the first image; generating a second annotation of a second portion of a second image captured by the first camera at a second time based at least in part on the designation of the first portion of the first image, wherein the second time follows the first time; determining a second detection of an item in the second portion of the second image; generating a third annotation of a third portion of a third image captured by the second camera at the second time based at least in part on the second annotation; and storing information regarding the third annotation in association with the third image in at least one data store.

5. The computer-implemented method of claim 4, further comprising: causing a display of at least the first image in a user interface on at least one computer display of a computer device, wherein the designation of the first portion of the first image is a gesture with at least one of the user interface or a portion of the at least one computer display by a human operator, and wherein the gesture identifies a set of pixels within the first portion of the first image.

6. A computer-implemented method comprising: generating a first annotation of a first portion of a first image captured by a first camera at a first time; determining a first detection of an item in the first portion of the first image; determining that a first confidence level in the first detection exceeds a predetermined threshold; determining a second detection of the item in the second portion of the second image; determining that a second confidence level in the second detection does not exceed the predetermined threshold; and in response to determining that the second confidence level in the second detection does not exceed the predetermined threshold, generating a second annotation of the second portion of the second image based at least in part on the second annotation; and storing information regarding the second annotation in association with the second image in at least one data store.

7. The computer-implemented method of claim 4, wherein the information regarding the second annotation comprises: an identifie
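
The ray-based transfer recited in claim 3 can be illustrated with a small pinhole-camera sketch. Everything below is an assumption made for illustration rather than the claimed implementation: the `Camera` class, the convention that a world point maps to camera coordinates as `R * x + t`, the choice of two opposite box corners as the two "portions" of the annotation, and the `near`/`far` depth limits that turn the two rays into a bounded region.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), pixels


@dataclass(frozen=True)
class Camera:
    """Hypothetical calibrated pinhole camera: x_cam = R * x_world + t."""
    fx: float
    fy: float
    cx: float
    cy: float
    R: Tuple[Vec3, Vec3, Vec3]  # rotation rows, world -> camera
    t: Vec3

    def _to_camera(self, p: Vec3) -> Vec3:
        return tuple(sum(row[i] * p[i] for i in range(3)) + ti
                     for row, ti in zip(self.R, self.t))

    def _rotate_to_world(self, d: Vec3) -> Vec3:
        # Multiply by the transpose of R (camera -> world direction).
        return tuple(sum(self.R[j][i] * d[j] for j in range(3)) for i in range(3))

    def ray_through(self, pixel: Tuple[float, float]) -> Tuple[Vec3, Vec3]:
        """Return (sensor position, direction) in world coordinates.

        The direction has unit depth along this camera's optical axis, so
        `origin + depth * direction` lies at that depth in front of the camera.
        """
        origin = tuple(-v for v in self._rotate_to_world(self.t))
        in_camera = ((pixel[0] - self.cx) / self.fx,
                     (pixel[1] - self.cy) / self.fy,
                     1.0)
        return origin, self._rotate_to_world(in_camera)

    def project(self, p: Vec3) -> Tuple[float, float]:
        """Project a world point into this camera's image plane.

        Assumes the point lies in front of the camera (positive depth).
        """
        x, y, z = self._to_camera(p)
        return (self.fx * x / z + self.cx, self.fy * y / z + self.cy)


def transfer_box(cam1: Camera, cam2: Camera, box: Box,
                 near: float, far: float) -> Box:
    """Cast rays from cam1 through two opposite corners of `box`, sample each
    ray at the assumed `near` and `far` depths, project the samples into cam2,
    and return the axis-aligned box that encloses them."""
    x0, y0, x1, y1 = box
    projected = []
    for corner in ((x0, y0), (x1, y1)):
        origin, direction = cam1.ray_through(corner)
        for depth in (near, far):
            point = tuple(o + depth * d for o, d in zip(origin, direction))
            projected.append(cam2.project(point))
    us = [u for u, _ in projected]
    vs = [v for _, v in projected]
    return (min(us), min(vs), max(us), max(vs))


if __name__ == "__main__":
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    cam1 = Camera(100.0, 100.0, 50.0, 50.0, identity, (0.0, 0.0, 0.0))
    # Second camera sits one unit to the right of the first, same orientation.
    cam2 = Camera(100.0, 100.0, 50.0, 50.0, identity, (-1.0, 0.0, 0.0))
    print(transfer_box(cam1, cam2, (50.0, 50.0, 60.0, 60.0), near=2.0, far=4.0))
```

Because the depth of the item along each ray is unknown from a single view, the projected region is wider than the original box along the direction of the camera baseline. In this sketch the result would only be a starting point for the second annotation, which a detector, a tracker, or the operator could then tighten.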

Assignees

Amazon Tech Inc

Inventors

Classifications

  • References adjustable by an adaptive method, e.g. learning

  • Transmitting camera control signals through networks, e.g. control via the Internet

  • by using electronic viewfinders

  • Control of parameters via user interfaces

  • for generating image signals from two or more image sensors being of different type or operating in different modes, e.g. with a CMOS sensor for moving images in combination with a charge-coupled device [CCD] for still images

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12367673B1 cover?
Multiple video files that are captured by calibrated imaging devices may be annotated based on a single annotation of an image frame of one of the video files. An operator may enter an annotation to an image frame via a user interface, and the annotation may be replicated from the image frame to other image frames that were captured at the same time and are included in other video files. Annota…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/41. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 22, 2025 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).