Subject identification and tracking using image recognition
US-10055853-B1 · Aug 21, 2018 · US
US11587243B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11587243-B2 |
| Application number | US-202117204788-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 17, 2021 |
| Priority date | Oct 25, 2019 |
| Publication date | Feb 21, 2023 |
| Grant date | Feb 21, 2023 |
Official abstract text for this publication.
A tracking system includes a camera subsystem that includes cameras that capture video of a space. Each camera is coupled with a camera client that determines local coordinates of people in the captured video. The camera clients generate frames that include color frames and depth frames labeled with an identifier number of the camera and their corresponding timestamps. The camera clients generate tracks that include metadata describing historical people detections, tracking identifications, timestamps, and the identifier number of the camera. The camera clients send the frames and tracks to cluster servers that maintain the frames and tracks such that they are retrievable using their corresponding labels. A camera server queries the cluster servers to receive the frames and tracks using their corresponding labels. The camera server determines the physical positions of people in the space based on the determined local coordinates.
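The labeling and retrieval flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `LabeledFrame`, `FrameStore`, and `label_frame` are hypothetical, and an in-memory dictionary stands in for a cluster server. The only behavior taken from the abstract is that each frame carries the camera's identifier number and a timestamp, and that frames are retrievable by those labels.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LabeledFrame:
    """A frame plus the labels the abstract mentions (names are hypothetical)."""
    camera_id: int      # identifier number of the camera
    timestamp: float    # generated when the camera client receives the frame
    kind: str           # "color" or "depth"
    payload: bytes      # raw frame data, opaque in this sketch


def label_frame(camera_id: int, payload: bytes, kind: str, received_at: float) -> LabeledFrame:
    """Attach the camera identifier and receipt timestamp to a frame."""
    return LabeledFrame(camera_id, received_at, kind, payload)


class FrameStore:
    """Assumed stand-in for a cluster server: stores frames keyed by their labels."""

    def __init__(self) -> None:
        self._frames: Dict[Tuple[int, float], List[LabeledFrame]] = {}

    def put(self, frame: LabeledFrame) -> None:
        key = (frame.camera_id, frame.timestamp)
        self._frames.setdefault(key, []).append(frame)

    def get(self, camera_id: int, timestamp: float) -> List[LabeledFrame]:
        """Retrieve every frame stored under the given labels (empty list if none)."""
        return list(self._frames.get((camera_id, timestamp), []))


# Usage: a camera client labels a color and a depth frame, then a
# camera server queries the store with the same labels.
store = FrameStore()
store.put(label_frame(7, b"rgb-bytes", "color", received_at=1000.0))
store.put(label_frame(7, b"depth-bytes", "depth", received_at=1000.0))
frames_for_camera_7 = store.get(7, 1000.0)
```

Keying on the pair (camera identifier, timestamp) is one possible design; a real deployment would likely need range queries over timestamps, which this sketch does not attempt.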
Opening claim text (preview).
What is claimed is:

1. A system comprising: an array of cameras positioned above a space, wherein: each camera of the array of cameras is operatively coupled with a respective camera client from an array of camera clients; and each camera of the array of cameras is configured to capture a video of a portion of the space, the space containing a person; the array of camera clients operably coupled with the array of cameras; wherein: a first camera client of the array of camera clients is operably coupled with a first camera and configured to: receive a first plurality of frames of a first video from the first camera, wherein at least one frame of the first plurality of frames shows the person within the space; generate a timestamp when each frame from the first plurality of frames is received by the first camera client; for a first frame of the first plurality of frames, label the first frame with one or more first labels comprising an identifier number of the first camera and the timestamp associated with the first frame, such that the first frame is retrievable using the one or more first labels from a first server from among a plurality of cluster servers; send the first plurality of frames labeled with the one or more first labels to the first server; generate a first plurality of tracks by performing a local position tracking of the person, wherein at least one track of the first plurality of tracks indicates a location of the person in the at least one frame of the first plurality of frames; for a first track of the first plurality of tracks, label the first track with one or more second labels comprising the identifier number of the first camera, the timestamp associated with the first track, a particular historical detection associated with the person detected in the first track, and a particular tracking identification detected in the first track, such that the first track is retrievable using the one or more second labels from a second server from among the plurality of cluster servers; and send the first plurality of tracks labeled with the one or more second labels to the second server, wherein: the second server is separate from the first server; and the array of the camera clients is separate from the plurality of cluster servers.

2. The system of claim 1, wherein: a second camera client of the array of camera clients is operably coupled with a second camera and separate from the first camera client, the second camera client configured to: receive a second plurality of frames of a second video from the second camera, wherein at least one frame of the second plurality of frames shows the person within the space; generate a timestamp when each frame of the second plurality of frames is received by the second camera client; send the second plurality of frames labeled with one or more corresponding timestamps and an identifier number of the second camera to the first server; generate a second plurality of tracks by performing a local position tracking of the person; wherein at least one track of the second plurality of tracks indicates a location of the person in the least one frame of the second plurality of frames; and send the second plurality of tracks labeled with one or more corresponding timestamps and an identifier number of the second camera to the second server.

3. The system of claim 2, further comprising: the first server, operably coupled with at least one of the first camera client and the second camera client, the first server configured to: receive at least one frame from the first and second plurality of frames; and store the at least one frame such that the at least one frame is retrievable; the second server, operably coupled with at least one of the first camera client and the second camera client, the second server configured to: receive at least one track from the first and second plurality of tracks; and store the at least one track such the at least one track is retrievable.

4. The system of claim 1, wherein: the first plurality of frames comprises a first plurality of color frames and a first plurality of depth frames; the first plurality of color frames corresponds to visual colors of objects in the space; and the first plurality of depth frames corresponds to distances of objects in the space from the first camera.

5. The system of claim 4, wherein generating the first plurality of tracks comprises performing the local position tracking of the person in the first plurality of depth frames, and wherein: for a first depth frame of the first plurality of depth frames, generating a first track of the first plurality of tracks comprises: detecting a first contour associated with the person; determining, based on pixel coordinates of the first contour, a first bounding area around the person shown in the first depth frame; determining, based on the first bounding area, first coordinates of the person in the first depth frame; and associating a first tracking identification to the person, wherein: the first tracking identification is linked to historical detections associated with the person, and the historical detections associated with the person comprise at least one of a contour, a bounding area, and a segmentation mask associated with the person; for a second depth frame of the first plurality of depth frames, generating a second track of the first plurality of tracks comprises: detecting a second contour associated with the person; determining, based on pixel coordinates of the second contour, a second bounding area around the person shown in the second depth frame; determining, based on the second bounding area, second coordinates of the person in the second depth frame; determining whether the second bounding area corresponds to the first bounding area; and in response to determining that the second bounding area corresponds to the first bounding area, associating the first tracking identification to the person.

6. The system of claim 5, wherein at least one track from the first plurality of tracks is further labeled with the historical detections associated with the person and the first tracking identification associated with the person.

7. A method comprising: receiving, by a first camera client, a first plurality of frames of a first video from a first camera, wherein at least one frame of the first plurality of frames shows a person within a space; generating, by the first camera client, a timestamp when each frame from the first plurality of frames is received from the first camera; for a first frame of the first plurality of frames, labeling the first frame with one or more first labels comprising an identifier number of the first camera and the timestamp associated with the first frame, such that the first frame is retrievable using the one or more first labels from a first server from among a plurality of cluster servers; sending, by the first camera client, the first plurality of frames labeled with the one or more first labels to the first server; generating, by the first camera client, a first plurality of tracks by performing a local position tracking of the person, wherein at least one track of the first plurality of tracks indicates a location of the person in the at least one frame of the first plurality of frames; for a first track of the first plurality of tracks, labeling the first track with one or more second labels comprising the identifier number of the first camera, the timestamp associated with the first track, a particular historical detection associated with the person detected in the first track, and a particular tracking identification detected in the first track, such that the first track is retrievable using the one or more second labels
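The local position tracking recited in claim 5 (contour, bounding area, coordinates, and re-use of a tracking identification when two bounding areas correspond) can be sketched as follows. The claim does not state how correspondence between bounding areas is decided; this sketch assumes an intersection-over-union (IoU) test with an arbitrary threshold of 0.3, and takes the box center as the person's coordinates. `LocalTracker`, `bounding_box`, `center`, and `iou` are hypothetical names, and contours are assumed to arrive as lists of pixel coordinates.

```python
from itertools import count
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def bounding_box(contour: Sequence[Point]) -> Box:
    """Axis-aligned bounding area derived from the contour's pixel coordinates."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return (min(xs), min(ys), max(xs), max(ys))


def center(box: Box) -> Tuple[float, float]:
    """Assumed definition of the person's local coordinates: the box center."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class LocalTracker:
    """Keeps historical detections per tracking identification (assumed design)."""

    def __init__(self, iou_threshold: float = 0.3) -> None:
        self.iou_threshold = iou_threshold   # assumption, not from the claims
        self._next_id = count(1)
        self.history: Dict[int, List[Box]] = {}

    def update(self, contour: Sequence[Point]) -> Tuple[int, Tuple[float, float]]:
        """Process one detection; return (tracking id, local coordinates)."""
        box = bounding_box(contour)
        best_id, best_score = None, self.iou_threshold
        for track_id, boxes in self.history.items():
            score = iou(boxes[-1], box)      # compare with the latest detection
            if score >= best_score:
                best_id, best_score = track_id, score
        if best_id is None:                  # no corresponding area: new person
            best_id = next(self._next_id)
            self.history[best_id] = []
        self.history[best_id].append(box)
        return best_id, center(box)
```

Feeding the tracker one contour per depth frame returns the same identification while consecutive bounding areas overlap enough, and a fresh identification otherwise. Other correspondence tests (for example, center distance) would fit the claim language equally well.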
Multi-camera tracking · CPC title
Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title
Market modelling; Market analysis; Collecting market data · CPC title
Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders · CPC title
Surveillance · CPC title