Video Background Subtraction Using Depth
US-2021019892-A1 · Jan 21, 2021 · US
US11069036B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11069036-B1 |
| Application number | US-202016733596-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jan 3, 2020 |
| Priority date | Jan 3, 2020 |
| Publication date | Jul 20, 2021 |
| Grant date | Jul 20, 2021 |
Official abstract text for this publication.
Systems and techniques that facilitate real-time and/or offline de-identification of facial regions from regular and/or occluded color video streams obtained during diagnostic medical procedures are provided. A detection component can generate a bounding box substantially around a person in a frame of a video stream, can generate a heatmap showing key points or anatomical masks of the person based on the bounding box, and can localize a face or facial region of the person based on the key points or anatomical masks. An anonymization component can anonymize pixels in the frame that correspond to the face or facial region. A tracking component can track the face or facial region in a subsequent frame based on a structural similarity index between the frame and the subsequent frame being above a threshold. If the structural similarity index between the frame and the subsequent frame is above the threshold, the tracking component can track the face or facial region in the subsequent frame without having the detection component generate a bounding box or a heatmap in the subsequent frame, and the anonymization component can anonymize pixels in the subsequent frame corresponding to the tracked face or facial region.
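The gating step described in the abstract (track when consecutive frames are similar enough, rather than re-running detection) can be sketched as follows. This is a minimal illustration, not the patented implementation: `global_ssim` and `choose_step` are hypothetical names, the SSIM here is computed over the whole frame as a single window with the conventional constants K1 = 0.01 and K2 = 0.03, and falling back to detection when the index is not above the threshold is an assumed reading of the abstract. The 0.80 default mirrors the 80% threshold recited in the dependent claims.

```python
def global_ssim(frame_a, frame_b, dynamic_range=255.0):
    """Single-window SSIM of two equal-sized grayscale frames (lists of rows).

    Assumption: one global window, not the usual sliding-window average.
    """
    xs = [float(p) for row in frame_a for p in row]
    ys = [float(p) for row in frame_b for p in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("frames must be non-empty and the same size")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def choose_step(prev_frame, frame, threshold=0.80):
    """Return 'track' if SSIM is above the threshold, else 'detect'.

    Assumption: 'detect' (re-running the bounding box and heatmap stages)
    is the fallback when the frames are not similar enough.
    """
    return "track" if global_ssim(prev_frame, frame) > threshold else "detect"
```

Calling `choose_step(previous, current)` on two identical frames yields `"track"`, while a frame compared against its inverted copy yields `"detect"`. A production system would more likely use a windowed SSIM from an image-processing library; the single-window form is used here only to keep the sketch dependency-free.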
Opening claim text (preview).
What is claimed is:
1. A system, comprising: a memory that stores computer-executable components; and a processor, operably coupled to the memory, that executes the computer-executable components stored in the memory, wherein the computer-executable components comprise: a detection component that generates a bounding box substantially around a person in a frame of a video stream, generates a heatmap showing key points or anatomical masks of the person based on the bounding box, and localizes a face or facial region of the person based on the key points or anatomical masks; an anonymization component that anonymizes pixels in the frame that correspond to the face or facial region; and a tracking component that tracks the face or facial region in a subsequent frame based on a structural similarity index between the frame and the subsequent frame being above a threshold.
2. The system of claim 1, wherein: if the structural similarity index between the frame and the subsequent frame is above the threshold, the tracking component tracks the face or facial region in the subsequent frame, the detection component does not generate a bounding box or a heatmap in the subsequent frame, and the anonymization component anonymizes pixels in the subsequent frame corresponding to the face or facial region.
3. The system of claim 2, wherein the threshold is 80% and a frame rate of the video stream is 30 frames per second.
4. The system of claim 1, wherein: the detection component employs a first machine learning algorithm to generate the bounding box; the detection component employs a second machine learning algorithm to generate the heatmap and to localize the face or facial region; and the tracking component employs a third machine learning algorithm to track the face or facial region.
5. The system of claim 4, wherein: the first machine learning algorithm comprises a trained YOLOv3 object detection algorithm; the second machine learning algorithm comprises a trained Simple Pose ResNet algorithm; and the third machine learning algorithm comprises a trained median flow tracker.
6. The system of claim 1, wherein the anonymization component anonymizes pixels via pixilation or gaussian blurring.
7. The system of claim 1, wherein the detection component upscales the bounding box to ensure that a substantial portion of the person is within the bounding box.
8. A computer-implemented method, comprising: generating, by a device operatively coupled to a processor, a bounding box substantially around a person in a frame of a video stream; generating, by the device, a heatmap showing key points or anatomical masks of the person based on the bounding box; localizing, by the device, a face or facial region of the person based on the key points or anatomical masks; anonymizing, by the device, pixels in the frame that correspond to the face or facial region; and tracking, by the device, the face or facial region in a subsequent frame based on a structural similarity index between the frame and the subsequent frame being above a threshold.
9. The computer-implemented method of claim 8, further comprising: tracking, by the device, the face or facial region in the subsequent frame without generating a heatmap in the subsequent frame, if the structural similarity index between the frame and the subsequent frame is above the threshold; and anonymizing, by the device, pixels in the subsequent frame corresponding to the face or facial region.
10. The computer-implemented method of claim 9, wherein the threshold is 80% and a frame rate of the video stream is 30 frames per second.
11. The computer-implemented method of claim 8, wherein: the generating the bounding box employs a first machine learning algorithm; the generating the heatmap and localizing the face or facial region employs a second machine learning algorithm; and the tracking the face or facial region employs a third machine learning algorithm.
12. The computer-implemented method of claim 11, wherein: the first machine learning algorithm comprises a trained YOLOv3 object detection algorithm; the second machine learning algorithm comprises a trained Simple Pose ResNet algorithm; and the third machine learning algorithm comprises a trained median flow tracker.
13. The computer-implemented method of claim 8, wherein the anonymizing pixels employs pixilation or gaussian blurring.
14. The computer-implemented method of claim 8, further comprising: upscaling, by the device, the bounding box to ensure that a substantial portion of the person is within the bounding box.
15. A computer program product for facilitating automated face or facial region anonymization in video streams, the computer program product comprising a computer readable memory having program instructions embodied therewith, the program instructions executable by a processing component to cause the processing component to: generate a bounding box substantially around a person in a frame of a video stream; generate a heatmap showing key points or anatomical masks of the person based on the bounding box; localize a face or facial region of the person based on the key points or anatomical masks; anonymize pixels in the frame that correspond to the face or facial region; and track the face or facial region in a subsequent frame based on a structural similarity index between the frame and the subsequent frame being above a threshold.
16. The computer program product of claim 15, wherein the program instructions are further executable to cause the processing component to: track the face or facial region in the subsequent frame without generating a heatmap in the subsequent frame, if the structural similarity index between the frame and the subsequent frame is above the threshold; and anonymize pixels in the subsequent frame corresponding to the face or facial region.
17. The computer program product of claim 16, wherein the threshold is 80% and a frame rate of the video stream is 30 frames per second.
18. The computer program product of claim 15, wherein: the processing component generates the bounding box via a first machine learning algorithm; the processing component generates the heatmap and localizes the face or facial region via a second machine learning algorithm; and the processing component tracks the face or facial region via a third machine learning algorithm.
19. The computer program product of claim 18, wherein: the first machine learning algorithm comprises a trained YOLOv3 object detection algorithm; the second machine learning algorithm comprises a trained Simple Pose ResNet algorithm; and the third machine learning algorithm comprises a trained median flow tracker.
20. The computer program product of claim 15, wherein the program instructions are further executable to cause the processing component to: upscale the bounding box to ensure that a substantial portion of the person is within the bounding box.
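Two of the simpler operations recited in the dependent claims, upscaling the bounding box and anonymizing pixels by pixelation, can be illustrated with a short sketch. Everything below is an assumption made for illustration: the function names `upscale_box` and `pixelate`, the `(x0, y0, x1, y1)` box convention with exclusive upper bounds, the scale factor, and the tile size are not taken from the patent, which does not specify them in the text shown here. Frames are plain lists of grayscale rows to keep the example dependency-free.

```python
import math


def upscale_box(box, scale, frame_w, frame_h):
    """Grow a box (x0, y0, x1, y1) about its centre by `scale`, clipped to the frame.

    Hypothetical helper: the scale factor and clipping policy are assumptions.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * scale / 2
    half_h = (y1 - y0) * scale / 2
    return (
        max(0, math.floor(cx - half_w)),
        max(0, math.floor(cy - half_h)),
        min(frame_w, math.ceil(cx + half_w)),
        min(frame_h, math.ceil(cy + half_h)),
    )


def pixelate(frame, box, block=8):
    """Return a copy of `frame` with each block x block tile inside `box`
    replaced by the tile's integer mean, which hides fine facial detail.
    """
    x0, y0, x1, y1 = box
    out = [row[:] for row in frame]  # leave the input frame untouched
    for ty in range(y0, y1, block):
        for tx in range(x0, x1, block):
            ys = range(ty, min(ty + block, y1))
            xs = range(tx, min(tx + block, x1))
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

In the claimed system the enlarged box surrounds the person and the anonymized region is the localized face, so the two helpers would apply to different boxes. Gaussian blurring, the other anonymization option named in the claims, would replace the tile mean with a weighted neighbourhood average.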
using facial parts and geometric relationships · CPC title
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
for processing medical images, e.g. editing · CPC title
Combinations of networks · CPC title
Supervised learning · CPC title