US12464141B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12464141-B2 |
| Application number | US-202318164069-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 3, 2023 |
| Priority date | Feb 17, 2022 |
| Publication date | Nov 4, 2025 |
| Grant date | Nov 4, 2025 |
Official abstract text for this publication.
A method of encoding a video stream including an overlay is provided, including: capturing a first image; adding an overlay to the first image at a first position, and encoding the first image in a first frame of a video stream; capturing a second image of the scene; determining a desired position of the overlay in the second image; encoding the second image in a second frame marked as a no-display frame, and generating and encoding a third frame including temporally predicted macroblocks at the desired position of the overlay referencing the first frame with motion vectors based on a difference between the desired position and the first position, and skip-macroblocks outside of the desired position of the overlay referencing the first frame. A corresponding device, computer program and computer program product are also provided.
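The macroblock layout of the third frame described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the 16x16 macroblock size, the function name `plan_third_frame`, the block labels, and the sign convention for motion vectors (a vector points from the block in the current frame to its source in the reference frame) are all assumptions made for illustration.

```python
MB = 16  # assumed macroblock size in pixels (16x16, as in H.264)


def plan_third_frame(frame_w, frame_h, overlay_w, overlay_h, first_pos, desired_pos):
    """Classify every macroblock of the synthetic third frame.

    Returns a dict mapping (mb_col, mb_row) to either
    ("inter_ref_first", (mvx, mvy)) for blocks covering the overlay at its
    desired position, or ("skip", None) for all other blocks.
    """
    # Displacement of the overlay from the first image to the second image.
    dx = desired_pos[0] - first_pos[0]
    dy = desired_pos[1] - first_pos[1]
    # Assumed convention: reference position = current position + motion vector,
    # so the vector is the negated displacement.
    mv = (-dx, -dy)

    cols = (frame_w + MB - 1) // MB
    rows = (frame_h + MB - 1) // MB
    x0, y0 = desired_pos
    x1, y1 = x0 + overlay_w, y0 + overlay_h

    plan = {}
    for row in range(rows):
        for col in range(cols):
            mb_x0, mb_y0 = col * MB, row * MB
            # A macroblock is predicted from the first frame if it overlaps
            # the overlay rectangle at the desired position.
            overlaps = (mb_x0 < x1 and mb_x0 + MB > x0 and
                        mb_y0 < y1 and mb_y0 + MB > y0)
            plan[(col, row)] = ("inter_ref_first", mv) if overlaps else ("skip", None)
    return plan


if __name__ == "__main__":
    plan = plan_third_frame(64, 48, 16, 16, first_pos=(0, 0), desired_pos=(32, 16))
    for key in sorted(plan):
        print(key, plan[key])
```

Because the overlay pixels are copied from the already encoded first frame, the overlay does not have to be rendered again for the second image; only the motion vectors change.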
Claim text.
The invention claimed is:

1. A method of encoding a video stream including an overlay, comprising:
a) capturing a first image of a scene;
b) adding an overlay to the first image at a first position, and encoding the first image as part of a first frame of an encoded video stream;
c) capturing a second image of the scene;
d) calculating a desired change in position of the overlay between a desired position of the overlay in the second image and the position of the overlay in the first image based on at least one of: i) information about a known change of a camera field-of-view between capturing the first image and the second image; ii) information about a known change of a camera position between capturing the first image and the second image, and iii) a known change in position of an object with which the overlay is associated in the scene between capturing the first image and the second image, using an object detection and/or tracking algorithm;
e) encoding the second image as part of a second frame of the video stream, including marking the second frame as a no-display frame; and
f) generating and encoding a third frame of the video stream, including one or more macroblocks of the third frame at the desired position of the overlay being temporally predicted macroblocks referencing the first frame, including one or more macroblocks of the third frame outside of the desired position of the overlay being skip-macroblocks referencing the second frame of the video stream, and including calculating motion vectors of the one or more temporally predicted macroblocks based on the desired change in position of the overlay.

2. The method according to claim 1, the third frame being a predicted frame, P-frame, or bi-directional predicted frame, B-frame, inserted after the second frame in the encoded video stream.

3. The method according to claim 1, the third frame being a bidirectional predicted frame, B-frame, inserted before the second frame in the encoded video stream.

4. The method according to claim 1, including capturing the first image and the second image using a same camera.

5. The method according to claim 1, the method being performed in a camera used to capture the first image and/or the second image.

6. The method according to claim 1, the overlay being fixed relative to the scene.

7. The method according to claim 1, further comprising estimating a computational time needed to render and encode the overlay as part of the second image and the second frame and, if determining that the estimated computational time is below a threshold value, performing steps a)-d) but not steps e) and f) and instead, after step d): e′) adding the overlay to the second image at the desired position, and encoding the second image as part of a second frame of the video stream.

8. A device for encoding a video stream including an overlay, comprising: a processor, and a memory storing instructions that, when executed by the processor, cause the device to:
capture a first image of a scene;
add an overlay to the first image at a first position, and encode the first image as part of a first frame of an encoded video stream;
capture a second image of the scene;
calculate a desired change in position of the overlay between a desired position of the overlay in the second image and the position of the overlay in the first image based on at least one of: i) information about a known change of a camera field-of-view between capturing the first image and the second image; ii) information about a known change of a camera position between capturing the first image and the second image, and iii) a known change in position of an object with which the overlay is associated in the scene between capturing the first image and the second image, using an object detection and/or tracking algorithm;
encode the second image as part of a second frame of the video stream, including to mark the second frame as a no-display frame; and
generate and encode a third frame of the video stream, wherein one or more macroblocks of the third frame at the desired position of the overlay are temporally predicted macroblocks referencing the first frame, and wherein one or more macroblocks of the third frame outside of the desired position of the overlay are skip-macroblocks referencing the second frame of the video stream, including to calculate motion vectors of the one or more temporally predicted macroblocks based on the desired change in position of the overlay.

9. The device according to claim 8, wherein the device is a monitoring camera configured to capture at least one of the first image and the second image.

10. A non-transitory computer readable storage medium having stored thereon computer program for encoding a video stream including an overlay, configured to, when executed by a processor of a device, cause the device to:
capture a first image of a scene;
add an overlay to the first image at a first position, and encode the first image as part of a first frame of an encoded video stream;
capture a second image of the scene;
calculate a desired change in position of the overlay between a desired position of the overlay in the second image and the position of the overlay in the first image based on at least one of: i) information about a known change of a camera field-of-view between capturing the first image and the second image; ii) information about a known change of a camera position between capturing the first image and the second image, and iii) a known change in position of an object with which the overlay is associated in the scene between capturing the first image and the second image, using an object detection and/or tracking algorithm;
encode the second image as part of a second frame of the video stream, including to mark the second frame as a no-display frame; and
generate and encode a third frame of the video stream, wherein one or more macroblocks of the third frame at the desired position of the overlay are temporally predicted macroblocks referencing the first frame, and wherein one or more macroblocks of the third frame outside of the desired position of the overlay are skip-macroblocks referencing the second frame of the video stream, including to calculate motion vectors of the one or more temporally predicted macroblocks based on the desired change in position of the overlay.
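The choice between the two encoding paths in claims 1 and 7 can be sketched as a small decision function. This is an illustrative reading only, not the claimed implementation: `FrameSpec`, `frames_for_second_image`, and the millisecond units are hypothetical, and the frame order on the fallback path follows the claim 2 variant (third frame inserted after the second frame).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameSpec:
    """Hypothetical description of one encoded frame (illustration only)."""
    kind: str      # "regular" or "synthetic"
    content: str   # what the frame is built from
    display: bool  # False models a frame marked as no-display


def frames_for_second_image(estimated_ms: float, threshold_ms: float) -> list:
    """Pick the frames to emit for the second captured image.

    If rendering and encoding the overlay is estimated to be fast enough
    (claim 7), the overlay is added to the second image directly. Otherwise
    the second image is encoded as a no-display frame and followed by a
    synthetic third frame (claim 1, steps e and f).
    """
    if estimated_ms < threshold_ms:
        return [FrameSpec("regular", "second image with overlay at desired position", True)]
    return [
        FrameSpec("regular", "second image", False),
        FrameSpec(
            "synthetic",
            "overlay macroblocks predicted from first frame; "
            "skip-macroblocks referencing second frame",
            True,
        ),
    ]


if __name__ == "__main__":
    for est in (5.0, 20.0):
        print(est, [(f.kind, f.display) for f in frames_for_second_image(est, 10.0)])
```

In both branches exactly one frame per captured image is displayed, so the viewer sees the same frame rate whichever path is taken.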
CPC classification titles:

- Motion estimation or motion compensation
- the unit being a scalable video layer
- the region being a block, e.g. a macroblock
- the region being a picture, frame or field
- Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction