Content adaptive boundary placement for distributed encodes
US-2021076045-A1 · Mar 11, 2021 · US
US11818345B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11818345-B2 |
| Application number | US-202217696760-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 16, 2022 |
| Priority date | Mar 16, 2022 |
| Publication date | Nov 14, 2023 |
| Grant date | Nov 14, 2023 |
Official abstract text for this publication.
Bitrate-adaptive segmentation is performed for transcoding a video stream uploaded to an online video platform for hosting and later playback to platform users. The video stream is segmented into chunks based on prediction-based bit costs determined for frames of the video stream rather than based on scene changes detected within the video stream. The bitrate-adaptive segmentation includes determining inter-prediction bit costs and intra-prediction bit costs for frames of the video stream based on information indicated within a pass log based on a first pass encoding of the video stream, determining chunk boundaries for segmenting the video stream into a chunk based on the inter-prediction bit costs and the intra-prediction bit costs for the frames, and transcoding the chunk to produce a transcoded video stream.
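The cost-estimation step can be illustrated with a minimal sketch. It assumes a hypothetical, simplified pass log: a list of `(frame_type, bits)` records in display order, where the log holds only the cost for the prediction type actually used on each frame. The missing cost is then filled in the way claims 5 and 6 describe: an I-frame's inter cost is taken as the mean of nearby inter-coded frames, and a P- or B-frame's intra cost is taken from the nearest I-frame. The record layout, the function name `estimate_costs`, the `radius` parameter that defines "nearby", and the fallbacks for streams with no I-frame or no nearby inter frame are all assumptions made for illustration, not details from the patent.

```python
from statistics import mean

def estimate_costs(log, radius=2):
    """Fill in per-frame intra and inter bit costs from a simplified pass log.

    log: list of (frame_type, bits) in display order, frame_type in {"I", "P", "B"}.
    radius: how many frames on each side count as "nearby" (an assumption).
    Returns two lists (intra, inter), one value per frame.
    """
    n = len(log)
    i_frames = [k for k, (ftype, _) in enumerate(log) if ftype == "I"]
    intra, inter = [], []
    for k, (ftype, bits) in enumerate(log):
        if ftype == "I":
            # Logged cost is the intra cost; predict the inter cost from
            # the mean of nearby inter-coded frames.
            lo, hi = max(0, k - radius), min(n, k + radius + 1)
            nearby = [log[j][1] for j in range(lo, hi) if log[j][0] != "I"]
            intra.append(float(bits))
            inter.append(float(mean(nearby)) if nearby else float(bits))
        else:
            # Logged cost is the inter cost; predict the intra cost from
            # the nearest I-frame in display order.
            inter.append(float(bits))
            if i_frames:
                nearest = min(i_frames, key=lambda j: abs(j - k))
                intra.append(float(log[nearest][1]))
            else:
                intra.append(float(bits))  # fallback when no I-frame exists
    return intra, inter

if __name__ == "__main__":
    sample = [("I", 100), ("P", 10), ("P", 20), ("B", 6), ("I", 80), ("P", 12)]
    intra_costs, inter_costs = estimate_costs(sample)
    print(intra_costs)
    print(inter_costs)
```

A real first-pass log carries richer statistics (claim 10 mentions prediction residual error data, and claim 11 block-level prediction), so this frame-level version only shows the shape of the computation.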
Opening claim text (preview).
What is claimed is:
1. A method, comprising: receiving a video stream uploaded to an online video platform; determining inter-prediction bit costs and intra-prediction bit costs for frames of the video stream based on information indicated within a pass log based on a first pass encoding of the video stream; determining chunk boundaries for segmenting the video stream into a chunk by evaluating, for each pair of the frames that meets one or more chunk size thresholds, the inter-prediction bit costs and the intra-prediction bit costs corresponding to the pair of the frames; and transcoding the chunk to produce a transcoded video stream.
2. The method of claim 1, wherein the one or more chunk size thresholds includes a minimum chunk size threshold and a maximum chunk size threshold, wherein determining the chunk boundaries comprises: determining, for each pair of the frames that meets the minimum chunk size threshold and the maximum chunk size threshold, a weighted cost based on the intra-prediction bit cost for a first frame of the pair and a sum of the inter-prediction bit costs for all frames between the first frame and a second frame of the pair; and selecting, as the chunk boundaries, the pair of the frames corresponding to a lowest one of the weighted costs.
3. The method of claim 2, wherein the frames are identified within a sliding window, the method comprising: updating, based on the selection of the pair of the frames as the chunk boundaries, the sliding window to start at the second frame.
4. The method of claim 2, the method comprising: storing, prior to the determining of the inter-prediction bit costs and the intra-prediction bit costs, the frames in a lookahead buffer having a size greater than the maximum chunk size threshold; and determining, from amongst the frames stored in the lookahead buffer, the pairs of the frames that meet the minimum chunk size threshold and the maximum chunk size threshold.
5. The method of claim 1, wherein, for an intra-predicted frame of the frames, the pass log identifies an intra-prediction bit cost, and determining an inter-prediction bit cost for the intra-predicted frame based on the information indicated within the pass log comprises: predicting the inter-prediction bit cost based on a mean value of inter-prediction bit costs for inter-predicted frames of the frames nearby the intra-predicted frame in a display order of the video stream.
6. The method of claim 1, wherein, for an inter-predicted frame of the frames, the pass log identifies an inter-prediction bit cost, and determining an intra-prediction bit cost for the inter-predicted frame based on the information indicated within the pass log comprises: predicting the intra-prediction bit cost based on an intra-prediction bit cost identified by the pass log for intra-predicted frame of the frames nearest to the inter-predicted frame in a display order of the video stream.
7. An apparatus, comprising: a memory; and a processor configured to execute instructions stored in the memory to: determine prediction-based bit costs for frames of a video stream uploaded to an online video platform; determine chunk boundaries for a chunk of the video stream by evaluating, for each pair of the frames that meets one or more chunk size thresholds, the prediction-based bit costs corresponding to the pair of the frames; and segment the video stream into the chunk according to the chunk boundaries.
8. The apparatus of claim 7, wherein, to determine the prediction-based bit costs for the frames of the video stream, the processor is configured to execute the instructions to: predict one or more of the prediction-based bit costs based on information indicated within a pass log for the video stream.
9. The apparatus of claim 8, wherein, to predict the one or more bit costs based on the information indicated within the pass log for the video stream, the processor is configured to execute the instructions to: for each I-frame of the frames, predict an inter bit cost based on a mean value of inter bit costs for one or both of P-frames or B-frames proximate to the I-frame in a display order of the video stream; and for each P-frame or B-frame of the frames, predict an intra bit cost based on an intra bit cost for an I-frame nearest to the P-frame or the B-frame in the display order.
10. The apparatus of claim 8, wherein the information indicated within the pass log corresponds to prediction residual error data, and wherein, to predict the one or more bit costs based on the information indicated within the pass log for the video stream, the processor is configured to execute the instructions to: infer a prediction-based bit cost based on the prediction residual error data.
11. The apparatus of claim 8, wherein the prediction is performed at a block-level.
12. The apparatus of claim 8, wherein the pass log is an encoder pass log or a transcoder mezzanine log.
13. The apparatus of claim 7, wherein, to determine chunk boundaries for a chunk of the video stream, the processor is configured to execute the instructions to: determine, as chunk boundary candidates, edges for the pairs of the frames; determine weighted costs for the edges based on the prediction-based bit costs that are associated with the chunk boundary candidates; and select, as the chunk boundaries, the chunk boundary candidates having an edge corresponding to a lowest one of the weighted costs.
14. The apparatus of claim 13, wherein, to determine the edges for the pairs of the frames, the processor is configured to execute the instructions to: determine an edge between a pair of the frames based on the pair of the frames meeting the one or more chunk size thresholds.
15. The apparatus of claim 13, wherein the processor is configured to execute the instructions to: identify, as the chunk boundary candidates, ones of the frames that are stored in a lookahead buffer.
16. The apparatus of claim 13, wherein the processor is configured to execute the instructions to: identify, as a chunk boundary candidate, a frame of the frames for which a prediction residual error resulting from an intra-prediction of the frame divided by a prediction residual error resulting from an inter-prediction of the frame is less than a threshold.
17. A non-transitory computer readable storage device including program instructions that, when executed by a processor, cause the processor to perform operations, the operations comprising: determining chunk boundaries for segmenting a video stream into a chunk by evaluating, for each pair of frames of the video stream that meets one or more chunk size thresholds, prediction-based bit costs corresponding to the pair of the frames; segmenting the video stream into the chunk; and transcoding the chunk to produce a transcoded video stream.
18. The non-transitory computer readable storage device of claim 17, wherein the frames are a subset of frames of the video stream, the operations comprising: storing the frames within a lookahead buffer; and determining, for each frame stored within the lookahead buffer that is identified as a chunk boundary candidate, edges with other frames identified as chunk boundary candidates based on the one or more chunk size thresholds.
19. The non-transitory computer readable storage device of claim 18, wherein the operations for determining the chunk boundaries comprise: determining weighted costs for the edges based on the prediction-based costs; and selecting, as the chunk boundaries, the
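The boundary selection in claims 2 and 3 can be sketched as follows. The claims say only that the weighted cost is "based on" the intra cost of the first frame and the sum of the inter costs of the frames between the pair; they do not give a formula. This sketch assumes one plausible weighting, the average bits per frame of the candidate chunk, because an unnormalized sum would always favor the shortest allowed chunk. The function names, the half-open chunk convention `[start, end)`, and the handling of the stream tail are likewise illustrative assumptions.

```python
def pick_chunk(intra, inter, start, min_len, max_len):
    """Pick the end of the chunk that begins at frame `start`.

    The chunk covers frames [start, end); frame `end` starts the next chunk.
    Candidate ends are limited by the minimum and maximum chunk sizes.
    Assumed weighting: (intra cost of first frame + inter costs of the
    remaining frames in the chunk) divided by the chunk length.
    """
    best_end, best_cost = None, float("inf")
    last = min(len(intra), start + max_len)
    for end in range(start + min_len, last + 1):
        total = intra[start] + sum(inter[start + 1:end])
        cost = total / (end - start)
        if cost < best_cost:
            best_end, best_cost = end, cost
    return best_end, best_cost

def segment(intra, inter, min_len, max_len):
    """Slide the window forward: each selected end becomes the next start."""
    n = len(intra)
    boundaries = [0]
    start = 0
    while n - start > max_len:
        end, _ = pick_chunk(intra, inter, start, min_len, max_len)
        boundaries.append(end)
        start = end
    boundaries.append(n)  # simplification: the tail becomes the last chunk
    return boundaries

if __name__ == "__main__":
    # Frame 5 is expensive to inter-predict, as at a scene cut.
    intra = [100.0] * 12
    inter = [10.0, 10.0, 10.0, 10.0, 10.0, 90.0,
             10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
    print(segment(intra, inter, min_len=3, max_len=8))  # [0, 5, 12]
```

With this weighting the first chunk ends just before the costly frame 5, so that frame becomes the intra-coded start of the next chunk. A production system would presumably evaluate only frames held in the lookahead buffer (claim 4) and might restrict candidates further, for example with the residual-error ratio test of claim 16.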
Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title
Data rate or code amount at the encoder output · CPC title
Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction · CPC title
the unit being a scene or a shot · CPC title
the adaptation method, adaptation tool or adaptation type being iterative or recursive · CPC title