Bitrate-adaptive segmentation for video transcoding

US11818345B2 · US · B2

Patent metadata
Publication number: US-11818345-B2
Application number: US-202217696760-A
Country: US
Kind code: B2
Filing date: Mar 16, 2022
Priority date: Mar 16, 2022
Publication date: Nov 14, 2023
Grant date: Nov 14, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Bitrate-adaptive segmentation is performed for transcoding a video stream uploaded to an online video platform for hosting and later playback to platform users. The video stream is segmented into chunks based on prediction-based bit costs determined for frames of the video stream rather than based on scene changes detected within the video stream. The bitrate-adaptive segmentation includes determining inter-prediction bit costs and intra-prediction bit costs for frames of the video stream based on information indicated within a pass log based on a first pass encoding of the video stream, determining chunk boundaries for segmenting the video stream into a chunk based on the inter-prediction bit costs and the intra-prediction bit costs for the frames, and transcoding the chunk to produce a transcoded video stream.
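To make the boundary search concrete, here is a minimal sketch of how a pair of chunk boundaries might be chosen from per-frame bit costs. It follows the wording of claim 2 (intra cost of the first frame plus the summed inter costs of the frames between the pair), but several details are assumptions rather than anything stated on this page: the function name `best_chunk_pair`, the two weights, the treatment of the second frame as the start of the next chunk, and in particular the division by chunk length, which is added here so that short chunks are not trivially favored.

```python
from itertools import accumulate
from typing import List, Optional, Tuple


def best_chunk_pair(
    intra_cost: List[float],
    inter_cost: List[float],
    window_start: int,
    window_end: int,
    min_size: int,
    max_size: int,
    intra_weight: float = 1.0,   # hypothetical weight, not from the patent text
    inter_weight: float = 1.0,   # hypothetical weight, not from the patent text
) -> Optional[Tuple[int, int, float]]:
    """Return (first, second, cost) for the cheapest frame pair in the window.

    `first` starts the chunk and `second` is treated as the first frame of
    the next chunk, so the chunk covers frames first .. second - 1.
    `window_end` is the largest value `second` may take.
    """
    # prefix[k] == sum(inter_cost[:k]), so any range sum is one subtraction.
    prefix = [0.0] + list(accumulate(inter_cost))
    best: Optional[Tuple[int, int, float]] = None
    for first in range(window_start, window_end):
        last_second = min(first + max_size, window_end)
        for second in range(first + min_size, last_second + 1):
            # Frames strictly between the pair are assumed to be inter-coded.
            between = prefix[second] - prefix[first + 1]
            total = intra_weight * intra_cost[first] + inter_weight * between
            # Assumption: normalize by chunk length (estimated bits per frame).
            cost = total / (second - first)
            if best is None or cost < best[2]:
                best = (first, second, cost)
    return best


intra = [100, 100, 10, 100, 100, 100]
inter = [20, 20, 20, 5, 5, 5]
print(best_chunk_pair(intra, inter, 0, 6, min_size=2, max_size=4))
```

In this toy input, frame 2 is cheap to intra-code and the frames after it are cheap to inter-code, so the search picks the pair starting there. A sliding-window variant in the spirit of claim 3 would call the function again with `window_start` set to the returned second frame.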

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving a video stream uploaded to an online video platform; determining inter-prediction bit costs and intra-prediction bit costs for frames of the video stream based on information indicated within a pass log based on a first pass encoding of the video stream; determining chunk boundaries for segmenting the video stream into a chunk by evaluating, for each pair of the frames that meets one or more chunk size thresholds, the inter-prediction bit costs and the intra-prediction bit costs corresponding to the pair of the frames; and transcoding the chunk to produce a transcoded video stream.

2. The method of claim 1, wherein the one or more chunk size thresholds includes a minimum chunk size threshold and a maximum chunk size threshold, wherein determining the chunk boundaries comprises: determining, for each pair of the frames that meets the minimum chunk size threshold and the maximum chunk size threshold, a weighted cost based on the intra-prediction bit cost for a first frame of the pair and a sum of the inter-prediction bit costs for all frames between the first frame and a second frame of the pair; and selecting, as the chunk boundaries, the pair of the frames corresponding to a lowest one of the weighted costs.

3. The method of claim 2, wherein the frames are identified within a sliding window, the method comprising: updating, based on the selection of the pair of the frames as the chunk boundaries, the sliding window to start at the second frame.

4. The method of claim 2, the method comprising: storing, prior to the determining of the inter-prediction bit costs and the intra-prediction bit costs, the frames in a lookahead buffer having a size greater than the maximum chunk size threshold; and determining, from amongst the frames stored in the lookahead buffer, the pairs of the frames that meet the minimum chunk size threshold and the maximum chunk size threshold.

5. The method of claim 1, wherein, for an intra-predicted frame of the frames, the pass log identifies an intra-prediction bit cost, and determining an inter-prediction bit cost for the intra-predicted frame based on the information indicated within the pass log comprises: predicting the inter-prediction bit cost based on a mean value of inter-prediction bit costs for inter-predicted frames of the frames nearby the intra-predicted frame in a display order of the video stream.

6. The method of claim 1, wherein, for an inter-predicted frame of the frames, the pass log identifies an inter-prediction bit cost, and determining an intra-prediction bit cost for the inter-predicted frame based on the information indicated within the pass log comprises: predicting the intra-prediction bit cost based on an intra-prediction bit cost identified by the pass log for intra-predicted frame of the frames nearest to the inter-predicted frame in a display order of the video stream.

7. An apparatus, comprising: a memory; and a processor configured to execute instructions stored in the memory to: determine prediction-based bit costs for frames of a video stream uploaded to an online video platform; determine chunk boundaries for a chunk of the video stream by evaluating, for each pair of the frames that meets one or more chunk size thresholds, the prediction-based bit costs corresponding to the pair of the frames; and segment the video stream into the chunk according to the chunk boundaries.

8. The apparatus of claim 7, wherein, to determine the prediction-based bit costs for the frames of the video stream, the processor is configured to execute the instructions to: predict one or more of the prediction-based bit costs based on information indicated within a pass log for the video stream.

9. The apparatus of claim 8, wherein, to predict the one or more bit costs based on the information indicated within the pass log for the video stream, the processor is configured to execute the instructions to: for each I-frame of the frames, predict an inter bit cost based on a mean value of inter bit costs for one or both of P-frames or B-frames proximate to the I-frame in a display order of the video stream; and for each P-frame or B-frame of the frames, predict an intra bit cost based on an intra bit cost for an I-frame nearest to the P-frame or the B-frame in the display order.

10. The apparatus of claim 8, wherein the information indicated within the pass log corresponds to prediction residual error data, and wherein, to predict the one or more bit costs based on the information indicated within the pass log for the video stream, the processor is configured to execute the instructions to: infer a prediction-based bit cost based on the prediction residual error data.

11. The apparatus of claim 8, wherein the prediction is performed at a block-level.

12. The apparatus of claim 8, wherein the pass log is an encoder pass log or a transcoder mezzanine log.

13. The apparatus of claim 7, wherein, to determine chunk boundaries for a chunk of the video stream, the processor is configured to execute the instructions to: determine, as chunk boundary candidates, edges for the pairs of the frames; determine weighted costs for the edges based on the prediction-based bit costs that are associated with the chunk boundary candidates; and select, as the chunk boundaries, the chunk boundary candidates having an edge corresponding to a lowest one of the weighted costs.

14. The apparatus of claim 13, wherein, to determine the edges for the pairs of the frames, the processor is configured to execute the instructions to: determine an edge between a pair of the frames based on the pair of the frames meeting the one or more chunk size thresholds.

15. The apparatus of claim 13, wherein the processor is configured to execute the instructions to: identify, as the chunk boundary candidates, ones of the frames that are stored in a lookahead buffer.

16. The apparatus of claim 13, wherein the processor is configured to execute the instructions to: identify, as a chunk boundary candidate, a frame of the frames for which a prediction residual error resulting from an intra-prediction of the frame divided by a prediction residual error resulting from an inter-prediction of the frame is less than a threshold.

17. A non-transitory computer readable storage device including program instructions that, when executed by a processor, cause the processor to perform operations, the operations comprising: determining chunk boundaries for segmenting a video stream into a chunk by evaluating, for each pair of frames of the video stream that meets one or more chunk size thresholds, prediction-based bit costs corresponding to the pair of the frames; segmenting the video stream into the chunk; and transcoding the chunk to produce a transcoded video stream.

18. The non-transitory computer readable storage device of claim 17, wherein the frames are a subset of frames of the video stream, the operations comprising: storing the frames within a lookahead buffer; and determining, for each frame stored within the lookahead buffer that is identified as a chunk boundary candidate, edges with other frames identified as chunk boundary candidates based on the one or more chunk size thresholds.

19. The non-transitory computer readable storage device of claim 18, wherein the operations for determining the chunk boundaries comprise: determining weighted costs for the edges based on the prediction-based costs; and selecting, as the chunk boundaries, the
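Claims 5, 6, and 9 describe how the cost a pass log does not record can be predicted from neighboring frames. The sketch below illustrates that idea under stated assumptions: the `LoggedFrame` record, the `fill_costs` name, the `radius` that defines "nearby", and the fallback used when no inter-predicted neighbor exists are all hypothetical and are not taken from the patent or from any real encoder's log format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LoggedFrame:
    """One hypothetical pass-log record, listed in display order."""
    is_intra: bool
    bits: float  # logged cost for the prediction type the first pass used


def fill_costs(
    frames: List[LoggedFrame], radius: int = 2
) -> Tuple[List[float], List[float]]:
    """Return (intra_costs, inter_costs) with the unlogged cost predicted."""
    intra_idx = [i for i, f in enumerate(frames) if f.is_intra]
    if not intra_idx:
        raise ValueError("need at least one intra-predicted frame")

    intra: List[float] = []
    inter: List[float] = []
    for i, frame in enumerate(frames):
        if frame.is_intra:
            intra.append(frame.bits)
            # Inter cost: mean over inter-predicted frames within `radius`
            # positions in display order (the radius is an assumption).
            lo, hi = max(0, i - radius), min(len(frames), i + radius + 1)
            nearby = [g.bits for g in frames[lo:hi] if not g.is_intra]
            # Assumed fallback: reuse the intra cost if no neighbor qualifies.
            inter.append(sum(nearby) / len(nearby) if nearby else frame.bits)
        else:
            inter.append(frame.bits)
            # Intra cost: copy it from the nearest intra-predicted frame;
            # on a tie, the earlier frame wins.
            nearest = min(intra_idx, key=lambda k: abs(k - i))
            intra.append(frames[nearest].bits)
    return intra, inter


log = [
    LoggedFrame(True, 100),
    LoggedFrame(False, 10),
    LoggedFrame(False, 20),
    LoggedFrame(True, 80),
    LoggedFrame(False, 30),
]
intra_costs, inter_costs = fill_costs(log)
print(intra_costs)
print(inter_costs)
```

After this step every frame has both an intra and an inter cost, which is what a pairwise boundary evaluation such as the one in claim 2 needs as input.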

Assignees

Google LLC

Inventors

Classifications

  • H04N19/119 (Primary)

    Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title

  • Data rate or code amount at the encoder output · CPC title

  • Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction · CPC title

  • the unit being a scene or a shot · CPC title

  • the adaptation method, adaptation tool or adaptation type being iterative or recursive · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11818345B2 cover?
Bitrate-adaptive segmentation is performed for transcoding a video stream uploaded to an online video platform for hosting and later playback to platform users. The video stream is segmented into chunks based on prediction-based bit costs determined for frames of the video stream rather than based on scene changes detected within the video stream. The bitrate-adaptive segmentation includes dete…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification H04N19/119. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Nov 14, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).