Audio metadata smoothing

US11416208B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11416208-B2
Application number  US-202015931442-A
Country             US
Kind code           B2
Filing date         May 13, 2020
Priority date       Sep 23, 2019
Publication date    Aug 16, 2022
Grant date          Aug 16, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed computer-implemented method for smoothing audio gaps using adaptive metadata identifies an initial audio segment and a subsequent audio segment that follows the initial audio segment. The method accesses a first set of metadata that corresponds to a last audio frame of the initial audio segment and accesses a second set of metadata that corresponds to the first audio frame of the subsequent audio segment. The first and second sets of metadata include audio characteristic information for the two audio segments. The method then generates a new set of metadata that is based on both sets of audio characteristics. The method further inserts a new audio frame between the last audio frame of the initial audio segment and the first audio frame of the subsequent audio segment and applies the new set of metadata to the new audio frame. Various other methods, systems, and computer-readable media are also disclosed.
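As an illustration of the flow the abstract describes, here is a minimal sketch in Python. The abstract does not say which audio characteristics the metadata carries or how the two sets are combined, so everything named here is an assumption made for the example: the `FrameMetadata` fields (`loudness_db`, `channels`), the linear blend, and the empty payload standing in for a silent frame.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FrameMetadata:
    # Hypothetical fields; the source only speaks of "audio characteristics".
    loudness_db: float
    channels: int


@dataclass(frozen=True)
class Frame:
    metadata: FrameMetadata
    payload: bytes  # encoded audio; empty here stands in for silence


def blend_metadata(last: FrameMetadata, first: FrameMetadata,
                   t: float = 0.5) -> FrameMetadata:
    """Derive new metadata from both sets (assumed rule: linear blend).

    t = 0 reproduces the last frame's values, t = 1 the first frame's.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    loudness = last.loudness_db + t * (first.loudness_db - last.loudness_db)
    # Discrete fields cannot be interpolated, so switch at the midpoint.
    channels = last.channels if t < 0.5 else first.channels
    return FrameMetadata(loudness, channels)


def insert_bridge_frame(initial: List[Frame],
                        subsequent: List[Frame]) -> List[Frame]:
    """Insert one new frame between the segments and apply blended metadata."""
    new_meta = blend_metadata(initial[-1].metadata, subsequent[0].metadata)
    bridge = Frame(metadata=new_meta, payload=b"")
    return initial + [bridge] + subsequent


if __name__ == "__main__":
    a = [Frame(FrameMetadata(-31.0, 2), b"\x01")]
    b = [Frame(FrameMetadata(-23.0, 6), b"\x02")]
    for frame in insert_bridge_frame(a, b):
        print(frame.metadata)
```

The midpoint switch for `channels` is only one possible choice for a value that cannot be interpolated; a real decoder may impose its own constraints on when such a value can change.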

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising:
    identifying, within at least one media item that includes a plurality of audio segments, an initial audio segment and a subsequent audio segment that follows the initial audio segment;
    accessing a first set of metadata that corresponds to a last audio frame of the initial audio segment, the first set of metadata including information indicating one or more audio characteristics of the last audio frame of the initial audio segment;
    accessing a second set of metadata that corresponds to the first audio frame of the subsequent audio segment, the second set of metadata including information indicating one or more audio characteristics of the first audio frame of the subsequent audio segment;
    generating, based on the first and second sets of metadata, a new set of metadata that is based on both the audio characteristics of the last audio frame in the initial audio segment and the audio characteristics of the first audio frame in the subsequent audio segment;
    detecting a gap length in time between playback of the initial audio segment and playback of the subsequent audio segment;
    inserting at least one new audio frame between the last audio frame of the initial audio segment and the first audio frame of the subsequent audio segment, wherein the first set of metadata is accessed from header information in audio frames of the initial audio segment, and wherein the inserted audio frames are inserted into the detected gap until subsequent header information from audio frames in the subsequent audio segment is accessed to determine the audio characteristics of the subsequent audio segment; and
    applying the new set of metadata to the at least one new audio frame.

2. The computer-implemented method of claim 1, wherein the initial audio segment and the subsequent audio segment are part of the same media item.

3. The computer-implemented method of claim 2, wherein the media item comprises an interactive media item that allows out-of-order playback of audio segments.

4. The computer-implemented method of claim 3, wherein the subsequent audio segment comprises an out-of-order audio segment within the media item.

5. The computer-implemented method of claim 1, wherein the initial audio segment and the subsequent audio segment are each part of different media items that are being spliced together.

6. The computer-implemented method of claim 1, wherein the generated new set of metadata comprises adaptive metadata configured to adapt to the audio characteristics of the last audio frame in the initial audio segment and to the audio characteristics of the first audio frame in the subsequent audio segment.

7. The computer-implemented method of claim 6, wherein the new audio frame includes at least two sub-portions over which the audio characteristics of the last audio frame in the initial audio segment are transitioned to the audio characteristics of the first audio frame in the subsequent audio segment using the adaptive metadata.

8. The computer-implemented method of claim 6, wherein the at least one new audio frame comprises at least two new audio frames over which the audio characteristics of the last audio frame in the initial audio segment are transitioned to the audio characteristics of the first audio frame in the subsequent audio segment using the adaptive metadata.

9. The computer-implemented method of claim 6, wherein the adaptive metadata is dynamically inserted into a string of inserted audio frames until the first audio frame of the subsequent audio segment is reached.

10. The computer-implemented method of claim 9, wherein the number of inserted audio frames having adaptive metadata depends on a length of time between playback of the last audio frame in the initial audio segment and the first audio frame in the subsequent audio segment.

11. The computer-implemented method of claim 6, wherein the at least one new audio frame is generated by:
    processing audio stream coding information (ASCI) from known good ASCI into a stored, silent audio frame;
    passing audio frame coding information (AFCI) metadata into the stored, silent audio frame;
    inserting audio block coding information (ABCI) metadata into the stored audio frame;
    padding a zero value into the audio frames to match a frame size determined by a corresponding audio stream bitrate; and
    generating audio error detection or correction codes.

12. A system comprising: at least one physical processor; and physical memory comprising computer-executable instructions that, when executed by the physical processor, cause the physical processor to:
    identify, within at least one media item that includes a plurality of audio segments, an initial audio segment and a subsequent audio segment that follows the initial audio segment;
    access a first set of metadata that corresponds to a last audio frame of the initial audio segment, the first set of metadata including information indicating one or more audio characteristics of the last audio frame of the initial audio segment;
    access a second set of metadata that corresponds to the first audio frame of the subsequent audio segment, the second set of metadata including information indicating one or more audio characteristics of the first audio frame of the subsequent audio segment;
    generate, based on the first and second sets of metadata, a new set of metadata that is based on both the audio characteristics of the last audio frame in the initial audio segment and the audio characteristics of the first audio frame in the subsequent audio segment;
    detect a gap length in time between playback of the initial audio segment and playback of the subsequent audio segment;
    insert at least one new audio frame between the last audio frame of the initial audio segment and the first audio frame of the subsequent audio segment, wherein the first set of metadata is accessed from header information in audio frames of the initial audio segment, and wherein the inserted audio frames are inserted into the detected gap until subsequent header information from audio frames in the subsequent audio segment is accessed to determine the audio characteristics of the subsequent audio segment; and
    apply the new set of metadata to the at least one new audio frame.

13. The system of claim 12, wherein the initial audio segment and the subsequent audio segment are inserted into a pass-through device.

14. The system of claim 13, wherein the insertion into a pass-through device includes:
    copying the first set of metadata into a silent audio frame;
    inserting the silent audio frame after the last audio frame of the initial audio segment;
    copying the first set of metadata into a pre-encoded user interface audio segment having one or more audio frames;
    inserting the pre-encoded user interface audio segment;
    inserting the silent audio frame after the inserted pre-encoded user interface audio segment; and
    removing a specified number of audio frames from the subsequent audio segment to maintain audio/video synchronization.

15. The system of claim 12, further comprising:
    detecting that playback of the initial audio segment or the subsequent audio segment has been directed to stop;
    halting playback of the initial audio segment or the subsequent audio segment at a specified position, the initial audio segment or the subsequent audio segment having a current sound pressure level;
    appending one or more audio frames to the initial audio segment or the subsequent audio segment after the specified position, wherein the appended audio frames include adaptive metadata that gradually redu
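
Claims 8 to 11 tie the number of inserted frames to the gap length, spread the transition across those frames, and zero-pad each frame to a size set by the stream bitrate. The sketch below illustrates that arithmetic in Python under stated assumptions: the frame length of 1536 samples at 48 kHz is an assumed example value, the linear ramp is an assumed transition rule, and the helper names are hypothetical. It also assumes the gap length is known up front, whereas claim 1 describes inserting frames until the subsequent segment's header information is read.

```python
from typing import List


def frames_needed(gap_seconds: float,
                  samples_per_frame: int = 1536,   # assumed example value
                  sample_rate: int = 48000) -> int:  # assumed example value
    """Number of inserted frames required to cover the detected gap."""
    gap_samples = round(gap_seconds * sample_rate)
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-gap_samples // samples_per_frame))


def ramp_values(start: float, end: float, n: int) -> List[float]:
    """Per-frame values stepping from start toward end over n frames.

    The endpoints themselves belong to the real frames on either side,
    so the n inserted frames take the n interior points.
    """
    return [start + (end - start) * (i + 1) / (n + 1) for i in range(n)]


def frame_size_bytes(bitrate_bps: int,
                     samples_per_frame: int = 1536,
                     sample_rate: int = 48000) -> int:
    """Bytes per frame implied by a constant bitrate."""
    return bitrate_bps * samples_per_frame // (sample_rate * 8)


def pad_frame(payload: bytes, size: int) -> bytes:
    """Zero-pad an encoded frame up to the target size."""
    if len(payload) > size:
        raise ValueError("payload is larger than the target frame size")
    return payload + bytes(size - len(payload))


if __name__ == "__main__":
    n = frames_needed(0.1)                      # 100 ms gap
    loudness = ramp_values(-31.0, -23.0, n)     # hypothetical loudness field
    size = frame_size_bytes(384_000)
    frames = [pad_frame(b"\x0b\x77", size) for _ in range(n)]
    print(n, loudness, size, len(frames[0]))
```

The error detection or correction codes mentioned in claim 11 are omitted here because they depend on the codec in use.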

Assignees

Netflix Inc

Inventors

Classifications

  • involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements {(video transcoding H04N19/40; media packet handling at the source H04L65/762)} · CPC title

  • Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers · CPC title

  • G06F3/165 (Primary)

    Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • by decomposing the content in the time domain, e.g. in time segments · CPC title

  • for synchronising with other signals, e.g. video signals · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11416208B2 cover?
The disclosed computer-implemented method for smoothing audio gaps using adaptive metadata identifies an initial audio segment and a subsequent audio segment that follows the initial audio segment. The method accesses a first set of metadata that corresponds to a last audio frame of the initial audio segment and accesses a second set of metadata that corresponds to the first audio frame of the …
Who is the assignee on this patent?
Netflix Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/165. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 16, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).