Video content alignment

US9984728B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9984728-B2
Application number  US-201614997351-A
Country             US
Kind code           B2
Filing date         Jan 15, 2016
Priority date       Sep 26, 2014
Publication date    May 29, 2018
Grant date          May 29, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments identify differences between frame sequences of a video. For example, to determine a difference between two versions of a video, a fingerprint of each frame of the two versions is generated. From the fingerprints, a run-length encoded representation of each version is generated. The fingerprints which appear only once (i.e., unique fingerprints) in the entire video are identified from each version and compared to identify matching unique fingerprints across versions. The matching unique fingerprints are sorted and filtered to determine split points, which are used to align the two versions of the video. Accordingly, each version is segmented into smaller frame sequences using the split points. Once segmented, the individual frames of each segment are aligned across versions using a dynamic programming algorithm. After aligning the segments at a frame level, the segments are reassembled to generate a global alignment output.
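
As a rough illustration of the steps in the abstract, the Python sketch below run-length encodes per-frame fingerprints, keeps fingerprints that occur in exactly one run, matches them across two versions, and filters the matches so they appear in the same order in both versions. It is not the patented implementation. The function names are hypothetical, fingerprints are assumed to be hashable values, and "unique" is interpreted here as occurring in a single run. The order filter uses a longest increasing subsequence, which is equivalent to a longest-common-subsequence filter when each matched fingerprint occurs once per version.

```python
from collections import Counter
from itertools import groupby


def run_length_encode(fingerprints):
    """Collapse consecutive identical fingerprints into (value, start, length) runs."""
    runs, index = [], 0
    for value, group in groupby(fingerprints):
        length = sum(1 for _ in group)
        runs.append((value, index, length))
        index += length
    return runs


def unique_runs(runs):
    """Map fingerprint -> start frame for fingerprints that occur in exactly one run.

    Assumption: a run of identical consecutive frames counts as one occurrence.
    """
    counts = Counter(value for value, _, _ in runs)
    return {value: start for value, start, _ in runs if counts[value] == 1}


def keep_consistent_order(pairs):
    """Keep the longest subsequence of (pos_a, pos_b) pairs with increasing pos_b.

    `pairs` must already be sorted by pos_a. Simple O(n^2) dynamic programming.
    """
    if not pairs:
        return []
    best = [1] * len(pairs)
    prev = [-1] * len(pairs)
    for i in range(len(pairs)):
        for j in range(i):
            if pairs[j][1] < pairs[i][1] and best[j] + 1 > best[i]:
                best[i], prev[i] = best[j] + 1, j
    i = max(range(len(pairs)), key=best.__getitem__)
    kept = []
    while i != -1:
        kept.append(pairs[i])
        i = prev[i]
    return kept[::-1]


def candidate_split_points(fps_a, fps_b):
    """Return (frame_in_a, frame_in_b) pairs usable as split points."""
    unique_a = unique_runs(run_length_encode(fps_a))
    unique_b = unique_runs(run_length_encode(fps_b))
    shared = unique_a.keys() & unique_b.keys()
    matches = sorted((unique_a[v], unique_b[v]) for v in shared)
    return keep_consistent_order(matches)


# Toy input: strings stand in for real frame fingerprints.
version_a = ["a", "a", "b", "c", "c", "d", "x", "e"]
version_b = ["a", "b", "b", "q", "d", "c", "e"]
split_points = candidate_split_points(version_a, version_b)
```

In this toy input, the fingerprints `c` and `d` appear in opposite order in the two versions, so the filter drops one of them. Each surviving pair marks a position where both versions could be cut into shorter segments for frame-level alignment.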

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving a first video file and a second video file, the first video file including a first plurality of frames and the second video file including a second plurality of frames, each frame of the first plurality of frames segmented into one or more cells, each frame of the second plurality of frames segmented into one or more cells; generating a first fingerprint for each frame of a first subset of frames of the first plurality of frames based at least in part on an intensity value associated with the one or more cells of each frame of the first subset of frames; generating a second fingerprint for each frame of a second subset of frames of the second plurality of frames based at least in part on an intensity value associated with the one or more cells of each frame of the second subset of frames; generating a first encoded representation of the first fingerprints for the first subset of frames and a second encoded representation of the second fingerprints for the second subset of frames; identifying a first set of unique fingerprints from the first encoded representation and a second set of unique fingerprints from the second encoded representation; comparing the first set of unique fingerprints to the second set of unique fingerprints; segmenting the first video file into a plurality of first segments and the second video file into a plurality of second segments based at least in part on comparing the first set of unique fingerprints to the second set of unique fingerprints; and aligning each first segment of the plurality of first segments with a corresponding second segment of the plurality of second segments; determining a set of split points, the split points being used to match unique fingerprints between the first video file and the second video file based at least in part on comparing the first set of unique fingerprints to the second set of unique fingerprints; sorting, by video file, the set of split points; aligning, using a dynamic programming algorithm, first split points of the set of split points from the first video file with second split points of the set of split points from the second video file; and removing split points of the set of split points that do not appear in an output generated by the dynamic programming algorithm.

2. The computer-implemented method of claim 1, wherein generating the first fingerprint for each frame includes: computing the intensity value associated with the one or more cells of each of the first subset of frames, the intensity value being an average intensity value of each cell; and comparing the intensity value of a first cell against each of a subset of the other cells to generate the first fingerprint.

3. The computer-implemented method of claim 1, wherein generating the second fingerprint for each frame includes: computing the intensity value associated with the one or more cells of each of the second subset of frames, the intensity value being an average intensity value of each cell; and comparing the intensity value of a second cell against each of a subset of the other cells to generate the second fingerprint.

4. The computer-implemented method of claim 1, wherein the dynamic programming algorithm is a Gotoh's sequence alignment algorithm.

5. The computer-implemented method of claim 1, further comprising: comparing a first time duration of a split point from the first video file to a second time duration of a corresponding split point of the second video file; determining that the first time duration of the split point from the first video file does not match the second time duration of the split point from the second video file; and removing the split point from the first video file and the split point from the second video file from the set of split points.

6. The computer-implemented method of claim 1, wherein each split point of the set of split points corresponds to a sequence of frames of a particular scene in the video file.

7. The computer-implemented method of claim 1, wherein determining the set of split points further comprises: identifying a plurality of matching unique fingerprints based at least on comparing the first set of unique fingerprints to the second set of unique fingerprints; sorting the plurality of matching unique fingerprints by time; and filtering the plurality of matching unique fingerprints using a Longest Common Subsequences (LCS) algorithm; and determining the set of split points based on the filtering.

8. The computer-implemented method of claim 1, further comprising: obtaining frame annotations for a set of frames from the first video file; and propagating the frame annotations to a set of corresponding frames in the second video file based on aligning each first segment of the plurality of first segments with corresponding second segment of the plurality of second segments.

9. The computer-implemented method of claim 1, wherein aligning each first segment of the plurality of first segments with corresponding second segment of the plurality of second segments includes: inserting at least one frame gap in a location of the first video file corresponding to at least one inserted frame in the second video file.

10. A computing system, comprising: a processor; and memory including instructions that, when executed by the processor, cause the computing system to: receive a first video file and a second video file, the first video file including a first plurality of frames and the second video file including a second plurality of frames, each frame of the first plurality of frames segmented into one or more cells, each frame of the second plurality of frames segmented into one or more cells; generate a first fingerprint for each frame of a first subset of frames of the first plurality of frames based at least in part on an intensity value associated with the one or more cells of each frame of the first subset of frames; generate a second fingerprint for each frame of a second subset of frames of the second plurality of frames based at least in part on an intensity value associated with the one or more cells of each frame of the second subset of frames; generate a first encoded representation of the first fingerprints for the first video file input and a second encoded representation of the second fingerprints for the second video file input; identify a first set of unique fingerprints from the first encoded representation and a second set of unique fingerprints from the second encoded representation; compare the first set of unique fingerprints to the second set of unique fingerprints; segment the first video file input into a plurality of first segments and the second video file input into a plurality of second segments based at least in part on comparing the first set of unique fingerprints to the second set of unique fingerprints; and align each first segment of the plurality of first segments with a corresponding second segment of the plurality of second segments, wherein generating the version fingerprint for each frame of the version of the video file further includes: segmenting each frame into a plurality of cells; and positioning a rectangular canonical window divided into the plurality of segments relative to a center of the frame, the frame having a first size and the rectangular canonical window having a second size that is smaller than the first size.

11. The computing system of claim 10, wherein the instructions, when executed by the processor, further enable the computing system to: identify a plurality of matching unique fingerprints based at least on comparing the first set of unique fingerprints to the second set of unique fingerprints;
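
The fingerprinting step in claims 1 and 2 (average intensity per cell, then comparing cells against one another) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the frame is assumed to be a grayscale image stored as a list of rows, the 2x2 grid is arbitrary, every cell is compared against every later cell, and the comparison results are packed into an integer. The names `cell_means` and `frame_fingerprint` are hypothetical, and the centered canonical window of claim 10 is not modeled.

```python
from itertools import combinations


def cell_means(frame, rows, cols):
    """Average intensity of each cell in a rows x cols grid over the frame.

    Assumes the frame has at least `rows` pixel rows and `cols` pixel columns.
    """
    height, width = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            pixels = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(pixels) / len(pixels))
    return means


def frame_fingerprint(frame, rows=2, cols=2):
    """Pack pairwise 'cell i is brighter than cell j' comparisons into an int."""
    means = cell_means(frame, rows, cols)
    bits = 0
    for i, j in combinations(range(len(means)), 2):
        bits = (bits << 1) | int(means[i] > means[j])
    return bits


# Toy 4x4 grayscale frame whose four cells average 10, 200, 50 and 120.
frame = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [50, 50, 120, 120],
    [50, 50, 120, 120],
]
brighter = [[pixel + 20 for pixel in row] for row in frame]

fingerprint = frame_fingerprint(frame)
fingerprint_brighter = frame_fingerprint(brighter)
```

Because the fingerprint in this sketch depends only on which cells are brighter than which, a uniform brightness shift leaves it unchanged: `fingerprint` and `fingerprint_brighter` are equal here. That kind of tolerance is one plausible reason to compare cells rather than store raw intensities.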

Assignees

A9 Com Inc

Inventors

Classifications

  • used signal is a video-frame or a video-field (P.I.P) · CPC title

  • Electronic editing of digitised analogue information signals, e.g. audio or video signals · CPC title

  • Indexing; Addressing; Timing or synchronising; Measuring tape travel · CPC title

  • Indicating arrangements {(indicating means incorporated in magazine or cassette G11B23/046 and G11B23/0875; indicating measured values in general G01D)} · CPC title

  • Coded signal uses a correlation function for detection · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9984728B2 cover?
Various embodiments identify differences between frame sequences of a video. For example, to determine a difference between two versions of a video, a fingerprint of each frame of the two versions is generated. From the fingerprints, a run-length encoded representation of each version is generated. The fingerprints which appear only once (i.e., unique fingerprints) in the entire video are ident…
Who is the assignee on this patent?
A9 Com Inc
What technology area does this patent fall under?
Primary CPC classification G11B27/3081. Mapped technology areas include Physics.
When was this patent published?
Publication date May 29, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).