US9367887B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9367887-B1 |
| Application number | US-201514880762-A |
| Country | US |
| Kind code | B1 |
| Filing date | Oct 12, 2015 |
| Priority date | Sep 5, 2013 |
| Publication date | Jun 14, 2016 |
| Grant date | Jun 14, 2016 |
Official abstract text for this publication.
Implementations are provided herein relating to audiovisual matching. Audio and video channel data is merged to create a single multi-channel fingerprint used to match media content. Audio channel data is used to generate audio fingerprints. Video channel data is used to generate video fingerprints. Multi-channel fingerprints can then be generated based on the audio channel fingerprints and video channel fingerprints. In this sense, entropy can be increased while the multi-channel fingerprint can be more resistant to noise.
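The abstract does not say how the two fingerprint streams are merged. Claim 11 lists "a common time offset, a closest in time offset, or a spatial similarity" as possible bases for combining them. The following is a minimal sketch of the closest-in-time option only. The `(time, value)` tuple representation, the function name `combine_closest_in_time`, and the `max_gap` tolerance are assumptions made for illustration and do not come from the patent.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical representation: (time offset in seconds, fingerprint value).
Fingerprint = Tuple[float, int]


def combine_closest_in_time(
    audio: List[Fingerprint],
    video: List[Fingerprint],
    max_gap: float = 0.5,  # assumed tolerance, not specified in the patent
) -> List[Tuple[float, int, int]]:
    """Pair each audio fingerprint with the video fingerprint closest in time.

    Returns (audio time, audio value, video value) triples. Audio fingerprints
    with no video fingerprint within `max_gap` seconds are skipped.
    """
    video_sorted = sorted(video)
    video_times = [t for t, _ in video_sorted]
    combined = []
    for t_audio, fp_audio in sorted(audio):
        i = bisect_left(video_times, t_audio)
        # The closest video time is one of the two neighbors of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(video_sorted)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(video_times[k] - t_audio))
        if abs(video_times[j] - t_audio) <= max_gap:
            combined.append((t_audio, fp_audio, video_sorted[j][1]))
    return combined


audio_fps = [(0.0, 0xA1), (1.0, 0xA2), (5.0, 0xA3)]
video_fps = [(0.1, 0xB1), (0.9, 0xB2)]
pairs = combine_closest_in_time(audio_fps, video_fps)
```

In this example the audio fingerprint at 5.0 seconds has no video fingerprint within the tolerance, so only two combined entries are produced.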
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for generating multi-channel fingerprints, the method comprising: receiving audio channel data and video channel data associated with a video; generating a set of audio fingerprints based on the audio channel data; generating a set of mean frames of the video based on a sliding time window applied to the video channel data; generating a set of video fingerprints based on the set of mean frames of the video; and generating a set of multi-channel fingerprints based on both the set of audio fingerprints and the set of video fingerprints.
2. The computer-implemented method of claim 1, further comprising: generating an audio spectrogram based on the audio channel data; and generating a downscaled audio spectrogram based on the audio spectrogram, wherein the set of audio fingerprints is generated based on the downscaled audio spectrogram and wherein audio fingerprints in the set of audio fingerprints are min-hashes.
3. The computer-implemented method of claim 2, further comprising: generating a set of wavelet min-hashes based on the set of mean frames, wherein the set of video fingerprints is generated based on the set of wavelet min-hashes and wherein video fingerprints in the set of video fingerprints are min-hashes.
4. The computer-implemented method of claim 3, wherein generating the set of multi-channel fingerprints is based on concatenating min-hashes of audio fingerprints from the set of audio fingerprints and the min-hashes of video fingerprints from the set of video fingerprints.
5. The computer-implemented method of claim 4, wherein generating the set of multi-channel fingerprints is based on a consistent output rate.
6. The computer-implemented method of claim 3, further comprising: generating a set of weighted audio min-hashes based on the set of audio fingerprints, an aggregate hash time window, and an audio channel identifier; generating a set of weighted video min-hashes based on the set of video fingerprints, the aggregate hash time window, and a video channel identifier; and generating a set of concatenated pairs based on the set of weighted audio min-hashes and the set of weighted video min-hashes wherein generating the set of multi-channel fingerprints is based on the set of concatenated pairs.
7. The computer-implemented method of claim 6, wherein concatenated pairs in the set of concatenated pairs are comprised of at least one weighted audio min-hash from the set of weighted audio min-hashes and at least one weighted video min-hash from the set of weighted video min-hashes.
8. The computer-implemented method of claim 2, further comprising: generating a set of interest points based on the audio spectrogram; and generating a set of descriptors based on the set of interest points; wherein the set of audio fingerprints is generated based on the set of descriptors.
9. The computer-implemented method of claim 8, further comprising: generating a set of pairs wherein each pair in the set of pairs contains an anchor interest point and a paired interest point; generating a third point for each pair in the set of pairs based on a search path wherein the third point is a time-frequency point of a maxima along the search path; generating a set of triples wherein respective triples in the set of triples contain the anchor interest point, the paired interest point and the third point; determining a binary bit associated with each triple in the set of triples based on whether the third point lies on a first half of the search path or a second half of the search path; and wherein generating descriptors in the set of descriptors is based on a triple in the set of triples and contains a quantized frequency of the anchor interest point, a first quantized frequency ratio of a frequency of the paired interest point and a frequency of the anchor interest point, a second quantized frequency ratio of a frequency of the third point and the frequency of the anchor interest point, a time span between the anchor interest point and the paired interest point, and the binary bit associated with the triple.
10. The computer-implemented method of claim 8, further comprising: generating a set of video interest points based on the set of mean frames; and generating a set of quantized video interest points based on the set of video interest points wherein the set of video fingerprints is generated based on the set of quantized video interest points.
11. The computer-implemented method of claim 10, wherein generating the set of multi-channel fingerprints comprises: combining an audio fingerprint from the set of audio fingerprints and a video fingerprint from the set of video fingerprints based on at least one of a common time offset, a closest in time offset, or a spatial similarity.
12. A computer program product comprising a non-transitory computer-readable storage medium storing executable code for generating multi-channel fingerprints, the code when executed by a computer processor causes the computer processor to perform steps comprising: receiving audio channel data and video channel data associated with a video; generating a set of audio fingerprints based on the audio channel data; generating a set of mean frames of the video based on a sliding time window applied to the video channel data; generating a set of video fingerprints based on the set of mean frames of the video; and generating a set of multi-channel fingerprints based on both the set of audio fingerprints and the set of video fingerprints.
13. The computer program product of claim 12, wherein the code when executed by the computer processor causes the computer processor to perform further steps comprising: generating an audio spectrogram based on the audio channel data; and generating a downscaled audio spectrogram based on the audio spectrogram, wherein the set of audio fingerprints is generated based on the downscaled audio spectrogram and wherein audio fingerprints in the set of audio fingerprints are min-hashes.
14. The computer program product of claim 13, wherein the code when executed by the computer processor causes the computer processor to perform further steps comprising: generating a set of wavelet min-hashes based on the set of mean frames, wherein the set of video fingerprints is generated based on the set of wavelet min-hashes and wherein video fingerprints in the set of video fingerprints are min-hashes.
15. The computer program product of claim 14, wherein generating the set of multi-channel fingerprints is based on concatenating min-hashes of audio fingerprints from the set of audio fingerprints and the min-hashes of video fingerprints from the set of video fingerprints.
16. The computer program product of claim 15, wherein generating the set of multi-channel fingerprints is based on a consistent output rate.
17. The computer program product of claim 14, wherein the code when executed by the computer processor causes the computer processor to perform further steps comprising: generating a set of weighted audio min-hashes based on the set of audio fingerprints, an aggregate hash time window, and an audio channel identifier; generating a set of weighted video min-hashes based on the set of video fingerprints, the aggregate hash time window, and a video channel identifier; and generating a set of concatenated pairs based on the set of weighted audio min-hashes and the set of weighted video min-hashes wherein generating the set of multi-channel fingerprints is based on the set of concatenated pairs.
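Claims 1 and 12 recite "generating a set of mean frames of the video based on a sliding time window applied to the video channel data" without giving further detail in this preview. One plausible reading is a per-pixel average over each run of consecutive frames, which can be sketched as follows. The sketch assumes a constant frame rate, so the time window is expressed as a frame count, and assumes grayscale frames stored as nested lists. The function name `mean_frames` and the `window` and `step` parameters are illustrative and do not come from the patent.

```python
from typing import List

# Hypothetical representation: a grayscale frame as rows of pixel intensities.
Frame = List[List[float]]


def mean_frames(frames: List[Frame], window: int, step: int = 1) -> List[Frame]:
    """Average each run of `window` consecutive frames, advancing by `step` frames.

    Assumes all frames have the same dimensions and a constant frame rate,
    so a time window of T seconds corresponds to window = T * fps frames.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    result = []
    for start in range(0, len(frames) - window + 1, step):
        chunk = frames[start:start + window]
        rows, cols = len(chunk[0]), len(chunk[0][0])
        # Per-pixel arithmetic mean across the frames in this window.
        mean = [
            [sum(f[r][c] for f in chunk) / window for c in range(cols)]
            for r in range(rows)
        ]
        result.append(mean)
    return result


# Three 1x2 frames; a two-frame window yields two mean frames.
video = [[[0.0, 2.0]], [[2.0, 4.0]], [[4.0, 6.0]]]
means = mean_frames(video, window=2)
```

Averaging over a window smooths out single-frame noise before video fingerprints are computed, which is one reason a scheme might fingerprint mean frames rather than raw frames.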
Image watermarking · CPC title
Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames · CPC title
using low-level visual features of the video content · CPC title
using audio features · CPC title
using metadata automatically derived from the content · CPC title