Sound representation via winner-take-all coding of auditory spectra
US-9158842-B1 · Oct 13, 2015 · US
| Field | Value |
|---|---|
| Publication number | US-11954149-B2 |
| Application number | US-202218050326-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 27, 2022 |
| Priority date | Mar 31, 2017 |
| Publication date | Apr 9, 2024 |
| Grant date | Apr 9, 2024 |
Official abstract text for this publication.
Techniques of content unification are disclosed. In some example embodiments, a computer-implemented method comprises: determining clusters based on a comparison of a plurality of audio content using a first matching criteria, each cluster of the plurality of clusters comprising at least two audio content from the plurality of audio content; for each cluster of the plurality of clusters, determining a representative audio content for the cluster from the at least two audio content of the cluster; loading the corresponding representative audio content of each cluster into an index; matching the query audio content to one of the representative audio contents using a first matching criteria; determining the corresponding cluster of the matched representative audio content; and identifying a match between the query audio content and at least one of the audio content of the cluster of the matched representative audio content based on a comparison using a second matching criteria.
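The two-stage flow in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: fingerprints are modeled as 32-bit integers, both matching criteria are bit-error-rate thresholds (`loose` for the first, `strict` for the second), clustering is a simple greedy pass, and the first member of a cluster serves as its representative. The names `bit_error_rate`, `build_clusters`, and `match` are hypothetical.

```python
from dataclasses import dataclass, field


def bit_error_rate(a: int, b: int, nbits: int = 32) -> float:
    """Fraction of differing bits between two nbits-wide fingerprints."""
    return bin((a ^ b) & ((1 << nbits) - 1)).count("1") / nbits


@dataclass
class Cluster:
    representative: str                      # name of the representative content
    members: list[str] = field(default_factory=list)


def build_clusters(fingerprints: dict[str, int], loose: float) -> list[Cluster]:
    """Greedy clustering (an assumption): each item joins the first cluster
    whose representative passes the first matching criterion."""
    clusters: list[Cluster] = []
    for name, fp in fingerprints.items():
        for cluster in clusters:
            if bit_error_rate(fp, fingerprints[cluster.representative]) <= loose:
                cluster.members.append(name)
                break
        else:
            clusters.append(Cluster(representative=name, members=[name]))
    # The abstract describes clusters of at least two items; drop singletons here.
    return [c for c in clusters if len(c.members) >= 2]


def match(query: int, fingerprints: dict[str, int], clusters: list[Cluster],
          loose: float, strict: float) -> list[str]:
    """Stage 1: compare the query only against representatives (the index).
    Stage 2: compare against every member of the matched cluster."""
    best = min(
        clusters,
        key=lambda c: bit_error_rate(query, fingerprints[c.representative]),
        default=None,
    )
    if best is None or bit_error_rate(query, fingerprints[best.representative]) > loose:
        return []
    return [m for m in best.members
            if bit_error_rate(query, fingerprints[m]) <= strict]


# Toy data: three variants of one song and two variants of another.
fingerprints = {
    "song_a_studio": 0xFFFF0000,
    "song_a_remaster": 0xFFFF0001,
    "song_a_live": 0xFFFF00FF,
    "song_b": 0x0000FFFF,
    "song_b_edit": 0x0000FFFE,
}
clusters = build_clusters(fingerprints, loose=0.3)
result = match(0xFFFF0003, fingerprints, clusters, loose=0.3, strict=0.07)
print(result)
```

In this sketch the index holds one entry per cluster rather than one per recording, so the first stage touches two fingerprints instead of five; the second stage then looks only inside the matched cluster.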
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method comprising: determining, by at least one hardware processor, a representative audio content for a cluster, wherein the cluster comprises at least two audio contents; loading, by the at least one hardware processor, the representative audio content into an index, wherein the representative audio content is stored in association with a hash value, and wherein the hash value is associated with a candidate reference identifier; removing, by the at least one hardware processor, candidate reference identifiers that appear less than a threshold number of times; in response to removing the candidate reference identifiers that appear less than a threshold number of times, generating a first comparison, by the at least one hardware processor, of a query audio content to each representative audio content associated with a remaining set of candidate reference identifiers, wherein the remaining set of candidate reference identifiers does not include the removed candidate reference identifiers, and wherein the first comparison is generated using a first matching criteria; and matching, by the at least one hardware processor, the query audio content to one of the representative audio content based on the generated first comparison.

2. The computer-implemented method of claim 1, further comprising: determining, by at least one hardware processor, a plurality of clusters based on a comparison of a plurality of audio contents using the first matching criteria, each cluster of the plurality of clusters comprising at least two audio contents from the plurality of audio contents, and wherein the determining the plurality of clusters comprises comparing fingerprint data of each of the plurality of audio contents using the first matching criteria.

3. The computer-implemented method of claim 2, wherein the first comparison includes a granular comparison based on a sub-cluster of each of the plurality of clusters.

4. The computer-implemented method of claim 2, wherein the fingerprint data comprises at least one of: (i) nano-fingerprint; (ii) a micro-fingerprint; and (iii) a full fingerprint.

5. The computer-implemented method of claim 1, wherein the first comparison comprises a comparison of at least one of: (i) a content duration ratio; (ii) a bit error rate at a matching location; and (iii) or a length of matching positions.

6. The computer-implemented method of claim 1, wherein the hash value is based on permutations of a binary vector formed using a spectral representation of the audio content.

7. The computer-implemented method of claim 1, further comprising determining, by the at least one hardware processor, the corresponding cluster of the matched one of the representative audio content.

8. The computer-implemented method of claim 7, further including generating a second comparison, by the at least one hardware processor, of the query audio content to each one of the at least two audio contents of a corresponding cluster of the matched one of the representative audio content using a second matching criteria different from the first matching criteria.

9. The computer-implemented method of claim 8, further including identifying, by the at least one hardware processor, a match between the query audio content and at least one of the audio contents of the corresponding cluster of the matched one of the representative audio content based on the generated second comparison of the corresponding cluster using the first and the second matching criteria, the match used to determine similarity of the queried audio content to the representative audio content.

10. The computer-implemented method of claim 1, wherein the corresponding representative audio content of the cluster is the audio content that is loaded into the index.

11. The computer-implemented method of claim 1, wherein the matching of the query audio content to one of the representative audio content comprises comparing fingerprint data of the query audio content with fingerprint data of each of the representative audio content in the index using the first matching criteria.

12. The computer-implemented method of claim 1, wherein each one of the at least two audio contents comprises a song.

13. A system comprising: at least one processor; and a non-transitory computer-readable medium storing executable instructions that, when executed, cause the at least one processor to perform operations comprising: determining a representative audio content for a cluster, wherein the cluster comprises at least two audio contents; loading the representative audio content into an index, wherein the representative audio content is stored in association with a hash value, and wherein the hash value is associated with a candidate reference identifier; removing candidate reference identifiers that appear less than a threshold number of times; in response to removing the candidate reference identifiers that appear less than a threshold number of times, generating a first comparison of a query audio content to each representative audio content associated with a remaining set of candidate reference identifiers, wherein the remaining set of candidate reference identifiers does not include the removed candidate reference identifiers, and wherein the first comparison is generated using a first matching criteria; and matching the query audio content to one of the representative audio content based on the generated first comparison.

14. The system of claim 13, wherein the operations further comprise: determining a plurality of clusters based on a comparison of a plurality of audio contents using the first matching criteria, each cluster of the plurality of clusters comprising at least two audio contents from the plurality of audio contents, and wherein the determining the plurality of clusters comprises comparing fingerprint data of each of the plurality of audio contents using the first matching criteria.

15. The system of claim 14, wherein the first comparison includes a granular comparison based on a sub-cluster of each of the plurality of clusters.

16. The system claim 14, wherein the fingerprint data comprises at least one of: (i) nano-fingerprint; (ii) a micro-fingerprint; and (iii) a full fingerprint.

17. The system of claim 13, wherein the first comparison comprises a comparison of at least one of: (i) a content duration ratio; (ii) a bit error rate at a matching location; and (iii) or a length of matching positions.

18. The system of claim 14, wherein the hash value is based on permutations of a binary vector formed using a spectral representation of the audio content.

19. The system of claim 14, wherein each one of the at least two audio contents comprises a song.

20. A non-transitory machine-readable storage medium, tangibly embodying a set of instructions that, when executed by at least one processor, causes the at least one processor to perform operations comprising: determining a representative audio content for a cluster, wherein the cluster comprises at least two audio contents; loading the representative audio content into an index, wherein the representative audio content is stored in association with a hash value, and wherein the hash value is associated with a candidate reference identifier; removing candidate reference identifiers that appear less than a threshold number of times; in response to removing the candidate reference identifiers that appear less than a threshold number of times, generating a first comp
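The candidate-pruning step recited in claim 1, together with the permutation-based hash of claim 6, can be sketched as follows. The claims do not specify how the hash is computed from the permutations, so this sketch assumes a min-wise style scheme: for each fixed permutation, the hash is the position of the first set bit in the permuted binary vector. The function names, the 16-element vectors, and the `min_hits` parameter are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_permutations(dim: int, count: int, seed: int = 0) -> list[list[int]]:
    """Fixed, reproducible permutations of the indices 0..dim-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(count):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms


def permutation_hashes(bits: list[int], perms: list[list[int]]) -> list[tuple[int, int]]:
    """One hash per permutation: (permutation number, position of the first 1
    in the permuted vector, or -1 if the vector is all zeros)."""
    hashes = []
    for k, perm in enumerate(perms):
        first = next((pos for pos, src in enumerate(perm) if bits[src]), -1)
        hashes.append((k, first))
    return hashes


def build_index(references: dict[str, list[int]], perms: list[list[int]]):
    """Map each hash value to the reference identifiers stored under it."""
    index = defaultdict(set)
    for ref_id, bits in references.items():
        for h in permutation_hashes(bits, perms):
            index[h].add(ref_id)
    return index


def surviving_candidates(query_bits, index, perms, min_hits: int) -> set[str]:
    """Count how often each reference identifier appears across the query's
    hash lookups and drop those appearing fewer than min_hits times."""
    counts = Counter()
    for h in permutation_hashes(query_bits, perms):
        counts.update(index.get(h, ()))
    return {ref_id for ref_id, n in counts.items() if n >= min_hits}


# Toy binary vectors standing in for thresholded spectral representations.
references = {
    "rep_a": [1] * 8 + [0] * 8,
    "rep_b": [0] * 8 + [1] * 8,
}
perms = make_permutations(dim=16, count=8)
index = build_index(references, perms)
survivors = surviving_candidates([1] * 8 + [0] * 8, index, perms, min_hits=4)
print(survivors)
```

Only the identifiers in `survivors` would go on to the more expensive first comparison. Because the two toy vectors have disjoint set bits, their first-set-bit positions can never coincide under any permutation, so `rep_b` collects no hits for this query.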
using metadata automatically derived from the content · CPC title
Indexing; Data structures therefor; Storage structures · CPC title