Identifying music as a particular song
US10809968B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10809968-B2 |
| Application number | US-201816148338-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 1, 2018 |
| Priority date | Oct 3, 2017 |
| Publication date | Oct 20, 2020 |
| Grant date | Oct 20, 2020 |
Official abstract text for this publication.
In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products. A computing device stores reference song characterization data and receives digital audio data. The computing device determines whether the digital audio data represents music and then performs a different process to recognize that the digital audio data represents a particular reference song. The computing device then outputs an indication of the particular reference song.
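The two-stage flow in the abstract can be sketched as follows, using the differing bin counts that claim 1 recites: a coarse frequency conversion feeds the music check, and a finer conversion runs only once music has been detected. This is a minimal illustration under stated assumptions, not the patented implementation. The names `to_frequency_bins` and `identify`, the bin counts 4 and 32, the naive DFT, and the toy detectors are all hypothetical; claims 5 and 7 describe trained machine learning systems in those roles, and the split across two hardware processors is not modeled here.

```python
import cmath
import math
from typing import Callable, List, Optional, Sequence


def to_frequency_bins(samples: Sequence[float], num_bins: int) -> List[float]:
    """Magnitudes of the first `num_bins` DFT bins of a frame of 2 * num_bins samples."""
    n = 2 * num_bins
    if len(samples) < n:
        raise ValueError("need at least 2 * num_bins samples")
    frame = samples[:n]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(num_bins)
    ]


def identify(
    samples: Sequence[float],
    is_music: Callable[[List[float]], bool],
    recognize: Callable[[List[float]], Optional[str]],
    coarse_bins: int = 4,   # assumed value for the "first number" of bins
    fine_bins: int = 32,    # assumed value for the larger "second number" of bins
) -> Optional[str]:
    """Gate the costlier recognizer behind a cheap, low-resolution music check."""
    if not is_music(to_frequency_bins(samples, coarse_bins)):
        return None  # not music: the fine conversion never runs
    return recognize(to_frequency_bins(samples, fine_bins))


# Toy stand-ins for the trained detectors (assumptions, not the patent's models).
def has_peak(bins: List[float]) -> bool:
    return max(bins) > 1.0


def loudest_bin(bins: List[float]) -> str:
    return f"bin {bins.index(max(bins))}"


tone = [math.sin(2 * math.pi * t / 8) for t in range(64)]  # period of 8 samples
print(identify(tone, has_peak, loudest_bin))          # bin 8
print(identify([0.0] * 64, has_peak, loudest_bin))    # None
```

Keeping the first stage coarse is what makes it plausible to run repeatedly without user input, as claim 13 describes, while the finer conversion is reserved for audio that has already passed the music check.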
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: storing, by a computing device, reference song characterization data that identify a plurality of audio characteristics for each reference song in a plurality of reference songs; receiving, by the computing device, digital audio data that represents audio recorded by a microphone; determining, by a first processor of the computing device and using a music determination process, whether the digital audio data represents music, wherein the determining includes converting the digital audio data from a time-domain format into a first frequency-domain format, the first frequency-domain format having a first number of frequency range bins; recognizing, by a second processor of the computing device after determining that the digital audio data represents music, that the digital audio data represents a particular reference song from among the plurality of reference songs, wherein the recognizing includes converting the digital audio data from the time-domain format into a second frequency-domain format having a second number of frequency range bins that is greater than the first number of frequency range bins, wherein the first processor and the second processor are distinct hardware processors included in the computing device, the first processor operating at a lower voltage than the second processor; and outputting, by the computing device in response to recognizing that the digital audio data represents a particular reference song from among the plurality of reference songs, an indication of the particular reference song.
2. The computer-implemented method of claim 1, wherein the plurality of reference songs includes at least ten thousand reference songs, such that the reference song characterization data identify audio characteristics for the at least ten thousand reference songs.
3. The computer-implemented method of claim 1, wherein the reference song characterization data includes reference song characterization values, wherein the reference song characterization values for the reference songs in the plurality of reference songs are limited to a binary one or a binary zero, such that each reference song characterization value is limited to a binary one or a binary zero.
4. The computer-implemented method of claim 1, wherein determining whether the digital audio data represents music includes: using the digital audio data in the first frequency-domain format in the music determination process, and outputting an indication that the digital audio data represents music.
5. The computer-implemented method of claim 4, wherein the music determination process includes executing a machine learning system that has been trained to determine whether audio represents music.
6. The computer-implemented method of claim 4, wherein the frequency-domain conversion process is a first frequency-domain conversion process, and recognizing that the digital audio data represents the particular reference song includes: (i) converting the digital audio data from the time-domain format into the second frequency-domain format during a second frequency-domain conversion process, (ii) using the digital audio data in the second frequency-domain format in a music-characterization process that receives the digital audio data in the second frequency-domain format and outputs a collection of characterization values for the digital audio data, and (iii) comparing the collection of characterization values for the digital audio data to a plurality of characterization values for each of at least a subset of the plurality of reference songs, to determine that the plurality of characterization values for the particular reference song are most relevant to the collection of characterization values for the digital audio data.
7. The computer-implemented method of claim 6, wherein the music-characterization process is performed by a machine learning system that has been trained to characterize music.
8. The computer-implemented method of claim 6, wherein comparing the collection of characterization values for the digital audio data to the plurality of characterization values for each of at least a subset of the plurality of reference songs is performed by accessing the plurality of characterization values that are stored by the computing device for each of the at least subset of the plurality of songs without sending a request for song characterization data to another computing device.
9. The computer-implemented method of claim 6, wherein the computing device compares the collection of characterization values for the digital audio data to the plurality of characterization values for each of only the subset of the plurality of reference songs, and the method further comprises: comparing the characterization values in the collection of characterization values for the digital audio data to the plurality of characterization values for each of the plurality of candidate songs to select the subset of the plurality of reference songs.
10. The computer-implemented method of claim 6, further comprising converting the characterization values in the collection of characterization values for the digital audio data from values that are not all limited to binary zeros and ones to values that are limited to binary zeros and ones; wherein comparing the characterization values in the collection of characterization values for the digital audio to the plurality of characterization values for each of the plurality of candidate songs includes a comparison in which: (a) the characterization values for the digital audio data are limited to binary zeros and binary ones, and (b) the characterization values for each of the plurality of songs are limited to binary zeros and binary ones.
11. The computer-implemented method of claim 6, wherein comparing the collection of characterization values for the digital audio data to the plurality of characterization values for each of at least a subset of the plurality of reference songs includes a comparison in which: (a) the characterization values for the digital audio data include real numbers that represent values other than binary zeros and binary ones, and (b) the characterization values for each of the at least subset of the plurality of songs are limited to binary zeros and binary ones.
12. The computer-implemented method of claim 1, wherein the computing device determines whether the digital audio data represents music without accessing the reference song characterization data that identify the plurality of audio characteristics for each reference song in the plurality of reference songs.
13. The computer-implemented method of claim 1, wherein determining whether the digital audio data represents music includes the computing device determining, multiple times without receipt of user input that initiates the music determination process, that the digital audio data does not represent music, before determining that the digital audio data represents music.
14. The computer-implemented method of claim 1, wherein the second processor operates from a clock signal that is at least an order of magnitude faster than a clock signal from which the first processor operates.
15. The computer-implemented method of claim 1, wherein outputting the indication that the particular reference song is playing includes presenting a name of the particular reference song on a lock screen of the computing device, in an always on screen of the computing device, or in a notification presented over an unlocked screen of the computing device, without user input having prompted the com
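Claims 3, 6, and 10 describe comparing binary characterization values for the recorded audio against stored binary values for each reference song to find the most relevant one. The sketch below illustrates one way such a comparison could work. It assumes Hamming distance as the relevance measure and a fixed threshold for binarization; the claims specify neither, and the names `binarize`, `hamming_distance`, `best_match`, and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(values: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Map real-valued characterization values to 0/1 (the threshold is an assumption)."""
    return [1 if v > threshold else 0 for v in values]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def best_match(query_bits: Sequence[int], references: Dict[str, List[int]]) -> Tuple[str, int]:
    """Return the reference whose bits are closest to the query, with its distance."""
    return min(
        ((name, hamming_distance(query_bits, bits)) for name, bits in references.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical on-device reference data; no request to another device is made.
references = {
    "song_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "song_b": [0, 1, 0, 0, 1, 1, 0, 1],
}
query_values = [0.9, -0.2, 0.4, 0.7, -0.5, 0.1, 0.3, -0.8]
print(best_match(binarize(query_values), references))  # ('song_a', 1)
```

Restricting stored values to single bits keeps the per-song footprint small, which is consistent with holding data for at least ten thousand reference songs on the device as claim 2 recites.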
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using metadata automatically derived from the content · CPC title
Management of the audio stream, e.g. setting of volume, audio stream path · CPC title
Filtering based on additional data, e.g. user or group profiles · CPC title