Determining that audio includes music and then identifying the music as a particular song

US10809968B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10809968-B2
Application number  US-201816148338-A
Country             US
Kind code           B2
Filing date         Oct 1, 2018
Priority date       Oct 3, 2017
Publication date    Oct 20, 2020
Grant date          Oct 20, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products. A computing device stores reference song characterization data and receives digital audio data. The computing device determines whether the digital audio data represents music and then performs a different process to recognize that the digital audio data represents a particular reference song. The computing device then outputs an indication of the particular reference song.
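The two-stage flow in the abstract (first decide whether the audio is music, then run a separate recognition process) can be sketched roughly as below. This is a minimal illustration, not the patented implementation: `identify_song`, `toy_is_music`, and `toy_recognize` are hypothetical names, and the amplitude threshold is only a placeholder for a real music detector.

```python
from typing import Callable, Optional, Sequence

Audio = Sequence[float]


def identify_song(
    samples: Audio,
    is_music: Callable[[Audio], bool],
    recognize: Callable[[Audio], Optional[str]],
) -> Optional[str]:
    """Run the cheap music check first; only call the recognizer if it passes."""
    if not is_music(samples):
        return None
    return recognize(samples)


def toy_is_music(samples: Audio) -> bool:
    # Placeholder gate (assumption): mean absolute amplitude above a threshold.
    # A real detector would look at spectral content, not just loudness.
    if not samples:
        return False
    return sum(abs(s) for s in samples) / len(samples) > 0.1


def toy_recognize(samples: Audio) -> Optional[str]:
    # Placeholder recognizer (assumption): always reports the same title.
    return "Hypothetical Song Title"


if __name__ == "__main__":
    print(identify_song([0.0] * 8, toy_is_music, toy_recognize))
    print(identify_song([0.5, -0.5] * 4, toy_is_music, toy_recognize))
```

Keeping the gate and the recognizer as separate callables mirrors the abstract's point that the two steps are different processes, so the more expensive one runs only when needed.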

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: storing, by a computing device, reference song characterization data that identify a plurality of audio characteristics for each reference song in a plurality of reference songs; receiving, by the computing device, digital audio data that represents audio recorded by a microphone; determining, by a first processor of the computing device and using a music determination process, whether the digital audio data represents music, wherein the determining includes converting the digital audio data from a time-domain format into a first frequency-domain format, the first frequency-domain format having a first number of frequency range bins; recognizing, by a second processor of the computing device after determining that the digital audio data represents music, that the digital audio data represents a particular reference song from among the plurality of reference songs, wherein the recognizing includes converting the digital audio data from the time-domain format into a second frequency-domain format having a second number of frequency range bins that is greater than the first number of frequency range bins, wherein the first processor and the second processor are distinct hardware processors included in the computing device, the first processor operating at a lower voltage than the second processor; and outputting, by the computing device in response to recognizing that the digital audio data represents a particular reference song from among the plurality of reference songs, an indication of the particular reference song.

2. The computer-implemented method of claim 1, wherein the plurality of reference songs includes at least ten thousand reference songs, such that the reference song characterization data identify audio characteristics for the at least ten thousand reference songs.

3. The computer-implemented method of claim 1, wherein the reference song characterization data includes reference song characterization values, wherein the reference song characterization values for the reference songs in the plurality of reference songs are limited to a binary one or a binary zero, such that each reference song characterization value is limited to a binary one or a binary zero.

4. The computer-implemented method of claim 1, wherein determining whether the digital audio data represents music includes: using the digital audio data in the first frequency-domain format in the music determination process, and outputting an indication that the digital audio data represents music.

5. The computer-implemented method of claim 4, wherein the music determination process includes executing a machine learning system that has been trained to determine whether audio represents music.

6. The computer-implemented method of claim 4, wherein the frequency-domain conversion process is a first frequency-domain conversion process, and recognizing that the digital audio data represents the particular reference song includes: (i) converting the digital audio data from the time-domain format into the second frequency-domain format during a second frequency-domain conversion process, (ii) using the digital audio data in the second frequency-domain format in a music-characterization process that receives the digital audio data in the second frequency-domain format and outputs a collection of characterization values for the digital audio data, and (iii) comparing the collection of characterization values for the digital audio data to a plurality of characterization values for each of at least a subset of the plurality of reference songs, to determine that the plurality of characterization values for the particular reference song are most relevant to the collection of characterization values for the digital audio data.

7. The computer-implemented method of claim 6, wherein the music-characterization process is performed by a machine learning system that has been trained to characterize music.

8. The computer-implemented method of claim 6, wherein comparing the collection of characterization values for the digital audio data to the plurality of characterization values for each of at least a subset of the plurality of reference songs is performed by accessing the plurality of characterization values that are stored by the computing device for each of the at least subset of the plurality of songs without sending a request for song characterization data to another computing device.

9. The computer-implemented method of claim 6, wherein the computing device compares the collection of characterization values for the digital audio data to the plurality of characterization values for each of only the subset of the plurality of reference songs, and the method further comprises: comparing the characterization values in the collection of characterization values for the digital audio data to the plurality of characterization values for each of the plurality of candidate songs to select the subset of the plurality of reference songs.

10. The computer-implemented method of claim 6, further comprising converting the characterization values in the collection of characterization values for the digital audio data from values that are not all limited to binary zeros and ones to values that are limited to binary zeros and ones; wherein comparing the characterization values in the collection of characterization values for the digital audio to the plurality of characterization values for each of the plurality of candidate songs includes a comparison in which: (a) the characterization values for the digital audio data are limited to binary zeros and binary ones, and (b) the characterization values for each of the plurality of songs are limited to binary zeros and binary ones.

11. The computer-implemented method of claim 6, wherein comparing the collection of characterization values for the digital audio data to the plurality of characterization values for each of at least a subset of the plurality of reference songs includes a comparison in which: (a) the characterization values for the digital audio data include real numbers that represent values other than binary zeros and binary ones, and (b) the characterization values for each of the at least subset of the plurality of songs are limited to binary zeros and binary ones.

12. The computer-implemented method of claim 1, wherein the computing device determines whether the digital audio data represents music without accessing the reference song characterization data that identify the plurality of audio characteristics for each reference song in the plurality of reference songs.

13. The computer-implemented method of claim 1, wherein determining whether the digital audio data represents music includes the computing device determining, multiple times without receipt of user input that initiates the music determination process, that the digital audio data does not represent music, before determining that the digital audio data represents music.

14. The computer-implemented method of claim 1, wherein the second processor operates from a clock signal that is at least an order of magnitude faster than a clock signal from which the first processor operates.

15. The computer-implemented method of claim 1, wherein outputting the indication that the particular reference song is playing includes presenting a name of the particular reference song on a lock screen of the computing device, in an always on screen of the computing device, or in a notification presented over an unlocked screen of the computing device, without user input having prompted the com
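To make the claim language more concrete, here is a rough sketch of two ideas from the claims: converting the same time-domain audio into a coarse and a finer set of frequency range bins (claim 1), and comparing binary characterization values against stored reference values (claims 3 and 10). Everything in it is an assumption made for illustration: the naive DFT with pooled magnitudes, the mean-threshold `binarize` step, and the Hamming-distance match are simple stand-ins, whereas the claims mention trained machine learning systems for characterization and do not specify a distance measure. All function and song names are hypothetical.

```python
import cmath
import math
from typing import Dict, List, Sequence


def to_frequency_bins(samples: Sequence[float], n_bins: int) -> List[float]:
    """Naive DFT, then pool magnitudes into n_bins equal-width frequency ranges."""
    n = len(samples)
    half = n // 2  # only frequencies below Nyquist
    if not 0 < n_bins <= half:
        raise ValueError("n_bins must be between 1 and len(samples) // 2")
    bins = [0.0] * n_bins
    for k in range(half):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples)
        )
        bins[k * n_bins // half] += abs(coeff)
    return bins


def binarize(values: Sequence[float]) -> List[int]:
    """Assumed binarization: 1 where a value exceeds the mean, else 0."""
    mean = sum(values) / len(values)
    return [1 if v > mean else 0 for v in values]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(query_bits: Sequence[int], references: Dict[str, List[int]]) -> str:
    """Return the reference whose bits are closest to the query (assumed metric)."""
    return min(references, key=lambda name: hamming(query_bits, references[name]))


if __name__ == "__main__":
    # A pure tone: 3 cycles across 64 samples.
    tone = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
    coarse = to_frequency_bins(tone, 4)   # fewer bins, e.g. for a music check
    fine = to_frequency_bins(tone, 16)    # more bins, e.g. for recognition
    refs = {
        "song_a": [0, 1] + [0] * 14,
        "song_b": [0] * 5 + [1] + [0] * 10,
    }
    print(len(coarse), len(fine), best_match(binarize(fine), refs))
```

Binary values keep both storage and comparison cheap, which may be one reason the claims contemplate an on-device reference set of ten thousand or more songs; that rationale is an inference here, not something the excerpt states.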

Assignees

Google LLC

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06F16/683 (Primary)

    using metadata automatically derived from the content · CPC title

  • G06F3/165 (Primary)

    Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • Filtering based on additional data, e.g. user or group profiles · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10809968B2 cover?
In general, the subject matter described in this disclosure can be embodied in methods, systems, and program products. A computing device stores reference song characterization data and receives digital audio data. The computing device determines whether the digital audio data represents music and then performs a different process to recognize that the digital audio data represents a particular…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/683. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 20, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).