US12567394B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12567394-B2 |
| Application number | US-202217737216-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 5, 2022 |
| Priority date | May 5, 2022 |
| Publication date | Mar 3, 2026 |
| Grant date | Mar 3, 2026 |
Official abstract text for this publication.
In examples, a method for converting audio samples to full song arrangements is provided. The method includes receiving audio sample data, determining a melodic transcription, based on the audio sample data, and determining a sequence of music chords, based on the melodic transcription. The method further includes generating a full song arrangement, based on the sequence of music chords, and the audio sample data.
Opening claim text (preview).
What is claimed is:

1. A method for converting audio samples to full song arrangements, the method comprising: receiving audio sample data; determining a melodic transcription with a plurality of bars, based on the audio sample data; determining a sequence of music chords, based on the melodic transcription, wherein the determining of the sequence of music chords comprises: inputting the melodic transcription to a machine learning model, the machine learning model being trained based on a dataset of paired melody bars and chords; receiving from the machine learning model, a plurality of chord candidates for each bar of the plurality of bars of the melodic transcription; determining, from pre-defined chord progressions, one or more chord progressions corresponding to the plurality of chord candidates for each bar of the plurality of bars of the melodic transcription; and selecting the sequence of music chords from the determined one or more chord progressions; performing vocal processing on the audio sample data by dynamically time warping the audio sample data to fit the determined sequence of music chords; and generating a full song arrangement, based on the sequence of music chords and the vocally processed audio sample data.
2. The method of claim 1, wherein the pre-defined chord progressions are 4-bar chord progressions.
3. The method of claim 1, wherein the trained machine learning model is a neural network.
4. The method of claim 3, wherein the chords in the data set include maj, min, 7, min7, min7b5, aug, and sus4.
5. The method of claim 1, further comprising: displaying a user-interface; receiving, via the user-interface, a user-input corresponding to a selection of an accompaniment style of the full song arrangement; and re-generating the full song arrangement, based on the user-input.
6. The method of claim 1, wherein the audio sample data includes a subset of data corresponding to auditory words.
7. The method of claim 1, wherein the vocal processing further comprises: removing a subset of the audio sample data corresponding to ambient noise.
8. The method of claim 7, wherein the generating of the full song arrangement is based on the sequence of music chords, and the vocally processed audio sample data.
9. The method of claim 7, wherein the vocal processing further comprises: performing autotuning on the audio sample data; and normalizing a volume of the audio sample data.
10. The method of claim 9, wherein the vocal processing further comprises: beautifying the audio sample data, by applying one or more vocal effects from the group of: compressor adjustment, reverb adjustment, and chorus adjustment.
11. The method of claim 1, further comprising: receiving the audio sample data from an application on a mobile computing device; and transmitting the full song arrangement to the mobile computing device.
12. A system for converting audio samples to full song arrangements, the system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations including: receiving audio sample data; determining a melodic transcription with a plurality of bars, based on the audio sample data; determining a sequence of music chords, based on the melodic transcription, wherein the determining of the sequence of music chords comprises: inputting the melodic transcription to a machine learning model, the machine learning model being trained based on a dataset of paired melody bars and chords; receiving from the machine learning model, a plurality of chord candidates for each bar of the plurality of bars of the melodic transcription; determining, from pre-defined chord progressions, one or more chord progressions corresponding to the plurality of chord candidates for each bar of the plurality of bars of the melodic transcription; and selecting the sequence of music chords from the determined one or more chord progressions; performing vocal processing on the audio sample data by dynamically time warping the audio sample data to fit the determined sequence of music chords; and generating a full song arrangement, based on the sequence of music chords and the vocally processed audio sample data.
13. The system of claim 12, wherein the pre-defined chord progressions are 4-bar chord progressions.
14. The method of claim 12, wherein the trained machine learning model is a neural network.
15. The method of claim 12, wherein the vocal processing further comprises: removing a subset of the audio sample data corresponding to ambient noise; and performing autotuning on the audio sample data.
16. The method of claim 15, wherein the generating of the full song arrangement is based on the sequence of music chords, and the vocally processed audio sample data.
17. The method of claim 15, wherein the vocal processing further comprises: normalizing a volume of the audio sample data; and beautifying the audio sample data, by applying one or more vocal effects from the group of: compressor adjustment, reverb adjustment, and chorus adjustment.
18. One or more computer readable non-transitory storage media embodying software that is operable when executed, by at least one processor of a device, to: receive audio sample data; determine a melodic transcription with a plurality of bars, based on the audio sample data; determine a sequence of music chords, based on the melodic transcription, wherein to determine the sequence of music chords comprises: inputting the melodic transcription to a machine learning model, the machine learning model being trained based on a dataset of paired melody bars and chords; receiving from the machine learning model, a plurality of chord candidates for each bar of the plurality of bars of the melodic transcription; determining, from pre-defined chord progressions, one or more chord progressions corresponding to the plurality of chord candidates for each bar of the plurality of bars of the melodic transcription; and selecting the sequence of music chords from the determined one or more chord progressions; perform vocal processing on the audio sample data by dynamically time warping the audio sample data to fit the determined sequence of music chords; and generate a full song arrangement, based on the sequence of music chords and the vocally processed audio sample data.
19. The method of claim 1, further comprising: estimating a beats per minute of the audio sample data, wherein the dynamic time warping is performed based on the estimates beats per minute.
20. The method of claim 1, wherein the selecting the sequence of music chords from the determined one or more chord progressions comprises: ranking the determined one or more chord progressions based on a probability of how well each chord progression of the one or more chord progressions matches the plurality of bars; and selecting the sequence of music chords based on their ranking.
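The chord-selection step recited in claims 1, 2, and 20 (per-bar chord candidates matched against pre-defined progressions, then ranked by probability) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and does not come from the patent: the `PROGRESSIONS` list, the function names, the dictionary format for the model output, and the choice to score a progression by the product of its per-bar candidate probabilities. The claims only require ranking by "a probability of how well each chord progression ... matches the plurality of bars".

```python
from math import prod

# Hypothetical library of pre-defined 4-bar chord progressions (cf. claim 2).
PROGRESSIONS = [
    ("C", "G", "Am", "F"),
    ("C", "Am", "F", "G"),
    ("Am", "F", "C", "G"),
]


def rank_progressions(candidates, progressions=PROGRESSIONS):
    """Rank progressions that fit the per-bar chord candidates.

    `candidates` is an assumed stand-in for the model output: one dict per
    bar, mapping a chord label to the model's probability for that bar.
    A progression is kept only if each of its chords is a candidate for
    the corresponding bar; its score is the product of those per-bar
    probabilities (an illustrative scoring choice, not the patent's).
    """
    scored = []
    for progression in progressions:
        if len(progression) != len(candidates):
            continue  # progression length must match the number of bars
        if all(chord in bar for chord, bar in zip(progression, candidates)):
            score = prod(bar[chord] for chord, bar in zip(progression, candidates))
            scored.append((score, progression))
    # Highest-scoring progression first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


def select_chords(candidates, progressions=PROGRESSIONS):
    """Return the top-ranked progression, or None if nothing matches."""
    ranked = rank_progressions(candidates, progressions)
    return ranked[0][1] if ranked else None


# Made-up candidate probabilities for a four-bar melody.
bars = [
    {"C": 0.6, "Am": 0.3},
    {"G": 0.5, "Am": 0.4},
    {"Am": 0.5, "F": 0.4},
    {"F": 0.7, "G": 0.2},
]
best = select_chords(bars)
```

With these made-up inputs, two of the three progressions fit the candidates, and the first one scores highest, so `best` is `("C", "G", "Am", "F")`. Restricting the output to known progressions, rather than taking the single most likely chord in each bar independently, keeps the result harmonically coherent across bars.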
Chord · CPC title
Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation · CPC title
Chord progression · CPC title
for transcription of raw audio or music data to a displayed or printed staff representation or to displayable MIDI-like note-oriented data, e.g. in pianoroll format · CPC title
Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece (automatically producing a series of tones G10H1/26) · CPC title