Systems and methods for encoding audio signals

US9564140B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9564140-B2
Application number	US-201514680360-A
Country	US
Kind code	B2
Filing date	Apr 7, 2015
Priority date	Apr 7, 2015
Publication date	Feb 7, 2017
Grant date	Feb 7, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Some embodiments relate to techniques for encoding an audio signal represented by a plurality of frames including a first frame. The techniques include using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; and encoding the residual discrete spectral representation using a plurality of codewords.
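The primary/residual split described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the function names are hypothetical, the phase envelope is approximated by a straight-line least-squares fit (the dependent claims mention Mel-frequency regularized cepstral coefficients instead), the measured amplitudes are reused unchanged, and the residual is taken as a plain complex difference, which is only one reading of "based on".

```python
import cmath


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ a + b * xs; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def split_primary_residual(freqs, spectrum):
    """Split a discrete complex spectrum into a smooth 'primary' part and a residual.

    freqs    -- the discrete set of frequencies (e.g. sinusoid frequencies in Hz)
    spectrum -- one complex value per frequency (the 'initial' representation)
    """
    amps = [abs(c) for c in spectrum]
    # Assumption: phases are used as returned by cmath.phase, without unwrapping.
    phases = [cmath.phase(c) for c in spectrum]

    # Stand-in phase envelope: a line in frequency (assumption, see lead-in).
    offset, slope = fit_line(freqs, phases)

    # Evaluate the estimated envelope at the same discrete frequencies.
    primary = [cmath.rect(a, offset + slope * f) for a, f in zip(amps, freqs)]

    # Assumption: the residual is the complex difference initial - primary.
    residual = [c - p for c, p in zip(spectrum, primary)]
    return (offset, slope), primary, residual


# Example input: four spectral components whose phase is exactly linear in frequency.
example_freqs = [100.0, 200.0, 300.0, 400.0]
example_spectrum = [
    cmath.rect(a, 0.1 + 0.002 * f)
    for a, f in zip([1.0, 0.5, 0.25, 0.125], example_freqs)
]
params, primary, residual = split_primary_residual(example_freqs, example_spectrum)
```

In this toy input the phase really is a line in frequency, so the envelope captures it and the residual is numerically negligible. With real speech the residual would carry whatever the smooth envelope misses, and that residual is what the claims go on to encode with codewords.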

First claim

Opening claim text (preview).

What is claimed is:
1. A method for encoding an audio signal represented by a plurality of frames including a first frame, the method comprising: using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; encoding the residual discrete spectral representation using a plurality of codewords to obtain an encoded residual discrete spectral representation; and outputting parameters representing the primary discrete spectral representation and the encoded residual discrete spectral representation.
2. The method of claim 1, wherein estimating the phase envelope comprises estimating parameters of a continuous-in-frequency representation of the phase envelope.
3. The method of claim 2, wherein estimating the parameters of the continuous-in-frequency representation of the phase envelope comprises estimating a plurality of Mel-frequency regularized cepstral coefficients.
4. The method of claim 1, wherein obtaining the primary discrete spectral representation further comprises estimating an amplitude envelope of the initial discrete spectral representation and evaluating the estimated amplitude envelope at the discrete set of frequencies.
5. The method of claim 1, wherein obtaining the initial discrete spectral representation of the first frame comprises fitting a sinusoidal model to the first frame.
6. The method of claim 1, wherein encoding the residual discrete spectral representation using the plurality of codewords comprises encoding the residual discrete spectral representation using a linear combination of stochastic codewords, the stochastic codewords selected from the plurality of codewords.
7. The method of claim 6, wherein a first stochastic codeword in the linear combination of stochastic codewords is obtained by: generating a stochastic time-domain signal comprising portions corresponding to sub-frames of the first frame including a first portion corresponding to a first sub-frame of the first frame; setting values of the stochastic time-domain signal outside of the first portion to zero to obtain a sub-frame codeword; converting the sub-frame codeword to a frequency domain to obtain a frequency-domain sub-frame codeword; and setting values of the frequency-domain sub-frame codeword to zero outside of a sub-band to obtain the first stochastic codeword.
8. The method of claim 1, wherein encoding the residual discrete spectral representation comprises iteratively selecting codewords in the plurality of codewords based at least in part on a perceptual measure.
9. A system for encoding an audio signal represented by a plurality of frames including a first frame, the system comprising: at least one non-transitory memory storing a plurality of codewords; and at least one computer hardware processor configured to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; encoding the residual discrete spectral representation using a plurality of codewords to obtain an encoded residual discrete spectral representation; and outputting parameters representing the primary discrete spectral representation and the encoded residual discrete spectral representation.
10. The system of claim 9, wherein estimating the phase envelope comprises estimating parameters of a continuous-in-frequency representation of the phase envelope.
11. The system of claim 10, wherein estimating the parameters of the continuous-in-frequency representation of the phase envelope comprises estimating a plurality of Mel-frequency regularized cepstral coefficients.
12. The system of claim 9, wherein obtaining the primary discrete spectral representation further comprises estimating an amplitude envelope of the initial discrete spectral representation and evaluating the estimated amplitude envelope at the discrete set of frequencies.
13. The system of claim 9, wherein encoding the residual discrete spectral representation using the plurality of codewords comprises encoding the residual discrete spectral representation using a linear combination of stochastic codewords, the stochastic codewords selected from the plurality of codewords.
14. The system of claim 13, wherein a first stochastic codeword in the linear combination of stochastic codewords is obtained by: generating a stochastic time-domain signal comprising portions corresponding to sub-frames of the first frame including a first portion corresponding to a first sub-frame of the first frame; setting values of the stochastic time-domain signal outside of the first portion to zero to obtain a sub-frame codeword; converting the sub-frame codeword to a frequency domain to obtain a frequency-domain sub-frame codeword; and setting values of the frequency-domain sub-frame codeword to zero outside of a sub-band to obtain the first stochastic codeword.
15. At least one non-transitory computer-readable storage medium storing processor executable instructions that, when executed by at least one computer hardware processor, cause the at least one computer hardware processor to perform a method for encoding an audio signal represented by a plurality of frames including a first frame, the method comprising: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; encoding the residual discrete spectral representation using a plurality of codewords to obtain an encoded residual discrete spectral representation; and outputting parameters representing the primary discrete spectral representation and the encoded residual discrete spectral representation.
16. The at least one non-transitory computer-readable storage medium of claim 15, wherein estimating the phase envelope comprises estimating parameters of a continuous-in-frequency representation of the phase envelope.
17. The at least one non-transitory computer-readable storage medium of claim 16, wherein estimating the parameters of the continuous-in-frequency representation of the phase envelope comprises estimating a plurality of Mel-frequency regularized cepstral coefficients.
18. The at least one non-transitory computer-rea…
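The codeword construction in claims 7 and 14 and the iterative selection in claims 6 and 8 can be sketched roughly as follows. This is an illustrative reading, not the patent's implementation: all function and parameter names are hypothetical, Gaussian noise stands in for the unspecified stochastic signal, a naive DFT is used for brevity, the sub-band is a plain range of DFT bins, and plain squared error replaces the perceptual measure that claim 8 refers to.

```python
import cmath
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform; adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def stochastic_codeword(frame_len, sub_start, sub_len, band_lo, band_hi, rng):
    """Build one frequency-domain codeword following the four steps of claim 7."""
    # 1. Stochastic time-domain signal covering the whole frame (Gaussian: assumption).
    noise = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
    # 2. Zero everything outside the chosen sub-frame.
    sub = [v if sub_start <= t < sub_start + sub_len else 0.0
           for t, v in enumerate(noise)]
    # 3. Convert the sub-frame codeword to the frequency domain.
    spec = dft(sub)
    # 4. Zero every bin outside the sub-band [band_lo, band_hi).
    return [c if band_lo <= k < band_hi else 0j for k, c in enumerate(spec)]


def inner(a, b):
    """Complex inner product <a, b> = sum(a_k * conj(b_k))."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def greedy_encode(residual, codebook, n_terms):
    """Greedily pick codewords and gains so that sum(gain * codeword) ~ residual.

    Assumption: each step picks the codeword giving the largest squared-error
    reduction; the claims call for a perceptual measure instead.
    """
    rest = list(residual)
    picks = []
    for _ in range(n_terms):
        best = None
        for i, cw in enumerate(codebook):
            energy = inner(cw, cw).real
            if energy == 0.0:
                continue  # an all-zero codeword cannot reduce the error
            gain = inner(rest, cw) / energy      # least-squares optimal gain
            reduction = abs(gain) ** 2 * energy  # error energy removed by this pick
            if best is None or reduction > best[0]:
                best = (reduction, i, gain)
        if best is None:
            break
        _, i, gain = best
        picks.append((i, gain))
        rest = [r - gain * c for r, c in zip(rest, codebook[i])]
    return picks, rest


# Example codebook: 4 sub-frames x 2 sub-bands for a 16-sample frame.
rng = random.Random(0)
codebook = [
    stochastic_codeword(16, start, 4, lo, lo + 4, rng)
    for start in (0, 4, 8, 12)
    for lo in (0, 4)
]
```

Calling `greedy_encode(residual, codebook, n_terms)` returns a list of `(codeword index, complex gain)` pairs, which is one plausible form for the "linear combination of stochastic codewords", together with whatever part of the residual is left unexplained. A decoder holding the same codebook would only need the indices and gains.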

Assignees

Nuance Communications Inc

Inventors

Classifications

G10L19/032 · Primary
Quantisation or dequantisation of spectral components · CPC title
G10L19/038
Vector quantisation, e.g. TwinVQ audio · CPC title
G10L19/02 · Primary
using spectral analysis, e.g. transform vocoders or subband vocoders · CPC title

Patent family

Related publications grouped by family.

View patent family 57111922

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9564140B2 cover?: Some embodiments relate to techniques for encoding an audio signal represented by a plurality of frames including a first frame. The techniques include using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in…
Who is the assignee on this patent?: Nuance Communications Inc
What technology area does this patent fall under?: Primary CPC classification G10L19/032. Mapped technology areas include Physics.
When was this patent published?: Publication date Feb 7, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).