Who is the assignee on this patent?

Dolby Laboratories Licensing Corp, Dolby Int Ab

What technology area does this patent fall under?

Primary CPC classification H04S7/30. Mapped technology areas include Electricity.

When was this patent published?

Publication date Nov 17, 2016 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Spatial error metrics of audio content

US2016337776A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016337776-A1
Application number	US-201515110371-A
Country	US
Kind code	A1
Filing date	Jan 5, 2015
Priority date	Jan 9, 2014
Publication date	Nov 17, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Audio objects that are present in input audio content in one or more frames are determined. Output clusters that are present in output audio content in the one or more frames are also determined. Here, the audio objects in the input audio content are converted to the output clusters in the output audio content. One or more spatial error metrics are computed based at least in part on positional metadata of the audio objects and positional metadata of the output clusters.
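As a rough illustration of the idea in the abstract, the sketch below computes one possible position-based spatial error for a single frame: the distance between each audio object's position and the gain-weighted centroid of the output clusters it was converted to. The function names, the Cartesian coordinates, the per-object gain lists, and the centroid formula are all assumptions made for illustration; this page does not give the patent's actual formulas.

```python
import math


def position_error(obj_pos, cluster_positions, gains):
    """Distance from an object to the gain-weighted centroid of its clusters.

    obj_pos           -- (x, y, z) from the object's positional metadata
    cluster_positions -- list of (x, y, z), one per output cluster
    gains             -- hypothetical non-negative gain of the object into each cluster
    """
    total = sum(gains)
    if total == 0:
        raise ValueError("object has no gain in any output cluster")
    # Where the object would appear to sit after conversion (assumed model).
    centroid = tuple(
        sum(g * c[axis] for g, c in zip(gains, cluster_positions)) / total
        for axis in range(len(obj_pos))
    )
    return math.dist(obj_pos, centroid)


def frame_position_error(object_positions, cluster_positions, gain_rows):
    """Unweighted mean of the per-object errors in one frame (assumed aggregation)."""
    errors = [
        position_error(pos, cluster_positions, gains)
        for pos, gains in zip(object_positions, gain_rows)
    ]
    return sum(errors) / len(errors)
```

Under this sketch, an object at the origin that is rendered entirely to a single cluster at (3, 4, 0) gets an error of 5, while an object placed exactly at the centroid of its clusters gets an error of 0.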

First claim

Opening claim text (preview).

1. A method, comprising: determining a plurality of audio objects that are present in input audio content in one or more frames; determining a plurality of output clusters that are present in output audio content in the one or more frames, the plurality of audio objects in the input audio content being converted to the plurality of output clusters in the output audio content; and computing one or more spatial error metrics based at least in part on positional metadata of the plurality of audio objects and positional metadata of the plurality of output clusters; wherein the method is performed by one or more computing devices.
2. The method as recited in claim 1, wherein the one or more spatial error metrics are at least in part dependent on object importance.
3. The method as recited in claim 2, wherein the object importance is obtained from analyzing one or more of audio data in the plurality of audio objects, audio data in the plurality of output clusters, metadata in the plurality of audio objects, or metadata in the plurality of output clusters.
4. The method as recited in claim 2, wherein at least a portion of the object importance is determined based on user input.
5. The method as recited in claim 1, wherein at least one audio object in the plurality of audio objects is apportioned to two or more output clusters in the plurality of output clusters.
6. The method as recited in claim 1, wherein at least one audio object in the plurality of audio objects is assigned to an output cluster in the plurality of output clusters.
7. The method as recited in claim 1, further comprising: determining, based on the one or more spatial error metrics, perceptual audio quality degradation caused by converting the plurality of audio objects in the input audio content to the plurality of output clusters in the output clusters.
8. The method as recited in claim 7, wherein the perceptual audio quality degradation is represented by one or more predicted test scores relating to a perceptual audio quality test.
9. The method as recited in claim 1, wherein the one or more spatial error metrics comprise at least one of: intra-frame spatial error metrics or inter-frame spatial error metrics.
10. The method as recited in claim 9, wherein the intra-frame spatial error metrics comprise at least one of: intra-frame object position error metrics, intra-frame object panning error metrics, importance-weighted intra-frame object position error metrics, importance-weighted intra-frame object panning error metrics, normalized intra-frame object position error metrics, or normalized intra-frame object panning error metrics.
11. The method as recited in claim 9, wherein the inter-frame spatial error metrics comprise at least one of: inter-frame spatial error metrics based on gain coefficient flows, or inter-frame spatial error metrics not based on gain coefficient flows.
12. The method as recited in claim 9, wherein each of the inter-frame spatial error metrics is computed in relation to two or more different frames.
13. The method as recited in claim 1, wherein the plurality of audio objects relates to the plurality of output clusters via a plurality of gain coefficients.
14. The method as recited in claim 1, wherein each of the frames corresponds to a time segment in the input audio content and a second time segment in the output audio content; and wherein output clusters that are present in the second time segment in the output audio content are mapped to by audio objects that are present in the first time segment in the input audio content.
15. (canceled)
16. The method as recited in claim 1, further comprising: constructing one or more user interface components that represent one or more of: audio objects in the plurality of audio objects, or output clusters in the plurality of output clusters in a listening space; causing the one or more user interface components to be displayed to a user.
17. The method as recited in claim 16, wherein a user interface component in the one or more user interface components represents an audio object in the plurality of audio objects; wherein the audio object is mapped to one or more output clusters in the plurality of output clusters; and wherein at least one visual characteristic of the user interface component represents a total amount of one or more spatial errors related to mapping the audio object to the one or more output clusters.
18. The method as recited in claim 16, wherein the one or more user interface components comprise a representation of the listening space in a 3-dimensional (3-D) form.
19. The method as recited in claim 16, wherein the one or more user interface components comprise a representation of the listening space in a 2-dimensional (2-D) form.
20. The method as recited in claim 1, further comprising: constructing one or more user interface components that represent one or more of: respective object importance of audio objects in the plurality of audio objects, respective object importance of output clusters in the plurality of output clusters, respective loudness of audio objects in the plurality of audio objects, respective loudness of output clusters in the plurality of output clusters, respective probabilities of speech or dialog content of audio objects in the plurality of audio objects, or probabilities of speech or dialog content of output clusters in the plurality of output clusters; causing the one or more user interface components to be displayed to a user.
21-32. (canceled)
33. A non-transitory computer readable storage medium, storing software instructions, which when executed by one or more processors cause performance of any one of the methods recited in claim 1.
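Claims 2, 10, 11 and 13 refer to object importance, normalization, and gain coefficients that relate objects to clusters. The sketch below shows one way such quantities might enter a frame-level score. It is a guess for illustration only: the function names, the importance-weighted mean, and in particular the reading of an inter-frame gain-based metric as the absolute change in an object's gain coefficients between two frames are assumptions, since this page does not define "gain coefficient flows" or give any formula.

```python
def importance_weighted_error(errors, importances):
    """Importance-weighted, normalized mean of per-object spatial errors.

    errors      -- hypothetical per-object error values for one frame
    importances -- hypothetical non-negative importance weight per object
    """
    total = sum(importances)
    if total == 0:
        return 0.0  # no object carries any importance in this frame
    # Dividing by the total weight keeps the score comparable across frames.
    return sum(w * e for w, e in zip(importances, errors)) / total


def gain_change_error(gains_prev, gains_curr):
    """Assumed inter-frame measure: total absolute change in one object's
    gain coefficients (one per output cluster) between two frames."""
    return sum(abs(curr - prev) for prev, curr in zip(gains_prev, gains_curr))
```

With equal importances the weighted score reduces to a plain average, so raising the importance of an object (for example, one carrying dialog) pulls the frame score toward that object's error. The gain-change measure is zero when an object keeps the same cluster gains across frames and grows when it jumps between clusters.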

Assignees

Inventors

Classifications

H04S3/008
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
F24C15/2028
using an air curtain · CPC title
G10L19/008
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
H04S2400/13
Aspects of volume control, not necessarily automatic, in stereophonic sound systems · CPC title
G10L25/48
specially adapted for particular use · CPC title

Patent family

Related publications grouped by family.

View patent family 52469071

External sources
