Method, system and article of manufacture for processing spatial audio

US2016198282A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016198282-A1
Application number: US-201514807760-A
Country: US
Kind code: A1
Filing date: Jul 23, 2015
Priority date: Jan 2, 2015
Publication date: Jul 7, 2016
Grant date:

Abstract

Official abstract text for this publication.

Techniques for processing directionally-encoded audio to account for spatial characteristics of a listener playback environment are disclosed. The directionally-encoded audio data includes spatial information indicative of one or more directions of sound sources in an audio scene. The audio data is modified based on input data identifying the spatial characteristics of the playback environment. The spatial characteristics may correspond to actual loudspeaker locations in the playback environment. The directionally-encoded audio may also be processed to permit focusing/defocusing on sound sources or particular directions in an audio scene. The disclosed techniques may allow a recorded audio scene to be more accurately reproduced at playback time, regardless of the output loudspeaker setup. Another advantage is that a user may dynamically configure audio data so that it better conforms to the user's particular loudspeaker layouts and/or the user's desired focus on particular subjects or areas in an audio scene.
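As a rough illustration of adapting directional audio to actual loudspeaker locations, the sketch below distributes a single source direction across a user-supplied loudspeaker layout using constant-power pairwise panning. This is a generic panning technique swapped in for illustration, not the method disclosed in the patent. The function name `pan_to_loudspeakers`, the use of azimuth angles in degrees, and the assumption of at least two loudspeakers at distinct angles are all hypothetical choices.

```python
import math


def pan_to_loudspeakers(source_deg, speaker_degs):
    """Constant-power pairwise panning (illustrative assumption, not the patent's method).

    source_deg   -- azimuth of the sound source, in degrees
    speaker_degs -- azimuths of the actual loudspeakers, in degrees (distinct angles)
    Returns one gain per loudspeaker, in the same order as speaker_degs.
    """
    if len(speaker_degs) < 2:
        raise ValueError("need at least two loudspeakers")

    # Visit loudspeakers in order of increasing azimuth around the circle.
    order = sorted(range(len(speaker_degs)), key=lambda i: speaker_degs[i] % 360.0)
    gains = [0.0] * len(speaker_degs)
    src = source_deg % 360.0

    for k in range(len(order)):
        lo = order[k]
        hi = order[(k + 1) % len(order)]
        start = speaker_degs[lo] % 360.0
        # Angular width of the arc between this pair of adjacent loudspeakers.
        span = (speaker_degs[hi] - speaker_degs[lo]) % 360.0 or 360.0
        offset = (src - start) % 360.0
        if offset <= span:
            # The source lies on this arc: split its power between the pair.
            frac = offset / span
            gains[lo] = math.cos(frac * math.pi / 2)
            gains[hi] = math.sin(frac * math.pi / 2)
            break
    return gains


# A source straight ahead, played over loudspeakers at -30 and +30 degrees,
# receives equal gain on both loudspeakers.
front_gains = pan_to_loudspeakers(0.0, [-30.0, 30.0])
```

Because the gains are recomputed from whatever angles are passed in, the same source direction can be rendered on a different layout simply by supplying different loudspeaker azimuths.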

First claim

Opening claim text (preview).

What is claimed is:

1. A method of processing audio, comprising: receiving, at a device, audio data corresponding to a scene, wherein the audio data includes spatial information indicative of one or more directions of one or more sound sources in the scene; and modifying the audio data based on input data identifying one or more spatial characteristics of a playback environment.

2. The method of claim 1, further comprising: receiving a selection identifying one or more enabled regions in the scene; and modifying the audio data corresponding to the one or more enabled regions.

3. The method of claim 2, wherein the selection is based on an operational mode of the device.

4. The method of claim 2, wherein the operational mode of the device is selected from the group consisting of front camera enabled and back camera enabled.

5. The method of claim 2, wherein the device includes a camera and the selection is based on the zoom operation of the camera.

6. The method of claim 1, further comprising: providing a user interface configured to permit a user to select one or more enabled regions in the scene; receiving, through the user interface, a selection of at least one enabled region in the scene; and modifying the audio data corresponding to the at least one enabled region.

7. The method of claim 1, further comprising: receiving the input data through a user interface that permits the user to configure the input data according to the one or more spatial characteristics of the playback environment.

8. The method of claim 1, wherein the input data includes a sector definition indicating a region in the playback environment.

9. The method of claim 8, wherein the sector definition corresponds to a loudspeaker location in the playback environment.

10. The method of claim 8, wherein modifying the audio data includes applying a masking window function to the audio data, wherein the masking window function corresponds to the sector definition.

11. An apparatus, comprising: an interface configured to receive audio data corresponding to a scene, wherein the audio data includes spatial information indicative of one or more directions of one or more sound sources in the scene; and a processor configured to modify the audio data based on input data identifying one or more spatial characteristics of a playback environment.

12. The apparatus of claim 11, further comprising a second interface configured to receive a selection identifying one or more enabled regions in the scene; wherein the processor is configured to modify the audio data corresponding to the one or more enabled regions.

13. The apparatus of claim 12, wherein the selection is based on an operational mode of the system.

14. The apparatus of claim 12, wherein the system includes a camera and the selection is based on the zoom operation of the camera.

15. The apparatus of claim 11, further comprising a user interface configured to permit a user to select one or more enabled regions in the scene; wherein the processor is configured to modify the audio data corresponding to the one or more enabled regions.

16. The apparatus of claim 11, further comprising: a user interface to permit a user to configure the input data according to the one or more spatial characteristics of the playback environment.

17. The apparatus of claim 11, wherein the input data includes a sector definition indicating a region in the playback environment.

18. The apparatus of claim 17, wherein the sector definition corresponds to a loudspeaker location in the playback environment.

19. The apparatus of claim 17, wherein the processor is configured to modify the audio data by applying a masking window function to the audio data, wherein the masking window function corresponds to the sector definition.

20. The apparatus of claim 1, further comprising: a module configured to render the modified audio data for playback.

21. An apparatus, comprising: means for receiving audio data corresponding to a scene, wherein the audio data includes spatial information indicative of one or more directions of one or more sound sources in the scene; and means for modifying the audio data based on input data identifying one or more spatial characteristics of a playback environment.

22. The apparatus of claim 21, further comprising: means for receiving a selection identifying one or more enabled regions in the scene; and means for modifying the audio data corresponding to the one or more enabled regions.

23. The apparatus of claim 22, wherein the selection is based on an operational mode of the system.

24. The apparatus of claim 22, wherein the system includes a camera and the selection is based on the zoom operation of the camera.

25. The apparatus of claim 21, further comprising: means for providing a user interface configured to permit a user to select one or more enabled regions in the scene; means for receiving, through the user interface, a selection of at least one enabled region in the scene; and means for modifying the audio data corresponding to the at least one enabled region.

26. The apparatus of claim 21, further comprising: means for receiving the input data through a user interface that permits the user to configure the input data according to the one or more spatial characteristics of the playback environment.

27. The apparatus of claim 21, wherein the input data includes a sector definition indicating a region in the playback environment.

28. The apparatus of claim 27, wherein the sector definition corresponds to a loudspeaker location in the playback environment.

29. The apparatus of claim 27, wherein modifying the audio data includes applying a masking window function to the audio data, wherein the masking window function corresponds to the sector definition.

30. A non-transient computer-readable medium embodying a set of instructions executable by one or more processors, comprising: code for receiving audio data corresponding to a scene, wherein the audio data includes spatial information indicative of one or more directions of one or more sound sources in the scene; and code for modifying the audio data based on input data identifying one or more spatial characteristics of a playback environment.
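The claims on sector definitions and masking window functions (claims 8 to 10 and their apparatus counterparts) can be pictured with a minimal sketch. The claims do not specify the window shape or the audio representation, so everything here is an assumption: audio is modeled as a list of hypothetical `(direction_deg, amplitude)` pairs, a sector is a `(center_deg, width_deg)` pair, and the window is a flat pass-band with a raised-cosine roll-off whose width `taper_deg` is an illustrative parameter.

```python
import math


def sector_window(direction_deg, center_deg, width_deg, taper_deg=10.0):
    """Masking gain in [0, 1] for one direction relative to one sector.

    Assumed shape: gain 1 inside the sector, a raised-cosine roll-off over
    taper_deg beyond each edge, and gain 0 everywhere else.
    """
    d = (direction_deg - center_deg) % 360.0
    offset = min(d, 360.0 - d)  # angular distance from the sector center
    half = width_deg / 2.0
    if offset <= half:
        return 1.0
    if taper_deg <= 0.0 or offset >= half + taper_deg:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (offset - half) / taper_deg))


def apply_sector_mask(components, enabled_sectors, taper_deg=10.0):
    """Scale each (direction_deg, amplitude) pair by the largest window gain
    among the enabled sectors; with no enabled sectors everything is muted."""
    masked = []
    for direction_deg, amplitude in components:
        gain = max(
            (sector_window(direction_deg, c, w, taper_deg) for c, w in enabled_sectors),
            default=0.0,
        )
        masked.append((direction_deg, amplitude * gain))
    return masked


# Keep sound arriving from a 60-degree sector centered straight ahead and
# suppress sound arriving from behind.
scene = [(0.0, 1.0), (180.0, 1.0)]
focused = apply_sector_mask(scene, enabled_sectors=[(0.0, 60.0)])
```

The taper avoids an abrupt gain jump at the sector edge; a rectangular window (setting `taper_deg` to 0) would be the simplest alternative under the same assumptions.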

Assignees

  • Qualcomm Inc

Inventors

Classifications

  • Aspects of sound capture and related signal processing for recording or reproduction · CPC title

  • Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution (control circuits for electronic adaptation of the sound field H04S7/30) · CPC title

  • H04S7/30 · Primary

    Control circuits for electronic adaptation of the sound field · CPC title

  • Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • Automatic calibration of stereophonic sound system, e.g. with test microphone · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016198282A1 cover?
Techniques for processing directionally-encoded audio to account for spatial characteristics of a listener playback environment are disclosed. The directionally-encoded audio data includes spatial information indicative of one or more directions of sound sources in an audio scene. The audio data is modified based on input data identifying the spatial characteristics of the playback environment.…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification H04S7/30. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jul 7, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).