Creating, rendering and interacting with a multi-faceted audio cloud

US10007724B2 · US · B2

Patent metadata
Publication number: US-10007724-B2
Application number: US-201213538988-A
Country: US
Kind code: B2
Filing date: Jun 29, 2012
Priority date: Jun 29, 2012
Publication date: Jun 26, 2018
Grant date: Jun 26, 2018

Abstract

Official abstract text for this publication.

Methods and arrangements for effecting a cloud representation of audio content. An audio cloud is created and rendered, and user interaction with at least a clip portion of the audio cloud is afforded.

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: at least one processor; and a non-transitory computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising: computer readable program code configured to segment audio provided in a first language not having available automatic speech recognition capabilities into speech units, wherein to segment comprises employing a language sub-word recognition technique selected from the group consisting of: a statistical system for sub-word unit recognition; a voice-activity-detection technique; and a syllable segmentation technique, wherein the language sub-word recognition technique comprises utilizing a sub-word recognition technique of a second language having available automatic speech recognition capabilities and different from the first language of the audio; computer readable program code configured to identify prominent speech units, wherein to identify comprises detecting a repeated speech unit by identifying speech patterns within the audio and using a language agnostic speech unit comparison technique, wherein the language agnostic speech unit comparison technique comprises a technique where a language associated with the speech unit is disregarded; wherein to identify further comprises determining a frequency of occurrence of a speech unit and wherein a prominent speech unit comprises a speech unit that exceeds a predetermined frequency of occurrence threshold; computer readable program code configured to create an audio cloud comprising audio signals of the prominent speech units, wherein each of the audio signals comprise a playable audio unit that when played provides an audible output from the audio of the corresponding prominent speech unit; computer readable program code configured to render the audio cloud, wherein the audio cloud comprises a visual representation of the audio signals, wherein the audio signals are arranged in order of decreasing frequency of occurrence and wherein a volume of the audio signals is based upon the frequency of occurrence; and computer readable program code configured to afford user interaction with at least a clip portion of the audio cloud.

2. A non-transitory computer program storage device comprising: a non-transitory computer readable storage device having computer readable program code embodied therewith, the computer readable program code comprising: computer readable program code configured to segment audio provided in a first language not having available automatic speech recognition capabilities into speech units, wherein to segment comprises employing a language sub-word recognition technique selected from the group consisting of: a statistical system for sub-word unit recognition; a voice-activity-detection technique; and a syllable segmentation technique, wherein the language sub-word recognition technique comprises utilizing a sub-word recognition technique of a second language having available automatic speech recognition capabilities and different from the first language of the audio; computer readable program code configured to identify prominent speech units, wherein to identify comprises detecting a repeated speech unit by identifying speech patterns within the audio and using a language agnostic speech unit comparison technique, wherein the language agnostic speech unit comparison technique comprises a technique where a language associated with the speech unit is disregarded; wherein to identify further comprises determining a frequency of occurrence of a speech unit and wherein a prominent speech unit comprises a speech unit that exceeds a predetermined frequency of occurrence threshold; computer readable program code configured to create an audio cloud comprising audio signals of the prominent speech units, wherein each of the audio signals comprise a playable audio unit that when played provides an audible output from the audio of the corresponding prominent speech unit; computer readable program code configured to render the audio cloud, wherein the audio cloud comprises a visual representation of the audio signals, wherein the audio signals are arranged in order of decreasing frequency of occurrence and wherein a volume of the audio signals is based upon the frequency of occurrence; and computer readable program code configured to afford user interaction with at least a clip portion of the audio cloud.

3. The non-transitory computer program storage device according to claim 2, comprising computer readable program code configured to detect speech units.

4. The non-transitory computer program storage device according to claim 2, wherein said computer readable program code is configured to render the audio cloud via at least one member selected from the group consisting of: audio-based rendering; and visual-display-based rendering.

5. The non-transitory computer program storage device according to claim 2, wherein said computer readable program code is configured to afford the creating and rendering of the audio cloud as interactive based on user input.

6. The non-transitory computer program storage device according to claim 2, wherein a language sub-word recognition technique comprises a speech analysis technique where accuracy of the technique is not dependant on the language and language characteristics of the speaker.

7. The non-transitory computer program storage device according to claim 2, wherein the audio cloud comprises a plurality of audio segments.

8. The non-transitory computer program storage device according to claim 2, wherein the prominent speech units within the rendered audio cloud are presented in an order based upon the prominence of the speech unit.

9. A non-transitory computer program storage device comprising: a non-transitory computer readable storage device having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising: computer readable program code configured to segment audio provided in a first language not having available automatic speech recognition capabilities into speech units; wherein to segment comprises employing a language sub-word recognition technique selected from the group consisting of: a statistical system for sub-word unit recognition; a voice-activity-detection technique; and a syllable segmentation technique, wherein the language sub-word recognition technique comprises utilizing a sub-word recognition technique of a second language having available automatic speech recognition capabilities and different from the first language of the audio; computer readable program code configured to identify, by detecting a repeated speech unit by identifying speech patterns within the audio and via employing a language-agnostic speech unit comparison technique, prominent speech units within the audio, wherein the language agnostic speech unit comparison technique comprises a technique where a language associated with the speech unit is disregarded; wherein to identify further comprises determining a frequency of occurrence of a speech unit and wherein a prominent speech unit comprises a speech unit that exceeds a predetermined frequency of occurrence threshold; computer readable program code configured to create an audio cloud comprising audio signals of the identified prominent speech units, wherein each of the audio signals comprise a playable audio unit that when played provides an audible output from the audio of the corresponding prominent speech unit; computer readable program code configured to render the audio cloud, wherein the audio cloud comprises a visual representati
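
As an illustration only, the frequency-based selection and ordering recited in the claims can be sketched in Python. This is not the patented implementation. The sketch assumes the segmentation and language-agnostic comparison steps have already run and produced one label per detected speech unit, with repeated units sharing a label. The names `CloudEntry`, `build_audio_cloud`, and `min_count`, and the choice to scale volume linearly against the most frequent unit, are hypothetical; the claims say only that volume is based upon the frequency of occurrence.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CloudEntry:
    unit_id: str   # hypothetical label shared by repeats of one speech unit
    count: int     # how many times the unit occurs in the audio
    volume: float  # playback gain in (0, 1], scaled by count (an assumption)


def build_audio_cloud(unit_ids, min_count):
    """Keep units occurring more than min_count times, most frequent first."""
    counts = Counter(unit_ids)
    # A unit is "prominent" when its count exceeds the threshold.
    prominent = [(u, c) for u, c in counts.items() if c > min_count]
    if not prominent:
        return []
    # Decreasing frequency; ties broken by label so the order is deterministic.
    prominent.sort(key=lambda uc: (-uc[1], uc[0]))
    top = prominent[0][1]
    # Assumed mapping: volume proportional to count, loudest unit at 1.0.
    return [CloudEntry(u, c, c / top) for u, c in prominent]


# Made-up labels standing in for the output of the comparison step.
units = ["u3", "u1", "u3", "u2", "u3", "u1", "u4", "u1", "u3"]
cloud = build_audio_cloud(units, min_count=1)
for entry in cloud:
    print(entry)
```

With the made-up labels above, `u3` occurs four times and `u1` three times, so both pass a threshold of 1, while `u2` and `u4` (one occurrence each) are dropped. A renderer could then attach the corresponding audio clip to each entry and play it at the entry's volume.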

Assignees

Inventors

Classifications

  • G06F16/64 (Primary)

    Browsing; Visualisation therefor (generation of a list or set of audio data G06F16/638) · CPC title

  • Segmentation; Word boundary detection · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10007724B2 cover?
Methods and arrangements for effecting a cloud representation of audio content. An audio cloud is created and rendered, and user interaction with at least a clip portion of the audio cloud is afforded.
Who is the assignee on this patent?
Ajmera Jitendra, Deshmukh Om Dadaji, Jain Anupam, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06F16/64. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 26, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).