Auto-translation for multi user audio and video

US9720909B2 · US · B2

Patent metadata
Publication number: US-9720909-B2
Application number: US-201514827826-A
Country: US
Kind code: B2
Filing date: Aug 17, 2015
Priority date: Dec 12, 2011
Publication date: Aug 1, 2017
Grant date: Aug 1, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed subject matter provides a system, computer readable storage medium, and a method providing an audio and textual transcript of a communication. A conferencing services may receive audio or audio visual signals from a plurality of different devices that receive voice communications from participants in a communication, such as a chat or teleconference. The audio signals representing voice (speech) communications input into respective different devices by the participants. A translation services server may receive over a separate communication channel the audio signals for translation into a second language. As managed by the translation services server, the audio signals may be converted into textual data. The textual data may be translated into text of different languages based the language preferences of the end user devices in the teleconference. The translated text may be further translated into audio signals.
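
The conversion chain described in the abstract (speech to text, text to translated text, translated text back to speech) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function `translate_for_participants`, the `TranslationResult` type, and the three `toy_*` stages are hypothetical names introduced here, and the toy stages merely stand in for real speech recognition, machine translation, and speech synthesis engines.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stage signatures (assumptions, not taken from the patent).
SpeechToText = Callable[[bytes, str], str]        # (audio, source language) -> transcript
TranslateText = Callable[[str, str, str], str]    # (text, source, target) -> translated text
TextToSpeech = Callable[[str, str], bytes]        # (text, language) -> audio


@dataclass
class TranslationResult:
    language: str
    text: str      # translated transcript
    audio: bytes   # synthesized audio for the translated transcript


def translate_for_participants(
    audio: bytes,
    source_lang: str,
    preferred_langs: List[str],
    stt: SpeechToText,
    mt: TranslateText,
    tts: TextToSpeech,
) -> Dict[str, TranslationResult]:
    """Convert audio to text once, then translate per preferred language."""
    transcript = stt(audio, source_lang)
    results: Dict[str, TranslationResult] = {}
    # dict.fromkeys removes duplicate languages while keeping their order.
    for lang in dict.fromkeys(preferred_langs):
        if lang == source_lang:
            continue  # no translation needed for this participant
        text = mt(transcript, source_lang, lang)
        results[lang] = TranslationResult(lang, text, tts(text, lang))
    return results


# Toy stages so the sketch runs without any external service.
_TOY_PHRASES = {("en", "es", "hello"): "hola", ("en", "fr", "hello"): "bonjour"}


def toy_stt(audio: bytes, lang: str) -> str:
    return audio.decode("utf-8")  # pretend the bytes are already the words


def toy_mt(text: str, src: str, dst: str) -> str:
    return _TOY_PHRASES.get((src, dst, text), text)


def toy_tts(text: str, lang: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is audio
```

Transcribing once and fanning out per language is a design choice of this sketch; the abstract only states that the text is translated based on the language preferences of the end user devices.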

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving, at a first end user device associated with a user, an audio data signal from a conferencing services server via a first communication channel, the audio data signal representing a communication in a first spoken language received at a second end user device that is intended for the user; determining, at the first end user device, the first spoken language of the communication represented by the audio data signal; determining, at the first end user device, language preferences of the user of the first end user device; comparing, at the first end user device, the determined first spoken language with the language preferences of the user of the at the first end user device; and when the determined first spoken language does not match the language preferences: establishing, at the first end user device, a second communication channel with a translation services server, wherein the second communication channel with the translation services server is separate from first communication channel with the conferencing services server, providing, from the first end user device, the audio data signal to the translation services server, providing, from the first end user device, the language preferences of the user to the translation services server, and receiving, at the first end user device, a translated audio data signal from the translation services server, the translated audio data signal representing the communication in a second spoken language corresponding to the language preferences and different from the first spoken language.

2. The method of claim 1, wherein determining the first spoken language of the communication represented by the audio data signal comprises recognizing the first spoken language of the communication.

3. The method of claim 1, wherein determining the first spoken language of the communication represented by the audio data signal comprises receiving a user preferences signal from the first end user device.

4. The method of claim 3, wherein the user preferences signal is an identifier embedded in the audio data signal received from the conferencing services server.

5. The method of claim 1, further comprising, when the determined first spoken language does not match the language preferences, outputting, at the first end user device, an audio output corresponding to the translated audio data signal.

6. The method of claim 1, further comprising, when the determined first spoken language does match the language preferences, outputting, at the first end user device, an audio output corresponding to the audio data signal.

7. The method of claim 1, further comprising receiving, at the first end user device, a transcript of the audio data signal or the translated audio data signal.

8. A computer program product having computer program code containing instructions embodied in a non-transitory machine readable storage medium, a processor executes the instructions to provide a method, the method comprising: receiving, at a first end user device associated with a user, an audio data signal from a conferencing services server via a first communication channel, the audio data signal representing a communication in a first spoken language received at a second end user device that is intended for the user of the first end user device; determining, at the first end user device, the first spoken language of the communication represented by the audio data signal; determining, at the first end user device, language preferences of the user of the first end user device; comparing, at the first end user device, the determined first spoken language with the language preferences of the user of the at the first end user device; and when the determined first spoken language does not match the language preferences: establishing, at the first end user device, a second communication channel with a translation services, wherein the second communication channel with the translation services server is separate from first communication channel with the conferencing services server, providing, from the first end user device, the audio data signal to the translation services server, providing, from the first end user device, the language preferences of the user to the translation services server, and receiving, at the first end user device, a translated audio data signal from the translation services server, the translated audio data signal representing the communication in a second spoken language corresponding to the language preferences and different from the first spoken language.

9. The computer program product of claim 8, wherein determining the first spoken language of the communication represented by the audio data signal comprises recognizing the first spoken language of the communication.

10. The computer program product of claim 8, wherein determining the first spoken language of the communication represented by the audio data signal comprises receiving a user preferences signal from the first end user device.

11. The computer program product of claim 8, wherein the method further comprises, when the determined first spoken language does not match the language preferences, outputting an audio output corresponding to the translated audio data signal.

12. The computer program product of claim 8, wherein the method further comprises, when the determined first spoken language does match the language preferences, outputting an audio output corresponding to the audio data signal.

13. The computer program product of claim 8, wherein the method further comprises receiving, at the first end user device, a transcript of the audio data signal or the translated audio data signal.

14. A first end user device associated with a user, comprising: one or more processors; and a non-transitory machine readable storage medium storing code containing instructions that, when executed by the one or more processors, provide a method comprising: receiving, at the first end user device, an audio data signal from a conferencing services server via a first communication channel, the audio data signal representing a communication in a first spoken language received at a second end user device that is intended for the user of the first end user device; determining, at the first end user device, the first spoken language of the communication represented by the audio data signal; determining, at the first end user device, language preferences of the user of the first end user device; comparing, at the first end user device, the determined first spoken language with the language preferences of the user of the at the first end user device; and when the determined first spoken language does not match the language preferences: establishing, at the first end user device, a second communication channel with a translation services server, wherein the second communication channel with the translation services server is separate from first communication channel with the conferencing services server, providing, from the first end user device, the audio data signal to the translation services server, providing, from the first end user device, the language preferences of the user to the translation services server, and receiving, at the first end user device, a translated audio data signal from the translation services server, the translated audio data signal representing the communication in a second spoken language corresponding to the language preferences and different from the first spoken language.

15. The first end user device of claim 14,
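
The device-side decision recited in claim 1 (with the output behavior of claims 5 and 6) can be illustrated with a short sketch. Everything named here is an assumption made for illustration: `EndUserDevice`, `TranslationChannel`, `handle_incoming`, and `FakeChannel` do not appear in the patent, language detection is passed in as a plain callable, and the translation server is replaced by a toy object that only tags the audio.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


class TranslationChannel(Protocol):
    """Hypothetical stand-in for the second channel to the translation server."""

    def translate(self, audio: bytes, preferences: List[str]) -> bytes: ...


@dataclass
class EndUserDevice:
    preferences: List[str]                    # language preferences of the user
    detect_language: Callable[[bytes], str]   # e.g. recognition or an embedded identifier
    open_translation_channel: Callable[[], TranslationChannel]

    def handle_incoming(self, audio: bytes) -> bytes:
        """Return the audio to output for a signal from the conferencing server."""
        spoken = self.detect_language(audio)
        if spoken in self.preferences:
            # Languages match: output the original audio (claim 6).
            return audio
        # Mismatch: open a channel separate from the conferencing channel,
        # send the audio and the preferences, and output the result (claim 5).
        channel = self.open_translation_channel()
        return channel.translate(audio, self.preferences)


class FakeChannel:
    """Toy channel that prefixes the audio with the target language code."""

    def translate(self, audio: bytes, preferences: List[str]) -> bytes:
        return preferences[0].encode("utf-8") + b":" + audio
```

In this sketch the translation channel is opened lazily, only after a mismatch is found, which mirrors the ordering of the steps in the claim; a device whose preferences already include the spoken language never contacts the translation server.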

Assignees

Google Inc

Inventors

Classifications

  • G06F40/58 (Primary)

    Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title

  • G10L13/00 (Primary)

    Speech synthesis; Text to speech systems · CPC title

  • Language aspects · CPC title

  • Multipoint control units therefor · CPC title

  • audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants (echo suppression in two-way loud-speaking telephone systems H04M9/02; sound field processing per se H04S7/30) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9720909B2 cover?
The disclosed subject matter provides a system, computer readable storage medium, and a method providing an audio and textual transcript of a communication. A conferencing services may receive audio or audio visual signals from a plurality of different devices that receive voice communications from participants in a communication, such as a chat or teleconference. The audio signals representing…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/58. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 1, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).