Who is the assignee on this patent?

Microsoft Technology Licensing LLC

What technology area does this patent fall under?

Primary CPC classification G10L15/26. Mapped technology areas include Physics.

When was this patent published?

Publication date Jun 1, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Customized output to optimize for user preference in a distributed system

US11023690B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11023690-B2
Application number	US-201916398836-A
Country	US
Kind code	B2
Filing date	Apr 30, 2019
Priority date	Apr 30, 2019
Publication date	Jun 1, 2021
Grant date	Jun 1, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines a preferred language of the user. A transcript from the received audio streams is generated. The meeting system translates the transcript into the preferred language of the user to form a translated transcript. The translated transcript is provided to the distributed device of the user.
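The flow in the abstract (identify the user behind each device, look up a preferred language, translate the transcript, deliver it) can be pictured with a minimal sketch. This is an illustration only, not the patented implementation: the names `Utterance`, `deliver_translated_transcripts`, and `toy_translate`, the dictionary-based lookups, and the fallback `default_language` are all assumptions introduced here, and the transcript is assumed to have been generated already.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Utterance:
    speaker: str  # speaker identity attached to the utterance
    text: str     # recognized text


def deliver_translated_transcripts(
    transcript: List[Utterance],
    device_users: Dict[str, str],          # hypothetical map: device id -> user id
    preferred_language: Dict[str, str],    # hypothetical map: user id -> language code
    translate: Callable[[str, str], str],  # (text, target language) -> translated text
    default_language: str = "en",          # assumption: fallback when no preference is stored
) -> Dict[str, List[Utterance]]:
    """Return, for each device, the transcript translated for that device's user."""
    per_device: Dict[str, List[Utterance]] = {}
    for device_id, user_id in device_users.items():
        lang = preferred_language.get(user_id, default_language)
        per_device[device_id] = [
            Utterance(u.speaker, translate(u.text, lang)) for u in transcript
        ]
    return per_device


def toy_translate(text: str, lang: str) -> str:
    # Placeholder that only tags the text with the target language;
    # a real system would call a machine translation service here.
    return f"[{lang}] {text}"
```

A caller would pass the running transcript plus the two lookup tables and receive one translated copy per device; keeping the speaker name on each utterance mirrors the dependent claims that provide speaker identities with the translated text.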

First claim

Opening claim text (preview).

The invention claimed is:
1. A computer-implemented method comprising: receiving audio streams captured by a plurality of distributed devices; comparing the audio streams captured by the plurality of devices to determine whether the audio streams are representative of sound from a same meeting; based on the comparing indicating the audio streams are representative of sound from the same meeting, establishing a meeting instance for an intelligent meeting; identifying a user corresponding to a distributed device of the plurality of distributed devices in the intelligent meeting; determining a preferred language of the user; generating, by a hardware processor, a transcript from the received audio streams as the intelligent meeting is occurring; translating the transcript into the preferred language of the user to form a translated transcript as the intelligent meeting is occurring; and providing the translated transcript to the distributed device as the intelligent meeting is occurring.
2. The method of claim 1, wherein providing the translated transcript comprises providing the transcript with translated text for display on the distributed device as the intelligent meeting is occurring.
3. The method of claim 1, wherein providing the translated transcript comprises converting text of the translated transcript to speech for output to the user of the distributed device as the intelligent meeting is occurring.
4. The method of claim 1, wherein providing the translated transcript comprises providing speaker identities for each translated utterance of the transcript.
5. The method of claim 1, wherein the determining the preferred language of the user comprises accessing a user preference previously established for the user indicating the preferred language.
6. The method of claim 1, wherein the comparing the audio streams captured by the plurality of devices to determine that the audio streams are representative of sound from the same meeting comprises: calculating normalized cross correlation coefficients between signals of the audio streams; and determining whether a predetermined threshold is transgressed, wherein the predetermined threshold being transgressed indicates that the audio streams are representative of sound from the same meeting.
7. The method of claim 1, further comprising: performing continuous speech separation on the received audio streams from the plurality of distributed devices to separate speech from different speakers speaking at the same time into separate audio channels, the generating the transcript being based on the separated audio channels.
8. The method of claim 1, wherein identifying the user comprises: receiving a video signal capturing the user; and matching a stored image of the user with the video signal to identify the user.
9. The method of claim 1, wherein identifying the user comprises: matching a stored voice signature of the user with speech from the audio streams.
10. The method of claim 1, wherein identifying the user comprises: obtaining a user identifier associated with the distributed device.
11. A non-transitory machine-storage medium having instructions for execution by a processor of a machine to cause the processor to perform operations comprising: receiving audio streams captured by a plurality of distributed devices; comparing the audio streams captured by the plurality of devices to determine whether the audio streams are representative of sound from a same meeting; based on the comparing indicating the audio streams are representative of sound from the same meeting, establishing a meeting instance for an intelligent meeting; identifying a user corresponding to a distributed device of the plurality of distributed devices in the intelligent meeting; determining a preferred language of the user; generating a transcript from the received audio streams as the intelligent meeting is occurring; translating the transcript into the preferred language of the user to form a translated transcript as the intelligent meeting is occurring; and providing the translated transcript to the distributed device as the intelligent meeting is occurring.
12. The machine-storage medium of claim 11, wherein providing the translated transcript comprises providing the transcript with translated text for display on the distributed device as the intelligent meeting is occurring.
13. The machine-storage medium of claim 11, wherein providing the translated transcript comprises converting text of the translated transcript to speech for output to the user of the distributed device as the intelligent meeting is occurring.
14. The machine-storage medium of claim 11, wherein providing the translated transcript comprises providing speaker identities for each translated utterance of the transcript.
15. The machine-storage medium of claim 11, wherein the determining the preferred language of the user comprises accessing a user preference previously established for the user indicating the preferred language.
16. The machine-storage medium of claim 11, wherein comparing the audio streams captured by the plurality of devices to determine that the audio streams are representative of sound from the same meeting comprises: calculating normalized cross correlation coefficients between signals of the audio streams; and determining whether a predetermined threshold is transgressed, wherein the predetermined threshold being transgressed indicates that the audio streams are representative of sound from the same meeting.
17. The machine-storage medium of claim 11, wherein the operations further comprise: performing continuous speech separation on the received audio streams from the plurality of distributed devices to separate speech from different speakers speaking at the same time into separate audio channels, the generating the transcript being based on the separated audio channels.
18. The machine-storage medium of claim 11, wherein identifying the user comprises: receiving a video signal capturing the user; and matching a stored image of the user with the video signal to identify the user.
19. The machine-storage medium of claim 11, wherein identifying the user comprises: matching a stored voice signature of the user with speech from the audio streams.
20. A device comprising: one or more hardware processors; and a memory device coupled to the processor and having a program stored thereon that, when executed by the one or more hardware processors, causes the one or more hardware processors to perform operations comprising: receiving audio streams captured by a plurality of distributed devices; comparing the audio streams captured by the plurality of devices to determine whether the audio streams are representative of sound from a same meeting; based on the comparing indicating the audio streams are representative of sound from the same meeting, establishing a meeting instance for an intelligent meeting; identifying a user corresponding to a distributed device of the plurality of distributed devices in the intelligent meeting; determining a preferred language of the user; generating a transcript from the received audio streams as the intelligent meeting is occurring; translating the transcript into the preferred language of the user to form a translated transcript as the intelligent meeting is occurring; and providing the translated transcript to the distributed device as the intelligent meeting is occurring.
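The same-meeting test in the dependent claims (normalized cross correlation coefficients compared against a predetermined threshold) can be illustrated with a small pure-Python sketch. The claims give no threshold value, lag window, or exact normalization, so the `threshold=0.5` default, the `max_lag` search range, the mean subtraction, and the function names below are all assumptions made for illustration, not details from the patent.

```python
import math
from typing import Sequence


def max_normalized_cross_correlation(
    a: Sequence[float], b: Sequence[float], max_lag: int
) -> float:
    """Peak normalized cross-correlation coefficient over lags in [-max_lag, max_lag]."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    a0 = [x - mean_a for x in a]
    b0 = [y - mean_b for y in b]
    # Normalizing by the full-signal energies keeps every coefficient in [-1, 1].
    norm = math.sqrt(sum(x * x for x in a0) * sum(y * y for y in b0))
    if norm == 0.0:
        return 0.0  # a constant (silent) signal carries no correlation information
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += a0[i] * b0[j]
        best = max(best, acc / norm)
    return best


def same_meeting(
    a: Sequence[float],
    b: Sequence[float],
    threshold: float = 0.5,  # assumed value; the claims only say "predetermined"
    max_lag: int = 10,       # assumed search window, in samples
) -> bool:
    """Treat two streams as the same meeting when the peak coefficient exceeds the threshold."""
    return max_normalized_cross_correlation(a, b, max_lag) > threshold
```

Searching over a range of lags is a design choice of this sketch: devices in one room hear the same sound at slightly different times, so a zero-lag comparison alone could miss an otherwise strong match.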

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/045
Combinations of networks · CPC title
H04L65/403
Arrangements for multi-party communication, e.g. for conferences (data switching systems for conference H04L12/18; arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities H04M3/56; television conferencing systems H04N7/15) · CPC title
H04L12/1813
for computer conferences, e.g. chat rooms (instant messaging H04L51/04; protocols for multimedia communication H04L65/1101; arrangements for multi-party communication H04L65/403; telephonic conference arrangements H04M3/56; television conference systems H04N7/15) · CPC title
G10L2021/02166
Microphone arrays; Beamforming · CPC title

Patent family

Related publications grouped by family.

View patent family 70277494

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11023690B2 cover?: Systems and methods for providing customized output based on a user preference in a distributed system are provided. In example embodiments, a meeting server or system receives audio streams from a plurality of distributed devices involved in an intelligent meeting. The meeting system identifies a user corresponding to a distributed device of the plurality of distributed devices and determines …