Who is the assignee on this patent?

Microsoft Technology Licensing LLC

What technology area does this patent fall under?

Primary CPC classification G10L17/22. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 10, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Communicating metadata that identifies a current speaker

US10586541B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10586541-B2
Application number	US-201715617907-A
Country	US
Kind code	B2
Filing date	Jun 8, 2017
Priority date	Mar 20, 2015
Publication date	Mar 10, 2020
Grant date	Mar 10, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer system may communicate metadata that identifies a current speaker. The computer system may receive audio data that represents speech of the current speaker, generate an audio fingerprint of the current speaker based on the audio data, and perform automated speaker recognition by comparing the audio fingerprint of the current speaker against stored audio fingerprints contained in a speaker fingerprint repository. The computer system may communicate data indicating that the current speaker is unrecognized to a client device of an observer and receive tagging information that identifies the current speaker from the client device of the observer. The computer system may store the audio fingerprint of the current speaker and metadata that identifies the current speaker in the speaker fingerprint repository and communicate the metadata that identifies the current speaker to at least one of the client device of the observer or a client device of a different observer.
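The flow in the abstract (compare a fingerprint against the repository, fall back to observer tagging when there is no match, then store the new fingerprint) can be sketched as follows. This is an illustration only, not the patented implementation: the fingerprint is assumed to already be a numeric vector, cosine similarity with a 0.9 threshold is an assumed matching rule that the abstract does not specify, and `SpeakerRepository`, `recognize`, `handle_speech`, and `ask_observer` are hypothetical names.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SpeakerRepository:
    """Hypothetical in-memory stand-in for the speaker fingerprint repository."""
    entries: list = field(default_factory=list)  # (fingerprint, metadata) pairs

    def store(self, fingerprint, metadata):
        self.entries.append((tuple(fingerprint), dict(metadata)))


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(fingerprint, repo, threshold=0.9):
    """Return the metadata of the best match at or above the threshold, else None."""
    best_score, best_meta = 0.0, None
    for stored, meta in repo.entries:
        score = cosine_similarity(fingerprint, stored)
        if score > best_score:
            best_score, best_meta = score, meta
    return best_meta if best_score >= threshold else None


def handle_speech(fingerprint, repo, ask_observer):
    """Recognize the current speaker, falling back to observer tagging."""
    metadata = recognize(fingerprint, repo)
    if metadata is None:
        # Speaker is unrecognized: ask an observer's client device for a tag.
        metadata = ask_observer()
        if metadata is not None:
            # Store the fingerprint with the tag so later speech is recognized.
            repo.store(fingerprint, metadata)
    return metadata
```

In this sketch, the first utterance from an unknown speaker triggers the `ask_observer` callback, and later utterances with a similar fingerprint are resolved from the repository without asking again.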

First claim

Opening claim text (preview).

What is claimed is:
1. A computer system for communicating metadata that identifies a current speaker, the computer system comprising: a computing device including a processor configured to execute computer-executable instructions and a memory operatively coupled to the processor, the memory storing one or more computer-executable instructions that, when executed by the processor, perform operations including: receive a request at the computing device to provide an alert when a current speaker is recognized to be a particular speaker; receive audio data at the computing device via a network from a communication device associated with the current speaker, the audio data representing speech of the current speaker; generate at the computing device an audio fingerprint of the current speaker based on the audio data received from the communication device via the network; perform automated speaker recognition at the computing device including comparing the audio fingerprint of the current speaker against one or more stored audio fingerprints contained in a speaker fingerprint repository; receive tagging information that identifies the current speaker from a device of an observer that identifies the current speaker; resolve a conflict between the tagging information received from the device of an observer that identifies the current speaker and an identification of the current speaker based on an identity obtained from tagging information from one or more other observers; and communicate the alert and the metadata that identifies the current speaker from the computing device to a client device of the observer when the current speaker is the particular speaker, the alert being based on the comparing of the audio fingerprint of the current speaker against one or more stored audio fingerprints.
2. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: retrieve additional information for the current speaker from an information source; and communicate the additional information in the metadata that identifies the current speaker.
3. The computer system of claim 2, wherein the additional information includes one or more of: a company of the current speaker, a department of the current speaker, a job title of the current speaker, or contact information for the current speaker.
4. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: generate augmented audio data that includes the audio data that represents speech of the current speaker and the metadata that identifies the current speaker.
5. The computer system of claim 4, wherein the metadata that identifies the current speaker is communicated to at least one of the client device of the observer or a second client device of the one or more other observers via the augmented audio data.
6. The computer system of claim 4, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: store the augmented audio data; receive a query indicating a recognized speaker; search the augmented audio data for metadata that identifies the recognized speaker; and output portions of the augmented audio data that represent speech of the recognized speaker.
7. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: generate a transcription of a conversation having multiple speakers, wherein text of speech spoken by a recognized speaker is associated with an identifier for the recognized speaker; store the transcription; receive a query indicating the recognized speaker, search the transcription for the identifier for the recognized speaker; and output portions of the transcription that include the text of speech spoken by the recognized speaker.
8. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: communicate data indicating that the current speaker is unrecognized to the client device of the observer.
9. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: store the audio fingerprint of the current speaker in the speaker fingerprint repository.
10. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: provide an online meeting for participants; receive an audio fingerprint of a participant from at least one client device of at least one participant; and store the audio fingerprint of at least one participant and metadata that identifies at least one participant in the speaker fingerprint repository.
11. The computer system of claim 1, wherein the memory further stores one or more computer-executable instructions that, when executed by the processor, perform operations including: communicate the audio fingerprint of the current speaker to the client device of the observer.
12. A computer-implemented method for communicating metadata that identifies a current speaker performed by a computer system including one or more computing devices, the computer-implemented method comprising: receiving a request at the one or more computing devices to provide an alert when a current speaker is recognized to be a particular speaker; generating an audio fingerprint of the current speaker using the one or more computing devices based on audio data that represents speech of the current speaker received via a network from a communication device of the current speaker, performing automated speaker recognition using the one or more computing devices including comparing the audio fingerprint of the current speaker with one or more stored audio fingerprints; receiving tagging information that identifies the current speaker from a device of an observer that identifies the current speaker; resolving a conflict between tagging information received from the device of the observer that identifies the current speaker and an identity obtained from tagging information from one or more other observers; and communicating the alert and metadata that identifies the current speaker from the one or more computing devices to a client device of the observer when the current speaker is the particular speaker, the alert being based on the comparing of the audio fingerprint of the current speaker with one or more stored audio fingerprints.
13. The computer-implemented method of claim 12, further comprising: communicating data indicating that the current speaker is unrecognized to the client device of the observer.
14. The computer-implemented method of claim 12, further comprising: communicating data indicating that the current speaker is unrecognized to the client device of the observer.
15. A computer-readable storage medium storing computer-executable instructions that, when executed by a computing device, cause the computing device to perform one or more operations comprising: receiving a request at the computing device to provide an alert when a current speaker is recognized to be a particular speaker; generating an audio fingerprint of the current speaker
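Claims 1 and 12 recite resolving a conflict between one observer's tag and the identity obtained from other observers' tags, but the claim text shown here does not say how the conflict is resolved. The sketch below assumes a simple plurality vote, with ties left unresolved; that rule and the names `resolve_tag_conflict` and `should_alert` are illustrative assumptions, not the patent's method.

```python
from collections import Counter


def resolve_tag_conflict(tags):
    """Pick one identity from conflicting observer tags.

    tags maps an observer id to the speaker name that observer supplied.
    Assumed rule: the name with the most votes wins; a tie for first
    place (or no tags at all) yields None, meaning unresolved.
    """
    if not tags:
        return None
    counts = Counter(tags.values()).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def should_alert(resolved_identity, requested_speaker):
    """True when the resolved identity is the speaker the alert was requested for."""
    return resolved_identity is not None and resolved_identity == requested_speaker
```

For example, tags of "Alice", "Bob", and "Alice" from three observers resolve to "Alice" under this rule, while one vote each for "Alice" and "Bob" stays unresolved.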

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

H04M2203/5081
Inform conference party of participants, e.g. of change of participants · CPC title
G10L17/22 · Primary
Interactive procedures; Man-machine interfaces · CPC title
G10L17/00 · Primary
Speaker identification or verification techniques · CPC title
H04M2203/6045
Identity confirmation · CPC title
H04M3/569
using the instant speaker's algorithm (speech detection per se G10L25/78) · CPC title

Patent family

Related publications grouped by family.

View patent family 55861142

External sources


Related patents

Speaker Identification for Use in Multi-Media Conference Call System

Speaker recognition including proactive voice model retrieval and sharing features

Active talker activated conference pointers

Speaker recognition and voice tagging for improved service
