Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G10L15/26. Mapped technology areas include Physics.

When was this patent published?

Publication date: May 21, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Automated meeting minutes generator

US11990132B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11990132-B2
Application number	US-202318176180-A
Country	US
Kind code	B2
Filing date	Feb 28, 2023
Priority date	May 29, 2020
Publication date	May 21, 2024
Grant date	May 21, 2024
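
The publication number in the table bundles three of the other fields: the country, a serial number, and the kind code. As a rough illustration, the Python sketch below splits such a number into its parts. The names `PublicationNumber` and `parse_publication_number` are hypothetical, and the pattern is a simplification that assumes a two-letter country prefix and a kind code of one letter plus an optional digit; it is not a validator for every patent office's numbering.

```python
import re
from typing import NamedTuple


class PublicationNumber(NamedTuple):
    country: str  # two-letter country prefix, e.g. "US"
    number: str   # serial digits, kept as a string to preserve leading zeros
    kind: str     # kind code, e.g. "B2" or "A"


# Assumed simplified shape: country, digits, kind code, with optional hyphens.
_PATTERN = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(text: str) -> PublicationNumber:
    """Split a number such as 'US-11990132-B2' or 'US11990132B2' into parts."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    return PublicationNumber(*match.groups())
```

Both the hyphenated form in the table (`US-11990132-B2`) and the compact form under the title (`US11990132B2`) parse to the same three parts.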

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both the audio speech and the transcription. After the transcription is post-processed, output based on the post-processed transcription is generated in the form of a meeting summary and/or template.
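
The post-processing step in the abstract can be made concrete with a minimal Python sketch. The patent describes a machine learning model trained on post-processing data; the rule-based function below is only a stand-in for that model, not the patented method. The names `FILLERS` and `post_process`, and the filler list itself, are assumptions made for illustration.

```python
# Hypothetical filler words; a trained model would learn what to omit.
FILLERS = {"um", "uh", "erm", "hmm"}


def post_process(raw: str) -> str:
    """Rule-based stand-in for the trained post-processing model.

    Omits filler words, collapses immediately repeated words, and
    fixes basic formatting (leading capital, closing period).
    """
    # Omit words that were spoken and transcribed but add no meaning.
    words = [w for w in raw.split() if w.lower().strip(",.") not in FILLERS]

    # Collapse stutter-style repetitions such as "we we".
    deduped = []
    for word in words:
        if not deduped or deduped[-1].lower() != word.lower():
            deduped.append(word)

    text = " ".join(deduped)
    if not text:
        return text

    # Formatting fixes: capitalize the first letter, ensure end punctuation.
    text = text[0].upper() + text[1:]
    if text[-1] not in ".?!":
        text += "."
    return text
```

For example, `post_process("um so we we need to uh ship the release on friday")` returns `"So we need to ship the release on friday."`, which shows both word omission and formatting changes applied to raw ASR-style text.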

First claim

Opening claim text (preview).

What is claimed is:
1. A computing system for automatically processing electronic content and for generating corresponding output, the computing system comprises: one or more processors; and one or more computer readable hardware storage devices having stored computer-executable instructions that are executable by the one or more processors to cause the computing system to at least: identify electronic content associated with a meeting, the electronic content including audio speech; create a transcription of the audio speech with an automatic speech recognition (ASR) model trained on speech-to-text training data, the transcription being a text-based transcription; use a machine learning model trained on post-processing training data for modifying text included in the transcription to generate a post-processed transcription which includes text modified from the transcription; and generate output based on the post-processed transcription, the output comprising a template that is generated at least in part from the post-processed transcription, the template comprising a meeting template that is automatically selected from a plurality of different templates based on a meeting type that is determined from analyzing the post-processed transcription and which is automatically populated with content from the post-processed transcription.
2. The computing system of claim 1, wherein the post-processing includes modifying at least one of a punctuation, grammar or formatting of the transcription that was introduced by the ASR model.
3. The computing system of claim 1, wherein the post-processing includes omitting one or more words in the transcription.
4. The computing system of claim 1, wherein the post-processing includes modifying text to improve a readability of the transcription.
5. The computing system of claim 4, wherein the readability of the transcription is improved by converting a spoken language style of the audio speech to a written language style.
6. The computing system of claim 4, wherein the readability of the transcription is improved by determining a level of readability of individual words and phrases of the transcription and at least (1) removing words corresponding to a low level of readability, or (2) substituting words corresponding to a low level of readability with words corresponding to an increased level of readability, wherein the determining the level of readability is based on the individual words and phrases contributing to a semantic meaning and/or desired style inferred from the transcription.
7. The computing system of claim 1, wherein the transcription includes a plurality of links corresponding to tags associated with the electronic content and wherein the computer-executable instructions are further executable by the one or more processors to cause the computing system to generate the tags from the electronic content.
8. The computing system of claim 7, wherein the plurality of links point to data related to the electronic content, but wherein the data related to the electronic content is external to the electronic content.
9. The computing system of claim 1, wherein one or more fields of the template are automatically populated with content identified in one or more tags that are generated by a speech tag machine learning model that processes at least one of the audio speech, the transcription, or the post-processed transcription.
10. A computer-implemented method for automatically processing electronic content and for generating corresponding output, the method comprising: identify electronic content associated with a meeting, the electronic content including audio speech; create a transcription of the audio speech with an automatic speech recognition (ASR) model trained on speech-to-text training data, the transcription being a text-based transcription; use a machine learning model trained on post-processing training data for modifying text included in the transcription to generate a post-processed transcription which includes text modified from the transcription; and generate output based on the post-processed transcription, the output comprising a template that is generated at least in part from the post-processed transcription, the template comprising a meeting template that is automatically selected from a plurality of different templates based on a meeting type that is determined from analyzing the post-processed transcription and which is automatically populated with content from the post-processed transcription.
11. The method of claim 10, wherein the post-processing includes modifying at least one of a punctuation, grammar or formatting of the transcription that was introduced by the ASR model.
12. The method of claim 10, wherein the post-processing includes changing one or more words in the transcription.
13. The method of claim 10, wherein the post-processing includes omitting one or more words in the transcription.
14. The method of claim 10, wherein the post-processing includes modifying text to improve a readability of the transcription.
15. The method of claim 14, wherein the readability of the transcription is improved by converting a spoken language style of the audio speech to a written language style.
16. The method of claim 14, wherein the readability of the transcription is improved by determining a level of readability of individual words and phrases of the transcription and at least (1) removing words corresponding to a low level of readability, or (2) substituting words corresponding to a low level of readability with words corresponding to an increased level of readability, wherein the determining the level of readability is based on the individual words and phrases contributing to a semantic meaning and/or desired style inferred from the transcription.
17. The method of claim 10, wherein the transcription includes a plurality of links corresponding to tags associated with the electronic content and wherein the computer-executable instructions are further executable by the one or more processors to cause the computing system to generate the tags from the electronic content.
18. The method of claim 17, wherein the plurality of links point to data related to the electronic content, but wherein the data related to the electronic content is external to the electronic content.
19. The method of claim 10, wherein one or more fields of the template are automatically populated with content identified in one or more tags that are generated by a speech tag machine learning model that processes at least one of the audio speech, the transcription, or the post-processed transcription.
20. One or more hardware storage devices comprising computer-executable instructions that are executable by one or more processers of a computing system to cause the computing system to: identify electronic content associated with a meeting, the electronic content including audio speech; create a transcription of the audio speech with an automatic speech recognition (ASR) model trained on speech-to-text training data, the transcription being a text-based transcription; using a machine learning model trained on post-processing training data for modifying text included in the transcription; and generate output based from the post-processed transcription, the output comprising at least one of: (i) a meeting summary generated by a machine learning summarization model that summarizes content of the post-processed transcription by at least breaking the post-processed transcription i
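
The template step in claims 1 and 10 (select a meeting template by meeting type, then populate it from the post-processed transcription) can be sketched in Python as follows. The claim text shown here does not say how the meeting type is determined, so the keyword counting below is a swapped-in assumption rather than the patented technique. The templates, their keywords, and the names `Template`, `select_template`, and `populate` are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Template:
    meeting_type: str
    keywords: Tuple[str, ...]  # hypothetical cue words for this meeting type
    body: str                  # text with a {content} placeholder


# Hypothetical template catalogue, for illustration only.
TEMPLATES = (
    Template("standup", ("yesterday", "today", "blocker"),
             "Standup notes\nUpdates: {content}"),
    Template("planning", ("roadmap", "milestone", "estimate"),
             "Planning notes\nDecisions: {content}"),
)
DEFAULT = Template("general", (), "Meeting notes\nSummary: {content}")


def _score(template: Template, words: set) -> int:
    """Count how many of the template's cue words occur in the transcript."""
    return sum(keyword in words for keyword in template.keywords)


def select_template(transcript: str) -> Template:
    """Pick the template whose cue words match best; fall back to DEFAULT."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    best = max(TEMPLATES, key=lambda t: _score(t, words))
    return best if _score(best, words) > 0 else DEFAULT


def populate(transcript: str) -> str:
    """Fill the selected template with content from the transcript."""
    return select_template(transcript).body.format(content=transcript)
```

With these assumed templates, `populate("We agreed on the next milestone.")` selects the planning template and returns `"Planning notes\nDecisions: We agreed on the next milestone."`. A transcript with no cue words falls back to the general template.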

Assignees

Microsoft Technology Licensing, LLC

Classifications

G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/09
Supervised learning · CPC title
G06Q10/063118
Staff planning in a project environment · CPC title
G10L15/26 · Primary
Speech to text systems (G10L15/08 takes precedence) · CPC title
G06F16/383
using metadata automatically derived from the content · CPC title

Patent family

Related publications grouped by family.

View patent family 75588298

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11990132B2 cover?: A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both t…