Automated meeting minutes generator

US11990132B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11990132-B2
Application number  US-202318176180-A
Country             US
Kind code           B2
Filing date         Feb 28, 2023
Priority date       May 29, 2020
Publication date    May 21, 2024
Grant date          May 21, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both the audio speech and the transcription. After the transcription is post-processed, output based on the post-processed transcription is generated in the form of a meeting summary and/or template.
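The post-processing step described in the abstract can be illustrated with a minimal sketch. The patent uses a trained machine learning model for this step; the rule-based function below is only a stand-in, and the filler-word list, the `post_process` name, and the capitalization rule are all assumptions made for illustration, not details from the patent.

```python
import re

# Hypothetical filler list: the patent does not enumerate which words are omitted.
FILLERS = re.compile(r"\b(?:um+|uh+|you know|i mean)\b,?\s*", re.IGNORECASE)


def post_process(raw: str) -> str:
    """Rule-based stand-in for the trained post-processing model (assumption)."""
    # Omit words that were spoken and transcribed but add nothing in writing.
    text = FILLERS.sub("", raw)
    # Normalize whitespace left behind by the removals.
    text = re.sub(r"\s+", " ", text).strip()
    # Fix formatting: capitalize the first letter of each sentence.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    sentences = [s[:1].upper() + s[1:] for s in sentences if s]
    text = " ".join(sentences)
    # Fix punctuation: make sure the text ends with a terminator.
    if text and text[-1] not in ".!?":
        text += "."
    return text


if __name__ == "__main__":
    raw = "um so we will ship the beta on friday. uh you know the budget is approved"
    print(post_process(raw))
```

A learned model would handle far more than this (grammar, word substitution, style conversion), but the input and output shape is the same: ASR text in, cleaned text out, ready for summarization or template filling.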

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system for automatically processing electronic content and for generating corresponding output, the computing system comprises: one or more processors; and one or more computer readable hardware storage devices having stored computer-executable instructions that are executable by the one or more processors to cause the computing system to at least: identify electronic content associated with a meeting, the electronic content including audio speech; create a transcription of the audio speech with an automatic speech recognition (ASR) model trained on speech-to-text training data, the transcription being a text-based transcription; use a machine learning model trained on post-processing training data for modifying text included in the transcription to generate a post-processed transcription which includes text modified from the transcription; and generate output based on the post-processed transcription, the output comprising a template that is generated at least in part from the post-processed transcription, the template comprising a meeting template that is automatically selected from a plurality of different templates based on a meeting type that is determined from analyzing the post-processed transcription and which is automatically populated with content from the post-processed transcription.

2. The computing system of claim 1, wherein the post-processing includes modifying at least one of a punctuation, grammar or formatting of the transcription that was introduced by the ASR model.

3. The computing system of claim 1, wherein the post-processing includes omitting one or more words in the transcription.

4. The computing system of claim 1, wherein the post-processing includes modifying text to improve a readability of the transcription.

5. The computing system of claim 4, wherein the readability of the transcription is improved by converting a spoken language style of the audio speech to a written language style.

6. The computing system of claim 4, wherein the readability of the transcription is improved by determining a level of readability of individual words and phrases of the transcription and at least (1) removing words corresponding to a low level of readability, or (2) substituting words corresponding to a low level of readability with words corresponding to an increased level of readability, wherein the determining the level of readability is based on the individual words and phrases contributing to a semantic meaning and/or desired style inferred from the transcription.

7. The computing system of claim 1, wherein the transcription includes a plurality of links corresponding to tags associated with the electronic content and wherein the computer-executable instructions are further executable by the one or more processors to cause the computing system to generate the tags from the electronic content.

8. The computing system of claim 7, wherein the plurality of links point to data related to the electronic content, but wherein the data related to the electronic content is external to the electronic content.

9. The computing system of claim 1, wherein one or more fields of the template are automatically populated with content identified in one or more tags that are generated by a speech tag machine learning model that processes at least one of the audio speech, the transcription, or the post-processed transcription.

10. A computer-implemented method for automatically processing electronic content and for generating corresponding output, the method comprising: identify electronic content associated with a meeting, the electronic content including audio speech; create a transcription of the audio speech with an automatic speech recognition (ASR) model trained on speech-to-text training data, the transcription being a text-based transcription; use a machine learning model trained on post-processing training data for modifying text included in the transcription to generate a post-processed transcription which includes text modified from the transcription; and generate output based on the post-processed transcription, the output comprising a template that is generated at least in part from the post-processed transcription, the template comprising a meeting template that is automatically selected from a plurality of different templates based on a meeting type that is determined from analyzing the post-processed transcription and which is automatically populated with content from the post-processed transcription.

11. The method of claim 10, wherein the post-processing includes modifying at least one of a punctuation, grammar or formatting of the transcription that was introduced by the ASR model.

12. The method of claim 10, wherein the post-processing includes changing one or more words in the transcription.

13. The method of claim 10, wherein the post-processing includes omitting one or more words in the transcription.

14. The method of claim 10, wherein the post-processing includes modifying text to improve a readability of the transcription.

15. The method of claim 14, wherein the readability of the transcription is improved by converting a spoken language style of the audio speech to a written language style.

16. The method of claim 14, wherein the readability of the transcription is improved by determining a level of readability of individual words and phrases of the transcription and at least (1) removing words corresponding to a low level of readability, or (2) substituting words corresponding to a low level of readability with words corresponding to an increased level of readability, wherein the determining the level of readability is based on the individual words and phrases contributing to a semantic meaning and/or desired style inferred from the transcription.

17. The method of claim 10, wherein the transcription includes a plurality of links corresponding to tags associated with the electronic content and wherein the computer-executable instructions are further executable by the one or more processors to cause the computing system to generate the tags from the electronic content.

18. The method of claim 17, wherein the plurality of links point to data related to the electronic content, but wherein the data related to the electronic content is external to the electronic content.

19. The method of claim 10, wherein one or more fields of the template are automatically populated with content identified in one or more tags that are generated by a speech tag machine learning model that processes at least one of the audio speech, the transcription, or the post-processed transcription.

20. One or more hardware storage devices comprising computer-executable instructions that are executable by one or more processers of a computing system to cause the computing system to: identify electronic content associated with a meeting, the electronic content including audio speech; create a transcription of the audio speech with an automatic speech recognition (ASR) model trained on speech-to-text training data, the transcription being a text-based transcription; using a machine learning model trained on post-processing training data for modifying text included in the transcription; and generate output based from the post-processed transcription, the output comprising at least one of: (i) a meeting summary generated by a machine learning summarization model that summarizes content of the post-processed transcription by at least breaking the post-processed transcription i…
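The template step in claims 1 and 10 (select a meeting template by meeting type, then populate it from the post-processed transcription) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the patent determines meeting type by analyzing the transcription with trained models, whereas this sketch substitutes simple keyword counting. The `MeetingTemplate` class, the two template definitions, and the cue words are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class MeetingTemplate:
    meeting_type: str
    cues: dict  # field name -> cue word; hypothetical, not from the patent


# Hypothetical template catalog standing in for the "plurality of different templates".
TEMPLATES = (
    MeetingTemplate("standup", {"Done": "yesterday", "Next": "today", "Blockers": "blocked"}),
    MeetingTemplate("planning", {"Milestones": "milestone", "Deadlines": "deadline"}),
)


def split_sentences(text: str) -> list:
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def select_template(text: str, templates=TEMPLATES) -> MeetingTemplate:
    """Keyword-count stand-in for meeting-type detection (assumption)."""
    lowered = text.lower()
    # Pick the template whose cue words occur most often; ties go to the first.
    return max(templates, key=lambda t: sum(lowered.count(c) for c in t.cues.values()))


def populate(template: MeetingTemplate, text: str) -> dict:
    """Fill each template field with the sentences that mention its cue word."""
    sentences = split_sentences(text)
    return {
        name: [s for s in sentences if cue in s.lower()]
        for name, cue in template.cues.items()
    }


if __name__ == "__main__":
    transcript = (
        "Yesterday I finished the login page. "
        "Today I will start on billing. "
        "I am blocked on the API keys."
    )
    chosen = select_template(transcript)
    print(chosen.meeting_type, populate(chosen, transcript))
```

Keeping selection and population as separate functions mirrors the claim language, which treats choosing the template and filling it as two distinct automatic actions on the same post-processed text.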

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Supervised learning · CPC title

  • Staff planning in a project environment · CPC title

  • G10L15/26 (Primary)

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • using metadata automatically derived from the content · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11990132B2 cover?
A transcription of audio speech included in electronic content associated with a meeting is created by an ASR model trained on speech-to-text data. The transcription is post-processed by modifying text included in the transcription, for example, by modifying punctuation, grammar, or formatting introduced by the ASR model and by changing or omitting one or more words that were included in both t…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 21, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).