Automatically altering the audio of an object during video conferences
US-10581625-B1 · Mar 3, 2020 · US
US11636859B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11636859-B2 |
| Application number | US-202117526757-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 15, 2021 |
| Priority date | May 10, 2019 |
| Publication date | Apr 25, 2023 |
| Grant date | Apr 25, 2023 |
Official abstract text for this publication.
A method to present a summary of a transcription may include obtaining, at a first device, audio directed to the first device from a second device during a communication session between the first device and the second device. Additionally, the method may include sending, from the first device, the audio to a transcription system. The method may include obtaining, at the first device, a transcription during the communication session from the transcription system based on the audio. Additionally, the method may include obtaining, at the first device, a summary of the transcription during the communication session. Additionally, the method may include presenting, on a display, both the summary and the transcription simultaneously during the communication session.
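The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `FirstDevice`, `fake_transcribe`, and `first_sentence` are hypothetical names, the transcription system is replaced by a local stand-in function, and the "summary" is simply the first sentence of the transcription so far.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FirstDevice:
    """Hypothetical first device; both callables are stand-ins, not real services."""
    transcribe: Callable[[bytes], str]   # stand-in for the remote transcription system
    summarize: Callable[[str], str]      # stand-in for whatever produces the summary
    transcript: List[str] = field(default_factory=list)

    def on_audio(self, chunk: bytes) -> str:
        """Handle one chunk of audio received from the second device."""
        text = self.transcribe(chunk)        # send audio, obtain a transcription
        self.transcript.append(text)
        full = " ".join(self.transcript)
        summary = self.summarize(full)       # obtain a summary during the session
        return self.render(summary, full)    # present both at the same time

    @staticmethod
    def render(summary: str, transcription: str) -> str:
        """Build the text shown on the display (assumed two-line layout)."""
        return f"SUMMARY: {summary}\nTRANSCRIPT: {transcription}"


def fake_transcribe(chunk: bytes) -> str:
    """Assumption: the audio is already text bytes, so decoding stands in for ASR."""
    return chunk.decode("utf-8")


def first_sentence(text: str) -> str:
    """Toy summarizer (assumption): keep only the first sentence."""
    return text.split(".")[0].strip() + "."


if __name__ == "__main__":
    device = FirstDevice(fake_transcribe, first_sentence)
    print(device.on_audio(b"Your appointment is Tuesday. Bring your card."))
```

In a real system the two callables would wrap a network call to the transcription system and a summarization step; the sketch only shows the order of operations and that both outputs reach the display together.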
Opening claim text (preview).
What is claimed is:
1. A method comprising: obtaining, at a first device, a transcription of audio of a communication session involving the first device; obtaining a level of user understanding of the transcription, the level of the user understanding of the transcription being determined based on behavior of the user; in response to the level of user understanding satisfying a threshold, obtaining, at the first device, a summary of the transcription; and presenting, on a display, at least one of the summary and the transcription.
2. The method of claim 1, wherein obtaining the summary of the transcription includes the first device generating the summary during the communication session using the transcription of the communication session.
3. The method of claim 1, wherein both the summary and the transcription are presented on the display simultaneously.
4. The method of claim 1, further comprising: obtaining, at the first device, the audio of the communication session; and sending, from the first device, the audio to a transcription system, wherein the first device obtains the transcription from the transcription system.
5. The method of claim 1, wherein the behavior of the user used to determine the level of the user understanding of the transcription includes one or more of: image data of the user and sound data from the audio.
6. The method of claim 5, wherein the image data includes one or more of facial expressions of the user and a location of focus of eyes of the user.
7. The method of claim 5, wherein the sound data includes one or more of words spoken by the user and audio characteristics of speech of the user.
8. The method of claim 1, further comprising ceasing to present the summary in response to an indication of an occurrence of an event associated with the communication session.
9. A system comprising: a display; at least one processor coupled to the display and configured to direct data to be presented on the display; and at least one computer-readable media coupled to the processor and configured to store one or more instructions that when executed by the processor cause or direct the system to perform operations comprising: obtaining a transcription of audio of a communication session involving the system; obtaining a level of user understanding of the transcription, the level of the user understanding of the transcription being determined based on behavior of the user; in response to the level of user understanding satisfying a threshold, obtaining a summary of the transcription; and directing presentation on the display of at least one of the summary and the transcription.
10. The system of claim 9, wherein obtaining the summary of the transcription includes generating the summary during the communication session using the transcription of the communication session.
11. The system of claim 9, wherein both the summary and the transcription are directed to be presented on the display simultaneously.
12. The system of claim 9, wherein the behavior of the user used to determine the level of the user understanding of the transcription includes one or more of: image data of the user and sound data from the audio.
13. The system of claim 12, wherein the image data includes one or more of facial expressions of the user and a location of focus of eyes of the user.
14. The system of claim 12, wherein the sound data includes one or more of words spoken by the user and audio characteristics of speech of the user.
15. The system of claim 9, wherein the operations further comprise: obtaining the audio of the communication session; and sending the audio to a transcription system, wherein the system obtains the transcription from the transcription system.
16. A system comprising: at least one processor; and at least one computer-readable media coupled to the processor and configured to store one or more instructions that when executed by the processor cause or direct the system to perform operations, the operations comprising: obtaining a transcription of audio of a communication session involving a first device; obtaining a level of user understanding of the transcription, the level of the user understanding of the transcription being determined based on behavior of the user; in response to the level of user understanding satisfying a threshold, obtaining a summary of the transcription; and providing one or more of the transcription and the summary to the first device for presentation.
17. The system of claim 16, wherein the operations to provide one or more of the transcription and the summary includes providing both the transcription and the summary to the first device for presentation during the communication session.
18. The system of claim 16, wherein the behavior of the user used to determine the level of the user understanding of the transcription includes one or more of: image data of the user and sound data from the audio.
19. The system of claim 18, wherein the image data includes one or more of facial expressions of the user and a location of focus of eyes of the user.
20. The system of claim 18, wherein the sound data includes one or more of words spoken by the user and audio characteristics of speech of the user.
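The gating logic common to the independent claims (estimate understanding from behavior, compare against a threshold, then obtain and present a summary) might look something like the sketch below. Everything specific here is an assumption made for illustration: the `Behavior` fields, the weights in `understanding_level`, the default threshold of 0.5, and the reading of "satisfying a threshold" as the level being at or below it. The claims name the behavior sources (image data and sound data) but do not specify any scoring scheme.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Behavior:
    """Hypothetical behavior signals; the claims name the sources, not this encoding."""
    looks_confused: bool    # from image data (facial expression)
    eyes_on_display: bool   # from image data (location of eye focus)
    asked_to_repeat: bool   # from sound data (words spoken by the user)


def understanding_level(b: Behavior) -> float:
    """Toy score in [0, 1]; the weights are illustrative assumptions only."""
    level = 1.0
    if b.looks_confused:
        level -= 0.4
    if not b.eyes_on_display:
        level -= 0.2
    if b.asked_to_repeat:
        level -= 0.4
    return max(level, 0.0)


def present(
    transcription: str,
    behavior: Behavior,
    summarize: Callable[[str], str],
    threshold: float = 0.5,
    session_event: bool = False,
) -> Dict[str, Optional[str]]:
    """Return what would be shown on the display.

    Assumption: the threshold is "satisfied" when understanding is at or
    below it. `session_event` models the claim 8 case where an event
    associated with the session stops the summary from being presented.
    """
    level = understanding_level(behavior)
    show_summary = level <= threshold and not session_event
    summary = summarize(transcription) if show_summary else None
    return {"transcription": transcription, "summary": summary}


if __name__ == "__main__":
    struggling = Behavior(looks_confused=True, eyes_on_display=True, asked_to_repeat=True)
    shown = present("Take two tablets daily. Call if symptoms persist.",
                    struggling, lambda t: t.split(".")[0] + ".")
    print(shown["summary"])
```

Keeping the scoring function separate from the presentation decision means the behavior model can be swapped out (for example, for a trained classifier) without touching the threshold logic.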
Speech to text systems (G10L15/08 takes precedence) · CPC title
Details of the transformation process · CPC title
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title
Summarisation for human users · CPC title