Automated meeting minutes generation service
US-2021375291-A1 · Dec 2, 2021 · US
US12033619B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12033619-B2 |
| Application number | US-202017095797-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 12, 2020 |
| Priority date | Nov 12, 2020 |
| Publication date | Jul 9, 2024 |
| Grant date | Jul 9, 2024 |
Official abstract text for this publication.
The exemplary embodiments disclose a method, a computer program product, and a computer system for transcribing media. The exemplary embodiments may include collecting media, extracting one or more features from the media, and transcribing the media based on the extracted one or more features and one or more models.
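The three steps named in the abstract (collect, extract features, transcribe with a model) can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `Media`, `Features`, `extract_features`, `outline_model`, and `transcribe` are hypothetical, the media is assumed to be already-recognised utterances, and a simple word count stands in for the real feature extractor.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Media:
    """Collected media; assumed here to be already-recognised utterances."""
    utterances: List[str]


@dataclass
class Features:
    """Hypothetical, simplified feature container."""
    word_counts: Dict[str, int] = field(default_factory=dict)


def extract_features(media: Media) -> Features:
    # Stand-in for the feature extractor: count lower-cased words.
    counts: Dict[str, int] = {}
    for utterance in media.utterances:
        for word in utterance.lower().split():
            counts[word] = counts.get(word, 0) + 1
    return Features(word_counts=counts)


# In this sketch a "model" is any callable mapping (media, features) to text.
Model = Callable[[Media, Features], str]


def outline_model(media: Media, features: Features) -> str:
    # Toy model: emit each utterance as one outline bullet.
    return "\n".join(f"- {u}" for u in media.utterances)


def transcribe(media: Media, model: Model) -> str:
    # Extract features first, then hand media and features to the model.
    features = extract_features(media)
    return model(media, features)


media = Media(utterances=["Welcome to the lecture", "Today we cover graphs"])
transcript = transcribe(media, outline_model)
print(transcript)
```

Passing the model in as a callable keeps the extraction step independent of the output format, which loosely mirrors the abstract's separation of "extracted features" from "one or more models".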
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for transcribing media, the method comprising: collecting media of a user, wherein the media comprises content of a presentation given by the user; extracting one or more features from the media, wherein the extracting is performed using machine learning techniques comprising a convolutional neural network and long short-term memory to parse the collected media and extract the one or more features, and wherein the one or more extracted features comprise one or more speech features; determining, using one or more models, a transcription style based on the one or more extracted speech features, wherein the one or more models are trained through use of a feedback loop to weight the one or more extracted speech features such that features having a greater correlation with determined particular transcription styles are weighted greater than other features, and wherein the transcription style specifies a transcription format; transcribing, using the one or more models, the media, according to the determined transcription style, based on the one or more features, and their associated weights, wherein one or more text portions of the transcription are highlighted and bolded based on respective importance values extracted from the one or more features, and wherein an importance value of a text portion of the transcription indicates whether or not a topic of the text portion will be on an exam; notifying the user of the highlighted and bolded transcription in the determined particular style via a device of the user, wherein the notifying is performed according to preferences of the user; and receiving, from the user, confirmation of an accuracy of the transcription and approval of the transcription prior to notifying one or more other users of the transcription.

2. The method of claim 1, wherein the one or more models correlate the one or more features with an appropriate transcription style and appropriately transcribing the media.

3. The method of claim 1, further comprising receiving feedback indicative of whether the transcription was accurate; and adjusting the one or more models based on the received feedback.

4. The method of claim 1, further comprising: collecting training data; extracting training features from the training data; and training the one or more models based on the extracted training features.

5. The method of claim 1, wherein the transcription style is selected from a group comprising a transcription, outline, summary, presentation with notes, blog with comments, and tutorial with examples.

6. The method of claim 1, wherein: the user is notified of the transcription along with audio or video of the media; and the transcription notification is synchronized with the audio or video of the media, wherein the synchronization is based on the media's content.

7. The method of claim 1, wherein the transcription includes one or more timestamps.

8. The method of claim 1, wherein the transcription is searchable by the user.

9. The method of claim 1, wherein the one or more features include topics, importance, frequency, vocabulary, tones, moods, pointing, waving, facial expressions, eye direction, and eye movement.

10. A computer program product for transcribing media, the computer program product comprising: one or more non-transitory computer-readable storage media and program instructions stored on the one or more non-transitory computer-readable storage media capable of performing a method, the method comprising: collecting media of a user, wherein the media comprises content of a presentation given by the user; extracting one or more features from the media, wherein the extracting is performed using machine learning techniques comprising a convolutional neural network and long short-term memory to parse the collected media and extract the one or more features, and wherein the one or more extracted features comprise one or more speech features; determining, using one or more models, a transcription style based on the one or more extracted speech features, wherein the one or more models are trained through use of a feedback loop to weight the one or more extracted speech features such that features having a greater correlation with determined particular transcription styles are weighted greater than other features, and wherein the transcription style specifies a transcription format; transcribing, using the one or more models, the media, according to the determined transcription style, based on the one or more features and their associated weights, wherein one or more text portions of the transcription are highlighted and bolded based on respective importance values extracted from the one or more features, and wherein an importance value of a text portion of the transcription indicates whether or not a topic of the text portion will be on an exam; notifying the user of the highlighted and bolded transcription in the determined particular style via a device of the user, wherein the notifying is performed according to preferences of the user; and receiving, from the user, confirmation of an accuracy of the transcription and approval of the transcription prior to notifying one or more other users of the transcription.

11. The computer program product of claim 10, wherein the one or more models correlate the one or more features with an appropriate transcription style and appropriately transcribing the media.

12. A computer system for transcribing media, the computer system comprising: one or more computer processors, one or more computer-readable storage media, and program instructions stored on the one or more of the computer-readable storage media for execution by at least one of the one or more processors capable of performing a method, the method comprising: collecting media of a user, wherein the media comprises content of a presentation given by the user; extracting one or more features from the media, wherein the extracting is performed using machine learning techniques comprising a convolutional neural network and long short-term memory to parse the collected media and extract the one or more features, and wherein the one or more extracted features comprise one or more speech features; determining, using one or more models, a transcription style based on the one or more extracted speech features, wherein the one or more models are trained through use of a feedback loop to weight the one or more extracted speech features such that features having a greater correlation with determined particular transcription styles are weighted greater than other features, and wherein the transcription style specifies a transcription format; transcribing, using the one or more models, the media, according to the determined transcription style, based on the one or more features and their associated weights, wherein one or more text portions of the transcription are highlighted and bolded based on respective importance values extracted from the one or more features, and wherein an importance value of a text portion of the transcription indicates whether or not a topic of the text portion will be on an exam; notifying the user of the highlighted and bolded transcription in the determined particular style via a device of the user, wherein the notifying is performed according to preferences of the user; and receiving, from the user, confirmation of an accuracy of the transcription and approval of the transcription prior to notifying one or more other users of the transcription.

13. The computer system of claim 12, wherein the one or more models correlate the one or more features w
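The independent claims recite two mechanisms that a short example may make easier to picture: a feedback loop that weights features by how well they correlate with a transcription style, and bolding of text portions by importance value. The sketch below is illustrative only. It swaps in a plain multiclass perceptron update as the feedback rule, which the claims do not specify, and every name in it (`StyleSelector`, `feedback`, `highlight`, the feature names, the 0.5 threshold) is a hypothetical choice. Markdown bold stands in for "highlighted and bolded".

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A subset of the styles listed in claim 5.
STYLES = ["outline", "summary", "tutorial with examples"]


class StyleSelector:
    """Toy style chooser trained by a perceptron-style feedback loop."""

    def __init__(self, styles: List[str], learning_rate: float = 1.0) -> None:
        self.styles = styles
        self.learning_rate = learning_rate
        # weights[style][feature_name] -> weight, all starting at zero.
        self.weights: Dict[str, Dict[str, float]] = {
            style: defaultdict(float) for style in styles
        }

    def score(self, style: str, features: Dict[str, float]) -> float:
        return sum(self.weights[style][name] * value
                   for name, value in features.items())

    def predict(self, features: Dict[str, float]) -> str:
        # max() keeps the first style on ties, so list order breaks ties.
        return max(self.styles, key=lambda s: self.score(s, features))

    def feedback(self, features: Dict[str, float], correct_style: str) -> str:
        """Apply one round of user feedback; returns the style predicted before it."""
        predicted = self.predict(features)
        if predicted != correct_style:
            # Raise weights tying these features to the confirmed style and
            # lower the weights that produced the wrong prediction.
            for name, value in features.items():
                self.weights[correct_style][name] += self.learning_rate * value
                self.weights[predicted][name] -= self.learning_rate * value
        return predicted


def highlight(portions: List[Tuple[str, float]], threshold: float = 0.5) -> str:
    """Bold every text portion whose importance value meets the threshold."""
    return " ".join(
        f"**{text}**" if importance >= threshold else text
        for text, importance in portions
    )


selector = StyleSelector(STYLES)
lecture = {"question_rate": 0.1, "example_phrases": 0.9}  # hypothetical speech features
selector.feedback(lecture, "tutorial with examples")
chosen = selector.predict(lecture)
marked = highlight([("Graphs are defined as", 0.2), ("Dijkstra's algorithm", 0.9)])
print(chosen)
print(marked)
```

After a single correction, the features that co-occurred with the confirmed style carry the largest weights for that style, which is the qualitative behaviour the claims describe; a production system would presumably use a learned neural model rather than this linear rule.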
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Training · CPC title