Machine learning collaboration techniques
US-2024420212-A1 · Dec 19, 2024 · US
US2020364576A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020364576-A1 |
| Application number | US-201916411490-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 14, 2019 |
| Priority date | May 14, 2019 |
| Publication date | Nov 19, 2020 |
| Grant date | — |
Official abstract text for this publication.
The present disclosure relates to utilizing a deep recurrent neural network for accurately performing punctuation restoration. For example, the disclosed systems can provide a sequence of words to a punctuation restoration neural network having multiple bi-directional recurrent layers and one or more neural attention mechanisms. In one or more embodiments, the punctuation restoration neural network incorporates layer-wise attentions and/or multi-head attention. The disclosed systems can utilize the punctuation restoration neural network to generate probabilities for each word, indicating the likelihood that each possible punctuation mark is associated with that word. Based on these probabilities, the disclosed systems can generate a punctuated transcript that includes punctuation before the appropriate words.
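The last step of the abstract, turning per-word punctuation probabilities into a punctuated transcript, can be illustrated with a minimal sketch. The label set, the `punctuate` function name, the probability values, and the greedy argmax decoding are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Sequence

# Hypothetical label set; the source does not enumerate the punctuation marks.
# The empty string stands for "no punctuation before this word".
LABELS = ["", ",", ".", "?"]


def punctuate(words: Sequence[str],
              probs: Sequence[Sequence[float]],
              labels: Sequence[str] = LABELS) -> str:
    """Place the most probable mark before each word (greedy argmax, an assumption)."""
    if len(words) != len(probs):
        raise ValueError("need one probability row per word")
    pieces = []
    for word, row in zip(words, probs):
        best = max(range(len(labels)), key=lambda i: row[i])
        mark = labels[best]
        if mark and pieces:
            # A mark placed before this word visually trails the previous word.
            pieces[-1] += mark
        elif mark:
            pieces.append(mark)
        pieces.append(word)
    return " ".join(pieces)


# Made-up probabilities: one row per word, one column per label.
words = ["hello", "how", "are", "you"]
probs = [
    [0.90, 0.05, 0.03, 0.02],
    [0.20, 0.70, 0.05, 0.05],  # a comma is most likely before "how"
    [0.95, 0.02, 0.02, 0.01],
    [0.90, 0.05, 0.03, 0.02],
]
print(punctuate(words, probs))  # hello, how are you
```

In this sketch each row would come from the network's output for one word; how those probabilities are produced is the subject of the claims.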
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-readable medium storing instructions thereon that, when executed by at least one processor, cause a computing device to: generate, by each bi-directional recurrent neural network layer of a plurality of bi-directional recurrent neural network layers, a plurality of output states corresponding to words from a sequence of words; generate, utilizing one or more neural attention mechanisms, a plurality of attention outputs based on the plurality of output states; determine punctuation label probabilities for the words from the sequence of words based on the plurality of output states and the plurality of attention outputs; and generate a punctuated transcript comprising punctuation before one or more of the words from the sequence of words based on the punctuation label probabilities.
2. The non-transitory computer-readable medium of claim 1, wherein: the one or more neural attention mechanisms comprise a neural attention mechanism for each bi-directional recurrent neural network layer; and the instructions, when executed by the at least one processor, cause the computing device to generate, by each neural attention mechanism, a layer-wise attention weight for each output state from the plurality of output states of a corresponding bi-directional recurrent neural network layer.
3. The non-transitory computer-readable medium of claim 2, wherein the instructions, when executed by the at least one processor, cause the computing device to generate the plurality of attention outputs by concatenating, for a given state, the layer-wise attention weight corresponding to the state from each neural attention mechanism.
4. The non-transitory computer-readable medium of claim 3, wherein: each neural attention mechanism from the one or more neural attention mechanisms comprises a multi-head neural attention mechanism; the instructions, when executed by the at least one processor, cause the computing device to: utilize each multi-head neural attention mechanism to generate a plurality of layer-wise attention weights for each state; and generate the plurality of attention outputs by concatenating, for a given state, the plurality of layer-wise attention weights corresponding to the state from each neural attention mechanism.
5. The non-transitory computer-readable medium of claim 1, wherein: the one or more neural attention mechanisms comprise a multi-head neural attention mechanism; the instructions, when executed by the at least one processor, cause the computing device to: utilize the multi-head neural attention mechanism to generate a plurality of attention weights for each state; and generate the plurality of attention outputs by concatenating, for a given state, the plurality of attention weights corresponding to the state.
6. The non-transitory computer-readable medium of claim 1, wherein the one or more neural attention mechanisms comprise one or more scaled dot-product neural attention mechanisms.
7. The non-transitory computer-readable medium of claim 1, further storing instructions that, when executed by the at least one processor, cause the computing device to generate a set of final states based on the plurality of output states.
8. The non-transitory computer-readable medium of claim 7, wherein the instructions, when executed by the at least one processor, cause the computing device to determine the punctuation label probabilities for the words from the sequence of words based on the set of final states and the plurality of attention outputs utilizing a fully connected layer with a SoftMax classifier to generate, for a given word of the sequence of words, a punctuation label probability for each of a plurality of punctuation marks.
9. The non-transitory computer-readable medium of claim 1, wherein the instructions, when executed by the at least one processor, cause the computing device to generate, by a given bi-directional recurrent neural network layer, the plurality of output states by: generating a plurality of forward states by processing embeddings of the sequence of words in a forward direction utilizing a forward recurrent neural network layer of the given bi-directional recurrent neural network layer; generating a plurality of backward states by processing the embeddings of the sequence of words in a backward direction utilizing a backward recurrent neural network layer of the given bi-directional recurrent neural network layer; and combining, for each state, a forward state and a backward state corresponding to the state.
10. The non-transitory computer-readable medium of claim 1, further storing instructions that, when executed by the at least one processor, cause the computing device to perform a language understanding task based on the punctuated transcript, the language understanding task comprising at least one of generating a translation, generating a transcript summary, determining an answer to a question, performing sentiment analysis, performing syntactic parsing, or extracting information.
11. A system comprising: a memory comprising a punctuation restoration neural network trained to generate punctuation label probabilities, the punctuation restoration neural network comprising a plurality of bi-directional recurrent neural network layers and one or more neural attention mechanisms; at least one processor; and at least one non-transitory computer-readable medium storing instructions thereon that, when executed by the at least one processor, cause the system to: generate, by each bi-directional recurrent neural network layer of the punctuation restoration neural network, a plurality of output states by generating forward states and backward states and combining the forward states and backward states, wherein each state corresponds to words from a sequence of words; generate a set of final states based on the plurality of output states utilizing a gated recurrent unit of the punctuation restoration neural network; generate, utilizing one or more neural attention mechanisms, a plurality of attention outputs by combining the plurality of output states from each bi-directional layer and the set of final states; determine punctuation label probabilities for the words from the sequence of words based on the set of final states and the plurality of attention outputs; and generate a punctuated transcript comprising punctuation before one or more of the words from the sequence of words based on the punctuation label probabilities.
12. The system of claim 11, wherein: the one or more neural attention mechanisms comprise a plurality of neural attention mechanisms; each neural attention mechanism from the plurality of neural attention mechanisms corresponds to a bi-directional recurrent neural network layer from the plurality of bi-directional recurrent neural network layers; and the instructions, when executed by the at least one processor, cause the system to generate, by each neural attention mechanism, a layer-wise attention weight for each output state from the plurality of output states of a corresponding bi-directional recurrent neural network layer.
13. The system of claim 12, wherein: each neural attention mechanism from the plurality of neural attention mechanisms comprises a multi-head neural attention mechanism; the instructions, when executed by the at least one processor, cause the system to: utilize each multi-head neural attention mechanism to generate a plurality of layer-wise attention weights for each state; and generate the plurality of attention outputs by combining, for a given state, the pluralit