US2022180071A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022180071-A1 |
| Application number | US-202117540768-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 2, 2021 |
| Priority date | Dec 4, 2020 |
| Publication date | Jun 9, 2022 |
| Grant date | — |
Abstract
Provided are a system and method for adaptive masking and non-directional language understanding and generation. The system for adaptive masking and non-directional language understanding and generation according to the present invention includes an encoder unit including an adaptive masking block for performing masking on training data, a language generator for restoring masked words, and an encoder for detecting whether or not the restored sentence construction words are original, and a decoder unit including a generation word position detector for detecting a position of a word to be generated next, a language generator for determining a word suitable for the corresponding position, and a non-directional training data generator for decoder training.
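The encoder-side training signal described in the abstract (mask, restore, then detect which words differ from the original) can be sketched as below. This is a minimal illustration under stated assumptions, not the patent's implementation: the `[MASK]` symbol, all function names, and the stand-in `generator` callable are hypothetical. The formula in `adaptive_mask_ratio` is one possible reading of the claimed behavior that the masking probability rises as the language generator improves so that input noise stays at or above a predetermined ratio; the publication does not give a formula.

```python
import random

MASK = "[MASK]"  # assumed special symbol; the source only says "a special symbol"


def mask_tokens(tokens, ratio, rng):
    """Replace roughly `ratio` of the tokens with MASK.

    Returns the masked copy and the sorted list of masked positions.
    """
    if not tokens:
        return [], []
    n_mask = min(len(tokens), max(1, round(len(tokens) * ratio)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for i in positions:
        masked[i] = MASK
    return masked, positions


def restore(masked, positions, generator):
    """Fill each masked position using `generator(masked, i)`.

    `generator` is a hypothetical stand-in for the language generator.
    """
    restored = list(masked)
    for i in positions:
        restored[i] = generator(masked, i)
    return restored


def change_labels(original, restored):
    """Per-token targets: 1 if the restored token differs from the original."""
    return [int(o != r) for o, r in zip(original, restored)]


def adaptive_mask_ratio(target_noise, generator_accuracy, max_ratio=0.9):
    """Assumed rule: expected noise is ratio * (1 - accuracy), so solve for ratio.

    A more accurate generator restores more words correctly, so the ratio
    must grow to keep the fraction of changed tokens near `target_noise`.
    """
    if generator_accuracy >= 1.0:
        return max_ratio
    return min(max_ratio, target_noise / (1.0 - generator_accuracy))
```

For example, calling `mask_tokens("the cat sat on the mat".split(), 0.3, random.Random(0))` masks two of the six words. Passing the result through `restore` with a toy generator and then `change_labels` yields the 0/1 targets that a change-detecting encoder would be trained on.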
Claims
What is claimed is:
1. A system for adaptive masking and non-directional language understanding and generation, the system comprising: an encoder unit including an adaptive masking block for performing masking on training data, a language generator for restoring masked words, and an encoder for detecting whether or not the restored sentence construction words are original; and a decoder unit including a generation word position detector for detecting a position of a word to be generated next, a language generator for determining a word suitable for the corresponding position, and a non-directional training data generator for decoder training.
2. The system of claim 1, wherein the adaptive masking block performs masking by converting a predetermined ratio of words into a special symbol.
3. The system of claim 1, wherein the language generator restores the masked words to obtain a converted input string.
4. The system of claim 3, wherein the encoder compares an input string with a converted input string to perform change token prediction.
5. The system of claim 1, wherein the decoder unit generates a word by inputting a context, determines a next word generation position by inputting the context and a pre-generated word, generates a next word by inputting the context and pre-generated word to the determined word generation position, and stops a non-directional language generation procedure when the generated word is a sentence termination symbol.
6. The system of claim 1, wherein the generation word position detector derives the position of the word to be generated next by inputting a current context and a generated partial result using non-directional training data having a corresponding language generation order.
7. The system of claim 1, wherein the non-directional training data generator derives a language generation order that is highly relevant to input context.
8. The system of claim 1, wherein the decoder unit performs parallel decoding at a time of language generation.
9. The system of claim 1, wherein, when masking is performed, the encoder adjusts a masking ratio by reflecting characteristics of a language generator in which training is in progress.
10. The system of claim 1, wherein, as performance of the language generator is improved, noise of an input sentence is maintained at a predetermined ratio or more by increasing a masking probability value for a construction vocabulary.
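The non-directional generation procedure recited in claim 5 can be read as an insertion loop: produce a first word from the context, then repeatedly pick a position, produce a word for that position, and stop at the sentence termination symbol. The sketch below illustrates that reading only. The `</s>` symbol, the function and parameter names, the `max_steps` safety limit, and the callable interfaces for the position detector and word generator are all assumptions made for illustration and do not come from the publication.

```python
EOS = "</s>"  # assumed sentence termination symbol


def generate_non_directional(context, position_detector, word_generator, max_steps=50):
    """Build a sentence by inserting words at predicted positions.

    position_detector(context, partial) -> insertion index into `partial`
    word_generator(context, partial, pos) -> word for that index, or EOS
    Both callables are hypothetical stand-ins for trained models.
    """
    partial = []
    for _ in range(max_steps):
        # The first word is generated from the context alone, so there is
        # no position to choose yet; later steps ask the detector.
        pos = position_detector(context, partial) if partial else 0
        pos = max(0, min(pos, len(partial)))  # keep the index in range
        word = word_generator(context, partial, pos)
        if word == EOS:
            break
        partial.insert(pos, word)
    return partial
```

With a scripted toy order such as `[("cat", 0), ("the", 0), ("sat", 2), (EOS, 3)]`, where each step is looked up by `len(partial)`, the loop first emits "cat", then inserts "the" before it, then appends "sat", giving `["the", "cat", "sat"]`. The words are generated in neither left-to-right nor right-to-left order.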
Natural language generation · CPC title
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title
Learning methods · CPC title
Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title