US2023083000A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023083000-A1 |
| Application number | US-202217895818-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 25, 2022 |
| Priority date | Aug 27, 2021 |
| Publication date | Mar 16, 2023 |
| Grant date | — |
Official abstract text for this publication.
OCR-text correction system and method embodiments are described. The OCR-text correction embodiments comprise or cooperate with a transformer-based sequence-to-sequence language model. The model is pretrained to denoise corrupted text and is fine-tuned using OCR-correction-specific examples. Text obtained at least in part through OCR is applied to the fine-tuned pretrained transformer model to detect at least one error in a subset of the text. Responsive to detecting the at least one error, the fine-tuned pretrained transformer model outputs an updated subset of the text to correct the at least one error.
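As a rough illustration of the detect-and-output flow in the abstract, the sketch below splits OCR text into sentence-sized subsets, passes each one through a correction function, and reports only the subsets that changed. The names `Correction`, `split_into_subsets`, `correct_document`, and `toy_corrector` are hypothetical and do not come from the publication. `toy_corrector` is a lookup-table stand-in for the fine-tuned transformer, and sentence-level granularity is an assumption made for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Correction:
    index: int      # position of the subset within the document
    original: str   # subset as produced by OCR
    updated: str    # subset returned by the corrector


def split_into_subsets(text: str) -> List[str]:
    """Split text into sentence-like subsets (assumed granularity)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def correct_document(text: str, corrector: Callable[[str], str]) -> List[Correction]:
    """Apply the corrector to each subset; a changed output marks a detected error."""
    corrections = []
    for i, subset in enumerate(split_into_subsets(text)):
        updated = corrector(subset)
        if updated != subset:
            corrections.append(Correction(i, subset, updated))
    return corrections


def toy_corrector(subset: str) -> str:
    """Hypothetical stand-in for the fine-tuned model: fixes two known OCR slips."""
    fixes = {"tbe": "the", "rnodel": "model"}
    return " ".join(fixes.get(word, word) for word in subset.split(" "))


if __name__ == "__main__":
    ocr_text = "This is tbe first sentence. The rnodel reads it. Nothing wrong here."
    for c in correct_document(ocr_text, toy_corrector):
        print(c.index, repr(c.original), "->", repr(c.updated))
```

Because detection here is simply "the output differs from the input", detection and correction happen in one pass over each subset. A real system would replace `toy_corrector` with a call to the fine-tuned sequence-to-sequence model.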
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: accessing a machine learning model that is pretrained by a first training dataset, the machine learning model pretrained to perform a non-optical character recognition (non-OCR) task; adjusting the machine learning model using a second training dataset, the second training dataset comprising OCR samples, the machine learning model adjusted to perform an OCR task; receiving a document that includes text obtained at least in part through OCR; applying the adjusted machine learning model to the text to detect at least one error in a subset of the text; and outputting an updated subset of the text to correct the at least one error in the subset of the text.
2. The method of claim 1, wherein the pretrained transformer model is bidirectional autoregressive transformer model, the bidirectional autoregressive transformer model including: a bidirectional encoder configured to receive the text; and an autoregressive decoder configured to detect the at least one error in the text and correct the at least one error in the text by predicting original text.
3. The method of claim 1, wherein the first training dataset includes one or more of the following: token masking, token deletion, sentence permutation, document rotation, or text infilling.
4. The method of any of claim 1, wherein the second training dataset includes monograph and periodical example sentences.
5. The method of claim 1, wherein the fine-tuned pretrained transformer model is configured to perform the detection and correction of the at least one error in a single step.
6. The method of claim 1, wherein the transformer model is configured to correct the at least one error in the text without being trained on alignment characters.
7. The method of claim 1, wherein the first training dataset comprises fewer than 1,000 documents.
8. The method of claim 1, wherein the at least one error includes an oversegmentation error caused by incorrectly segmenting a single word into two separate words by OCR.
9. The method of claim 1, wherein the at least one error includes an undersegmentation error caused by incorrectly combining a plurality of words into a single word by OCR.
10. The method of claim 1, wherein the at least one error includes a misrecognized character error caused by incorrectly recognizing a character by OCR.
11. The method of claim 1, wherein the at least one error includes a missing character error caused by incorrectly omitting a character by OCR.
12. The method of claim 1, wherein the at least one error includes a hallucination error caused by incorrectly inserting a non-existing character by OCR.
13. A computer system for detecting and/or correcting text, comprising: a processor; and memory in communication with the processor, the memory configured to store instructions that, when executed by the processor, cause the processor to: access a pretrained transformer model pretrained using a first training dataset; fine-tune the pretrained transformer model using a second training dataset; provide text obtained at least in part through optical character recognition (OCR); apply the text to the fine-tuned pretrained transformer model to detect at least one error in a subset of the text; and output an updated subset of the text by the fine-tuned pretrained transformer model to correct the at least one error in the subset of the text.
14. The computer system of claim 13, wherein the pretrained transformer model is bidirectional autoregressive transformer model including: a bidirectional encoder configured to receive the text; and an autoregressive decoder configured to detect the at least one error in the text and correct the at least one error in the text by predicting original text.
15. The computer system of claim 13 wherein the first training dataset includes one or more of the following: token masking, token deletion, sentence permutation, document rotation, or text infilling.
16. The computer system of claim 13, wherein the second training dataset includes monograph and periodical example sentences.
17. The computer system of claim 13, wherein the fine-tuned pretrained transformer model is configured to perform the detection and correction of the at least one error in a single step.
18. The computer system of claim 13, wherein the transformer model is configured to correct the at least one error in the text without being trained on alignment characters.
19. The computer system of claim 13, wherein the first training dataset comprises fewer than 1,000 documents.
20. A non-transitory computer readable storage medium configured to store code comprising instructions, wherein the instructions, when executed by a processor, cause the processor to: access a pretrained transformer model pretrained using a first training dataset; fine-tune the pretrained transformer model using a second training dataset; provide text obtained at least in part through optical character recognition (OCR); apply the text to the fine-tuned pretrained transformer model to detect at least one error in a subset of the text; and output an updated subset of the text by the fine-tuned pretrained transformer model to correct the at least one error in the subset of the text.
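The five OCR error categories named in claims 8 through 12 can be made concrete with a small sketch. Each function below injects one category of error into clean text at a caller-chosen position, which is one way to build (noisy, clean) pairs for small tests. All function names and the sample sentence are hypothetical. The claims describe the second training dataset as comprising OCR samples, so synthetic injection is an assumption made here for illustration, not the method stated in the publication.

```python
from typing import List, Tuple


def oversegment(text: str, pos: int) -> str:
    """Oversegmentation: split one word in two by inserting a space at pos."""
    return text[:pos] + " " + text[pos:]


def undersegment(text: str, pos: int) -> str:
    """Undersegmentation: merge two words by deleting the space at pos."""
    if text[pos] != " ":
        raise ValueError("pos must point at a space")
    return text[:pos] + text[pos + 1:]


def misrecognize(text: str, pos: int, wrong: str) -> str:
    """Misrecognized character: replace the character at pos with a wrong reading."""
    return text[:pos] + wrong + text[pos + 1:]


def drop_character(text: str, pos: int) -> str:
    """Missing character: omit the character at pos."""
    return text[:pos] + text[pos + 1:]


def hallucinate(text: str, pos: int, extra: str) -> str:
    """Hallucination: insert a character that is not in the source at pos."""
    return text[:pos] + extra + text[pos:]


def build_pairs(clean: str) -> List[Tuple[str, str]]:
    """Return (noisy, clean) pairs, one per error category, for a fixed sample."""
    noisy_versions = [
        oversegment(clean, 7),         # "mod ern"
        undersegment(clean, 3),        # "themodern"
        misrecognize(clean, 4, "rn"),  # "m" read as "rn"
        drop_character(clean, 12),     # "lbrary"
        hallucinate(clean, 10, "'"),   # stray apostrophe
    ]
    return [(noisy, clean) for noisy in noisy_versions]


if __name__ == "__main__":
    for noisy, clean in build_pairs("the modern library"):
        print(f"{noisy!r} -> {clean!r}")
```

Positions are passed explicitly so the output is deterministic. A randomized version would sample positions and replacement characters instead.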
Techniques for post-processing, e.g. correcting the recognition result · CPC title
Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Evaluation of quality of the acquired characters · CPC title