US2023326466A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023326466-A1 |
| Application number | US-202118043514-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 24, 2021 |
| Priority date | Aug 31, 2020 |
| Publication date | Oct 12, 2023 |
| Grant date | — |
Official abstract text for this publication.
Provided are a text processing method and apparatus, an electronic device, and a medium. The method includes the following: target text information generated based on audio information is acquired; a to-be-error-corrected word in the target text information and a target candidate replacement word corresponding to the to-be-error-corrected word are determined; and a target replacement word corresponding to the to-be-error-corrected word is determined according to the target candidate replacement word, and the target text information is updated based on the target replacement word.
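The three steps in the abstract (acquire the transcript, find a suspect word with a candidate replacement, then decide on the final replacement and update the text) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `correct_text`, `detect`, `choose`, and the toy `CONFUSION_LEXICON` do not come from the publication, and whitespace tokenization is a simplification.

```python
from typing import Callable, Optional


def correct_text(
    target_text: str,
    detect: Callable[[str], Optional[str]],
    choose: Callable[[str, str], str],
) -> str:
    """Apply a detect-then-choose correction pass to transcribed text.

    detect(word) returns a candidate replacement, or None when the word
    is not considered suspect. choose(word, candidate) returns the word
    that should finally appear in the text.
    """
    words = target_text.split()  # assumption: whitespace tokenization
    for i, word in enumerate(words):
        candidate = detect(word)
        if candidate is None:
            continue  # not a to-be-error-corrected word
        words[i] = choose(word, candidate)
    return " ".join(words)


# Hypothetical lexicon mapping a misrecognized form to its intended form.
CONFUSION_LEXICON = {"meat": "meet"}

# Here the candidate is always accepted; a real system could score it first.
fixed = correct_text(
    "lets meat at noon",
    CONFUSION_LEXICON.get,
    lambda wrong, candidate: candidate,
)
print(fixed)
```

Keeping `detect` and `choose` as separate callables mirrors the split in the abstract between proposing a candidate and deciding whether to use it.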
Opening claim text (preview).
1. A text processing method, comprising: acquiring target text information generated based on audio information; determining a to-be-error-corrected word in the target text information and a target candidate replacement word corresponding to the to-be-error-corrected word; and determining, according to the target candidate replacement word, a target replacement word corresponding to the to-be-error-corrected word, and updating the target text information based on the target replacement word.
2. The method according to claim 1, before acquiring the target text information generated based on the audio information, further comprising: collecting the audio information of a speaker and converting the audio information to corresponding text information; and generating, according to the text information, a speech timestamp corresponding to the speaker, and an identifier of the speaker, current text content displayed on a client, and determining the target text information based on the current text content.
3. The method according to claim 2, wherein acquiring the target text information generated based on the audio information comprises: determining a timestamp of text content without error correction among all text content, and acquiring text content without error correction within a preset duration based on the timestamp; and determining the target text information based on the text content without error correction within the preset duration.
4. The method according to claim 3, wherein the all text content is determined based on text information displayed in a preset region of the client, or the all text content is retrieved from a speech-to-text module.
5. The method according to claim 1, wherein determining the to-be-error-corrected word in the target text information and the target candidate replacement word corresponding to the to-be-error-corrected word comprises: determining the to-be-error-corrected word in the target text information and the target candidate replacement word corresponding to the to-be-error-corrected word in a correction manner corresponding to a correction type, to determine the target replacement word based on the target candidate replacement word, wherein the error correction type comprises a type of text-pronunciation-based error correction and a type of text-content-based error correction.
6. The method according to claim 5, wherein the error correction type comprises the type of text-pronunciation-based error correction, and adopting the error correction manner corresponding to the error correction type to determine the to-be-error-corrected word in the target text information and the target candidate replacement word corresponding to the to-be-error-corrected word comprises: acquiring a pronunciation of each piece of text in the target text information; determining, according to the pronunciation of the each piece of text and hot words pre-stored in a hot word dictionary, whether a target hot word corresponding to the pronunciation of the each piece of text exists in the target text information, wherein the hot word dictionary is configured to store a plurality of hot words, and the plurality of hot words are determined based on audio information and text information that are collected in a real-time interactive process; and in response to the target hot word corresponding to the pronunciation of the each piece of text existing in the target text information, determining the to-be-error-corrected word in the target text information according to the target hot word, and determining the target candidate replacement word based on the to-be-error-corrected word.
7. The method according to claim 6, wherein the type of text-pronunciation-based error correction comprises a type of pinyin-based error correction.
8. The method according to claim 6, wherein determining, according to the target candidate replacement word, the target replacement word corresponding to the to-be-error-corrected word, and updating the target text information based on the target replacement word comprises: acquiring a first to-be-processed sentence to which the to-be-error-corrected word belongs in the target text information, and updating the first to-be-processed sentence based on the target candidate replacement word to acquire a second to-be-processed sentence; determining a perplexity value of the second to-be-processed sentence; in response to the perplexity value being greater than or equal to a preset perplexity threshold, determining the target replacement word according to the to-be-error-corrected word; and in response to the perplexity value being less than the preset perplexity threshold, determining the target replacement word according to the target candidate replacement word.
9. The method according to claim 5, wherein the error correction type comprises the type of text-content-based error correction, and adopting the error correction manner corresponding to the error correction type to determine the to-be-error-corrected word in the target text information and the target candidate replacement word corresponding to the to-be-error-corrected word so as to determine the target replacement word based on the target candidate replacement word comprises at least one of: performing matching for text content in the target text information based on a pre-determined confusion word lexicon to determine a confusion word in the target text information, and determining the to-be-error-corrected word and the corresponding target candidate replacement word based on the confusion word; or determining a suspected to-be-error-corrected word in the target text information based on a pre-determined common word lexicon, and determining the to-be-error-corrected word and the corresponding target candidate replacement word based on the suspected to-be-error-corrected word.
10. The method according to claim 9, wherein performing the matching for the text content in the target text information based on the pre-determined confusion word lexicon to determine the confusion word in the target text information, and determining the to-be-error-corrected word and the corresponding target candidate replacement word based on the confusion word comprises: determining the confusion word in the target text information based on the pre-determined confusion word lexicon, and taking a word in the pre-determined confusion word lexicon corresponding to the confusion word as the target candidate replacement word; and determining, according to the target candidate replacement word, the target replacement word corresponding to the to-be-error-corrected word, and updating the target text information based on the target replacement word comprises: determining the target replacement word according to the target candidate replacement word; and updating, based on the target replacement word, the to-be-error-corrected word in the target text information corresponding to the target replacement word.
11. The method according to claim 9, before determining the suspected to-be-error-corrected word in the target text information based on the pre-determined common word lexicon, and determining the to-be-error-corrected word and the corresponding target candidate replacement word based on the suspected to-be-error-corrected word, further comprising: segmenting a sentence in the target text information to acquire at least one key word to determine the to-be-error-corrected word from the at least one key word.
12. The method according to claim 11, wherein determining the to-be-error-corrected word in the target text information and the target candidate replacement word cor
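The pronunciation-based path in claims 6 and 8 can be sketched as follows: a word that sounds like a stored hot word but is spelled differently becomes the to-be-error-corrected word, and the hot word is accepted only if the rewritten sentence scores below a perplexity threshold. This is a rough illustration under stated assumptions, not the claimed implementation. The `PRONUNCIATIONS` table stands in for a real pronunciation or pinyin converter, the `perplexity` callable stands in for any language model, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

# Toy pronunciation table (assumption): a real system would derive
# pronunciations, for example pinyin, from the text itself.
PRONUNCIATIONS: Dict[str, str] = {
    "colonel": "K ER N AH L",
    "kernel": "K ER N AH L",
}


def find_hot_word_candidates(
    words: List[str], hot_words: Set[str]
) -> List[Tuple[int, str]]:
    """Return (index, hot_word) pairs for words that sound like a hot word."""
    by_sound = {PRONUNCIATIONS[h]: h for h in hot_words if h in PRONUNCIATIONS}
    hits = []
    for i, word in enumerate(words):
        hot = by_sound.get(PRONUNCIATIONS.get(word, ""))
        if hot is not None and hot != word:
            hits.append((i, hot))
    return hits


def correct_sentence(
    sentence: str,
    hot_words: Set[str],
    perplexity: Callable[[str], float],
    threshold: float,
) -> str:
    """Replace sound-alike words with hot words when perplexity allows it."""
    words = sentence.split()  # the "first to-be-processed sentence"
    for i, candidate in find_hot_word_candidates(words, hot_words):
        trial = words[:i] + [candidate] + words[i + 1:]  # the "second" sentence
        if perplexity(" ".join(trial)) < threshold:
            words = trial  # below threshold: use the candidate
        # at or above threshold: keep the original word
    return " ".join(words)


# Stub scorer for illustration only; a real scorer would be a language model.
def toy_perplexity(sentence: str) -> float:
    return 20.0 if "kernel" in sentence else 90.0


accepted = correct_sentence("the linux colonel update", {"kernel"}, toy_perplexity, 50.0)
rejected = correct_sentence("the linux colonel update", {"kernel"}, toy_perplexity, 10.0)
print(accepted)
print(rejected)
```

With the stub scorer, a threshold of 50.0 lets the hot word through, while a threshold of 10.0 keeps the original word, matching the two branches described in claim 8.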
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions · CPC title
Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction · CPC title
Use of phonemic categorisation or speech recognition prior to speaker recognition or verification · CPC title
Orthographic correction, e.g. spell checking or vowelisation · CPC title