Automatic post-editing model for neural machine translation
US-2021019373-A1 · Jan 21, 2021 · US
US11735184B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11735184-B2 |
| Application number | US-202016937349-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 23, 2020 |
| Priority date | Jul 24, 2019 |
| Publication date | Aug 22, 2023 |
| Grant date | Aug 22, 2023 |
Official abstract text for this publication.
A speech recognition method including performing speech recognition on an inputted speech to obtain a first text, correcting the first text according to an obtained mapping relationship between words in different languages to obtain at least one second text, and, in response to determining that the at least one second text corresponds to the same language, outputting the first text, or, in response to determining that the at least one second text corresponds to different languages, determining an output text according to first probability values corresponding to each of the at least one second text. By combining the mapping relationships between words in different languages in correcting the initial ASR result, the present application ensures the accuracy of the final speech recognition result.
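The decision described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Candidate` type, the `language` tag on each candidate, and the choice of the highest first probability value in the mixed-language case are all assumptions made for the example. The abstract only says the output is determined "according to" those probability values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """A corrected "second text" (hypothetical container for this sketch)."""
    text: str
    language: str          # assumed language tag for the candidate
    p_correction: float    # the "first probability value"


def choose_output(first_text: str, candidates: Sequence[Candidate]) -> str:
    """Return the first text or a corrected text, following the abstract."""
    if not candidates:
        # Nothing to compare against; keep the raw ASR result.
        return first_text

    languages = {c.language for c in candidates}
    if len(languages) == 1:
        # All corrected texts are in the same language: output the first text.
        return first_text

    # Mixed languages: pick by first probability value.
    # Taking the maximum is an assumption of this sketch.
    return max(candidates, key=lambda c: c.p_correction).text
```

For example, two candidates tagged with the same language leave the ASR output untouched, while candidates tagged with different languages cause the higher-scoring correction to be returned.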
Opening claim text (preview).
What is claimed is:

1. An apparatus comprising: one or more processors; and one or more memories storing thereon computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform acts comprising: initializing parameters of a machine translation model according to parameters of a language model; training the machine translation model using training samples to obtain a trained machine translation model; performing speech recognition on an inputted speech to obtain a first text; correcting, by inputting the first text into the trained machine translation model, the first text according to a mapping relationship between words in different languages to obtain at least one second text; obtaining respective first probability values predicted by the trained machine translation model corresponding to respective second texts of the at least one second text; and determining an output text at least according to the respective first probability values corresponding to the respective second texts of the at least one second text, a respective first probability value representing a probability that the first text is corrected to a respective second text in the at least one second text, the determining the output text at least according to the respective first probability values corresponding to the respective second texts including: inputting the at least one second text into the language model to determine respective second probability values corresponding to the respective second texts of the at least one second text using the language model, a respective second probability value representing a reasonableness of grammar and semantics of the respective second text; determining the output text according to the respective first probability values and the respective second probability values corresponding to the respective second texts; in response to determining that the first text is consistent with a particular second text having a largest summed probability value, outputting the first text, a respective summed probability value of the respective second text representing a weighted sum of the respective first probability value and the respective second probability value corresponding to the respective second text; and in response to determining that the first text is inconsistent with the particular second text having the largest summed probability value, outputting the particular second text having the largest summed probability value.

2. The apparatus according to claim 1, wherein the training the machine translation model using the training samples to obtain the trained machine translation model comprises: acquiring a speech sample containing a plurality of languages; performing speech recognition on the speech sample to obtain a plurality of text candidates; forming a training sample from annotated texts corresponding to the plurality of text candidates and the speech sample; and training the machine translation model using the training sample to obtain the trained machine translation model.

3. The apparatus according to claim 2, wherein the correcting the first text comprises: inputting the first text into the trained machine translation model; and correcting the first text using the trained machine translation model.

4. The apparatus according to claim 1, wherein: the machine translation model is composed of an encoder and a decoder; and the encoder or the decoder includes any one of neural network models including: a recurrent neural network model, a long short-term memory network model, and a bidirectional long short-term memory network model.

5. The apparatus according to claim 2, wherein the acts further comprise: acquiring corpus samples corresponding to each of the plurality of languages; and training the language model using the corpus samples corresponding to each of the plurality of languages.

6. The method according to claim 1, further comprising predicting the respective first probability values using the trained machine translation model.

7. The apparatus according to claim 1, wherein: the mapping relationship between words in different languages comprises a mapping relationship between words in different dialects of a same language; and the at least one second text corresponds to the same language refers to that the at least one second text corresponds to a same dialect of the same language.

8. The apparatus according to claim 1, wherein the acts further comprise: in response to determining that the at least one second text includes a word not corresponding to the first language, determining a target second text according to the respective first probability values corresponding to each of the at least one second text; and translating the target second text into a second language.

9. A method comprising: initializing parameters of a machine translation model according to parameters of a language model; training the machine translation model using training samples to obtain a trained machine translation model; performing speech recognition on an inputted speech to obtain a first text; correcting, by inputting the first text into the trained machine translation model, the first text according to a mapping relationship between words in different languages to obtain at least one second text; obtaining respective first probability values predicted by the trained machine translation model corresponding to respective second texts of the at least one second text; and determining an output text at least according to the respective first probability values corresponding to the respective second texts of the at least one second text, a respective first probability value representing a probability that the first text is corrected to a respective second text in the at least one second text, the determining the output text at least according to the respective first probability values corresponding to the respective second texts including: inputting the at least one second text into the language model to determine respective second probability values corresponding to the respective second texts of the at least one second text using the language model, a respective second probability value representing a reasonableness of grammar and semantics of the respective second text; determining the output text according to the respective first probability values and the respective second probability values corresponding to the respective second texts; in response to determining that the first text is consistent with a particular second text having a largest summed probability value, outputting the first text, a respective summed probability value of the respective second text representing a weighted sum of the respective first probability value and the respective second probability value corresponding to the respective second text; and in response to determining that the first text is inconsistent with the particular second text having the largest summed probability value, outputting the particular second text having the largest summed probability value.

10. The method according to claim 9, wherein the training the machine translation model using the training samples to obtain the trained machine translation model comprises: acquiring a speech sample containing a plurality of languages; performing speech recognition on the speech sample to obtain a plurality of text candidates; forming a training sample from annotated texts corresponding to the plurality of text candidates and the speech sample; and training the machine translation model using the training sample to obtain the trained machine translation mode
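
The output-selection step recited in claims 1 and 9 can be sketched as below. This is a hedged illustration only: the function name `select_output`, the `Decision` type, the default weights of 0.5, and the reading of "consistent" as exact string equality are assumptions, since the claims do not fix the weights or define the consistency test. The probability values are passed in directly; the sketch does not model the machine translation model or the language model that would produce them.

```python
from typing import NamedTuple, Sequence, Tuple


class Decision(NamedTuple):
    text: str        # the text that is output
    corrected: bool  # True if a second text replaced the first text


def select_output(
    first_text: str,
    candidates: Sequence[Tuple[str, float, float]],
    weight_mt: float = 0.5,   # assumed weight for the first probability value
    weight_lm: float = 0.5,   # assumed weight for the second probability value
) -> Decision:
    """Pick the output text from (second_text, p_first, p_second) triples."""
    if not candidates:
        raise ValueError("at least one second text is required")

    def summed(candidate: Tuple[str, float, float]) -> float:
        # Weighted sum of the first and second probability values.
        _, p_first, p_second = candidate
        return weight_mt * p_first + weight_lm * p_second

    best_text = max(candidates, key=summed)[0]

    # "Consistent" is interpreted here as exact string equality (assumption).
    if best_text == first_text:
        return Decision(first_text, corrected=False)
    return Decision(best_text, corrected=True)
```

Under the string-equality reading, both branches return the same characters when the first text already matches the best candidate; the `corrected` flag is included so a caller can still tell which branch was taken.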
Speech to text systems (G10L15/08 takes precedence) · CPC title
using context dependencies, e.g. language models · CPC title
Language recognition · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Data-driven translation · CPC title