Speech recognition method and apparatus, and storage medium

US2019385599A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2019385599-A1
Application number: US-201916547097-A
Country: US
Kind code: A1
Filing date: Aug 21, 2019
Priority date: Jun 29, 2017
Publication date: Dec 19, 2019
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A speech recognition method is provided. The method includes: obtaining a voice signal; processing the voice signal according to a speech recognition algorithm to obtain n candidate recognition results, the candidate recognition results including text information corresponding to the voice signal; identifying a target result from among the n candidate recognition results according to a selection rule selected from among m selection rules, the selection rule having an execution sequence of j, the target result being a candidate recognition result that has a highest matching degree with the voice signal in the n candidate recognition results, an initial value of j being 1; and identifying the target result from among the n candidate recognition results according to a selection rule having an execution sequence of j+1 based on the target result not being identified according to the selection rule having the execution sequence of j.
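
The fall-through behavior described in the abstract (try the rule with execution sequence j, and move to j+1 only if no target result was identified) can be illustrated with a minimal sketch. This is not the patent's implementation: the function names, the `SelectionRule` signature, the toy command set, and both example rules are assumptions made purely for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical signature: a selection rule receives the n candidate
# recognition results and returns the target result, or None if it
# cannot identify one.
SelectionRule = Callable[[Sequence[str]], Optional[str]]


def select_target(candidates: Sequence[str],
                  rules: Sequence[SelectionRule]) -> Optional[str]:
    """Apply the rules in execution order (j = 1, 2, ..., m).

    The rule at position j+1 runs only when the rule at position j
    did not identify a target result.
    """
    for rule in rules:
        target = rule(candidates)
        if target is not None:
            return target
    return None


# Toy command lexicon (assumed content, for illustration only).
COMMANDS = {"pause", "next song"}


def command_rule(candidates: Sequence[str]) -> Optional[str]:
    """Cheap rule: exact match of a whole candidate against the lexicon."""
    for candidate in candidates:
        if candidate in COMMANDS:
            return candidate
    return None


def fallback_rule(candidates: Sequence[str]) -> Optional[str]:
    """Stand-in for a costlier rule; here it just takes the first candidate."""
    return candidates[0] if candidates else None


# Rules are listed from lowest to highest assumed complexity.
RULES = [command_rule, fallback_rule]
```

With these toy rules, `select_target(["paws", "pause"], RULES)` is resolved by the first rule, while `select_target(["play some jazz", "play sum jazz"], RULES)` falls through to the second. Ordering the list from cheapest to costliest rule mirrors the positive correlation between execution sequence and algorithm complexity recited in claim 2.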

First claim

Opening claim text (preview).

What is claimed is:

1. A speech recognition method, performed by at least one processor, comprising: obtaining, by the at least one processor, a voice signal; processing, by the at least one processor, the voice signal according to a speech recognition algorithm to obtain n candidate recognition results, the candidate recognition results comprising text information corresponding to the voice signal, and n being an integer greater than 1; identifying, by the at least one processor, a target result from among the n candidate recognition results according to a selection rule selected from among m selection rules, the selection rule having an execution sequence of j, the target result being a candidate recognition result that has a highest matching degree with the voice signal in the n candidate recognition results, m being an integer greater than 1, and an initial value of j being 1; and identifying, by the at least one processor, the target result from among the n candidate recognition results according to a selection rule having an execution sequence of j+1 based on the target result not being identified according to the selection rule having the execution sequence of j.

2. The speech recognition method according to claim 1, further comprising identifying, by the at least one processor, execution sequences of the m selection rules according to respective algorithm complexity degrees, wherein the execution sequences and the algorithm complexity degrees have a positive correlation.

3. The speech recognition method according to claim 1, wherein the m selection rules comprise at least two selected from among a command selection rule, a function selection rule, and a dialogue selection rule, wherein an algorithm complexity degree of the command selection rule is lower than an algorithm complexity degree of the function selection rule, and the algorithm complexity degree of the function selection rule is lower than an algorithm complexity degree of the dialogue selection rule, wherein the command selection rule is used for instructing a speech recognition device to detect, depending on whether a command lexicon comprises a command keyword matching an i-th candidate recognition result, whether the i-th candidate recognition result is the target result, i being an integer, and 1≤i≤n, wherein the function selection rule is used for instructing the speech recognition device to detect, depending on whether a voice lexicon comprises a lexicon keyword matching a voice keyword, whether the i-th candidate recognition result is the target result, the voice keyword being at least one keyword in the i-th candidate recognition result, and wherein the dialogue selection rule is used for instructing the speech recognition device to identify a similarity degree between each candidate recognition result and the voice signal according to a trained language model, to select the target result.

4. The speech recognition method according to claim 3, wherein the selection rule having the execution sequence of j comprises the command selection rule, and the identifying the target result comprises: detecting whether a first correspondence of the command lexicon comprises the command keyword matching the i-th candidate recognition result; and identifying, based on the first correspondence comprising the command keyword matching the i-th candidate recognition result, the i-th candidate recognition result as the target result, wherein the first correspondence comprises at least the command keyword.

5. The speech recognition method according to claim 4, wherein after the detecting whether the first correspondence of the command lexicon comprises the command keyword matching the i-th candidate recognition result, the method further comprises: detecting, by the at least one processor and based on the first correspondence not comprising a command keyword matching any of the n candidate recognition results, whether a second correspondence of the command lexicon comprises a keyword matching any word in the i-th candidate recognition result; searching, by the at least one processor and based on the second correspondence comprising a keyword matching a word in the i-th candidate recognition result, according to an index value corresponding to the keyword in the second correspondence, the first correspondence for a command keyword corresponding to the index value; identifying, by the at least one processor, an edit distance between the i-th candidate recognition result and the command keyword, the edit distance indicating a quantity of operations required for conversion of the i-th candidate recognition result into the command keyword; and identifying, by the at least one processor and based on the edit distance being less than a preset value, the i-th candidate recognition result as the target result, wherein the first correspondence comprises a correspondence between the index value and the command keyword, and the second correspondence comprises a correspondence between the index value and the keyword.

6. The speech recognition method according to claim 3, wherein the selection rule having the execution sequence of j comprises the function selection rule, and the identifying the target result in the n candidate recognition results according to the selection rule having the execution sequence of j comprises: analyzing, by the at least one processor, a function template of the i-th candidate recognition result; detecting, by the at least one processor, whether the voice lexicon comprises the lexicon keyword matching the voice keyword in the i-th candidate recognition result; and identifying, by the at least one processor and based on the voice lexicon comprising the lexicon keyword matching the voice keyword in the i-th candidate recognition result, the i-th candidate recognition result as the target result, wherein the i-th candidate recognition result comprises the function template and the voice keyword.

7. The speech recognition method according to claim 3, wherein the selection rule having the execution sequence of j comprises the dialogue selection rule, and the identifying the target result in the n candidate recognition results according to the selection rule having the execution sequence of j comprises: calculating, by the at least one processor, a perplexity of each candidate recognition result according to the language model; identifying, by the at least one processor, a smallest value of the perplexities in the n candidate recognition results and identifying the i-th candidate recognition result corresponding to the smallest value as the target result, wherein the perplexities are used for indicating the similarity degrees between the candidate recognition results and the voice signal, the perplexities and the similarity degrees have a negative correlation, the language model is an N-gram language model that is generated according to a dedicated corpus corresponding to at least one field, the N-gram language model is used for identifying an occurrence probability of a current word according to occurrence probabilities of N−1 words before the current word, and N is a positive integer.

8. A speech recognition apparatus, comprising: at least one memory configured to store computer program code; and at least one processor configured to access the at least one memory and operate as instructed by the computer program code, the computer program code including: signal obtaining code configured to cause the at least one processor to obtain a voice signal; speech recognition code configured to cause the at least one processor to process, using a speech recognition algorithm, the voice signal, to obtain n candidate recogniti
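
Two quantities named in the claims above lend themselves to a short worked example: the edit distance used in claim 5 and the N-gram perplexity used in claim 7. The sketch below is illustrative only and rests on several assumptions the claims do not state: the edit distance is computed per character (Levenshtein), N is fixed at 2 (a bigram model), add-one smoothing handles unseen word pairs, and the tiny training corpus and all function names are invented for the example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance: the number of insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# Hypothetical model container: (unigram counts, bigram counts, vocab size).
BigramModel = Tuple[Counter, Counter, int]


def train_bigram(corpus: Sequence[str]) -> BigramModel:
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in corpus:
        tokens: List[str] = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(unigrams)


def perplexity(sentence: str, model: BigramModel) -> float:
    """Perplexity of a sentence under the add-one-smoothed bigram model."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev_word, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))


def pick_lowest_perplexity(candidates: Sequence[str], model: BigramModel) -> str:
    """Return the candidate with the smallest perplexity."""
    return min(candidates, key=lambda c: perplexity(c, model))


# Assumed toy corpus standing in for a dedicated field-specific corpus.
MODEL = train_bigram([
    "play some jazz music",
    "play the next song",
    "play some rock music",
])
```

In this toy setting, `pick_lowest_perplexity(["play some jazz", "play sum jazz"], MODEL)` prefers the candidate whose word pairs were seen in the corpus, reflecting the negative correlation between perplexity and similarity degree recited in claim 7. For claim 5, a candidate would be accepted when `edit_distance(candidate, command_keyword)` is below a preset value; that threshold is left open here because the claims do not fix it.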

Assignees

Inventors

Classifications

  • Execution procedure of a spoken command · CPC title

  • using artificial neural networks · CPC title

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Word spotting · CPC title

  • G10L15/197 · Primary

    Probabilistic grammars, e.g. word n-grams · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019385599A1 cover?
A speech recognition method is provided. The method includes: obtaining a voice signal; processing the voice signal according to a speech recognition algorithm to obtain n candidate recognition results, the candidate recognition results including text information corresponding to the voice signal; identifying a target result from among the n candidate recognition results according to a selectio…
Who is the assignee on this patent?
Tencent Tech Shenzhen Co Ltd
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 19, 2019 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).