Who is the assignee on this patent?

Baidu Online Network Technology Beijing Co Ltd, Baidu Online Network Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G10L17/14. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 19, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method, and device for matching speech with text, and computer-readable storage medium

Patent metadata
Field	Value
Publication number	US-11152007-B2
Application number	US-201916543155-A
Country	US
Kind code	B2
Filing date	Aug 16, 2019
Priority date	Dec 7, 2018
Publication date	Oct 19, 2021
Grant date	Oct 19, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of a method and device for matching a speech with a text, and a computer-readable storage medium are provided. The method can include: acquiring a speech identification text by identifying a received speech signal; comparing the speech identification text with multiple candidate texts in a first matching mode to determine a first matching text; and comparing phonetic symbols of the speech identification text with phonetic symbols of the multiple candidate texts in a second matching mode to determine a second matching text, in a case that no first matching text is determined.
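The fallback structure described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not say what the first matching mode is, so `exact_match` (case-insensitive string equality) is an assumption, and `TOY_PHONETICS`, `to_phonetic`, `phonetic_match`, and `match_with_fallback` are hypothetical names with a made-up phonetic table standing in for a real text-to-phonetic converter.

```python
from typing import Callable, Optional, Sequence

# A matcher takes the recognized text and the candidates and returns a match or None.
Matcher = Callable[[str, Sequence[str]], Optional[str]]

# Hypothetical toy phonetic table; a real system would use a proper converter.
TOY_PHONETICS = {"right": "rait", "write": "rait", "turn": "tern", "left": "left"}


def to_phonetic(text: str) -> str:
    """Map each word to its toy phonetic form (unknown words pass through)."""
    return " ".join(TOY_PHONETICS.get(word, word) for word in text.lower().split())


def exact_match(recognized: str, candidates: Sequence[str]) -> Optional[str]:
    """Assumed first matching mode: case-insensitive string equality."""
    normalized = recognized.strip().lower()
    for candidate in candidates:
        if candidate.strip().lower() == normalized:
            return candidate
    return None


def phonetic_match(recognized: str, candidates: Sequence[str]) -> Optional[str]:
    """Simplified second matching mode: equality of phonetic forms."""
    target = to_phonetic(recognized)
    for candidate in candidates:
        if to_phonetic(candidate) == target:
            return candidate
    return None


def match_with_fallback(
    recognized: str,
    candidates: Sequence[str],
    first_mode: Matcher,
    second_mode: Matcher,
) -> Optional[str]:
    """Try the first mode; use the second mode only if the first finds nothing."""
    first = first_mode(recognized, candidates)
    if first is not None:
        return first
    return second_mode(recognized, candidates)


candidates = ["turn left", "turn right"]
# The recognizer heard a homophone; text matching fails, phonetic matching succeeds.
result = match_with_fallback("turn write", candidates, exact_match, phonetic_match)
```

Passing the two modes as callables keeps the fallback order explicit and lets either mode be swapped without touching the control flow.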

First claim

Opening claim text (preview).

What is claimed is:

1. A method for matching a speech with a text, comprising: acquiring a speech identification text by identifying a received speech signal; comparing the speech identification text with multiple candidate texts in a first matching mode to determine a first matching text; and comparing phonetic symbols of the speech identification text with phonetic symbols of the multiple candidate texts in a second matching mode to determine a second matching text, in response to not determining the first matching text, wherein comparing phonetic symbols of the speech identification text with phonetic symbols of the multiple candidate texts in the second matching mode to determine the second matching text comprises: converting the speech identification text into the phonetic symbols of the speech identification text and converting the multiple candidate texts into the phonetic symbols of the multiple candidate texts; calculating a similarity between the phonetic symbols of the speech identification text and the phonetic symbols of each of the multiple candidate texts; and determining a candidate text with a largest similarity as a matched candidate text in response to determining that the largest similarity is larger than a set threshold; and outputting the matched candidate text, wherein calculating the similarity between the phonetic symbols of the speech identification text and the phonetic symbols of each of the multiple candidate texts is by the following formula:

$$\text{similarity} = \frac{\mathrm{LCS}(s, q)}{\mathrm{len}(s)}$$

wherein s represents phonetic symbols of one of the multiple candidate texts, q represents the phonetic symbols of the speech identification text, LCS(s, q) represents a length of a longest common sequence between the phonetic symbols of the one of the multiple candidate texts and the phonetic symbols of the speech identification text, len(s) represents a length of the phonetic symbols of the one of the multiple candidate texts.

2. The method according to claim 1, further comprising: outputting the first matching text as a matched candidate text, in response to determining the first matching text; and outputting the second matching text as the matched candidate text, in response to determining the second matching text.

3. The method according to claim 1, further comprising: calculating a similarity between a sentence vector of the speech identification text and a sentence vector of each of the multiple candidate texts, in response to not determining the second matching text; and outputting a candidate text with a largest similarity as a matched candidate text.

4. The method according to claim 3, wherein the calculating a similarity between a sentence vector of the speech identification text and a sentence vector of each of the multiple candidate texts comprises: segmenting the speech identification text and the multiple candidate texts into words; acquiring a word vector of each word; adding word vectors of words of the speech identification text to obtain the sentence vector of the speech identification text, and adding word vectors of words of one of the multiple candidate texts to acquire a sentence vector of the one of the multiple candidate texts; and calculating a cosine similarity between the sentence vector of the speech identification text and the sentence vector of the one of the multiple candidate texts, as the similarity between the sentence vector of the speech identification text and the sentence vector of the one of the multiple candidate texts.

5. A device for matching a speech with a text, comprising: one or more processors; and a storage device configured to store one or more programs, that, when executed by the one or more processors, cause the one or more processors to: acquire a speech identification text by identifying a received speech signal; compare the speech identification text with multiple candidate texts in a first matching mode to determine a first matching text; and compare phonetic symbols of the speech identification text with phonetic symbols of the multiple candidate texts in a second matching mode to determine a second matching text, in response to not determining the first matching text, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors further to: convert the speech identification text into the phonetic symbols of the speech identification text and convert the multiple candidate texts into the phonetic symbols of the multiple candidate texts; calculate a similarity between the phonetic symbols of the speech identification text and the phonetic symbols of each of the multiple candidate texts; determine a candidate text with a largest similarity as a matched candidate text in response to determining that the largest similarity is larger than a set threshold; and output the matched candidate text, wherein the similarity between the phonetic symbols of the speech identification text and the phonetic symbols of each of the multiple candidate texts is calculated by the following formula:

$$\text{similarity} = \frac{\mathrm{LCS}(s, q)}{\mathrm{len}(s)}$$

wherein s represents phonetic symbols of one of the multiple candidate texts, q represents the phonetic symbols of the speech identification text, LCS(s, q) represents a length of a longest common sequence between the phonetic symbols of one of the multiple candidate texts and the phonetic symbols of the speech identification text, len(s) represents a length of the phonetic symbols of the one of the multiple candidate texts.

6. The device according to claim 5, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors further to: output the first matching text as a matched candidate text, in response to determining the first matching text; and output the second matching text as the matched candidate text, in response to determining the second matching text.

7. The device according to claim 7 is not reproduced here; the text reads: The device according to claim 5, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors further to: calculate a similarity between a sentence vector of the speech identification text and a sentence vector of each of the multiple candidate texts, in response to not determining the second matching text; and output a candidate text with a largest similarity as a matched candidate text.

8. The device according to claim 7, where
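The two similarity measures named in the claims can be illustrated with a short sketch. It rests on assumptions: "longest common sequence" is read here as the classic longest common subsequence, phonetic symbols are modelled as plain sequences of characters, and all function names, the threshold values, and the tiny word-vector table are hypothetical. It is not the patented implementation.

```python
import math
from typing import Dict, List, Optional, Sequence


def lcs_length(s: Sequence[str], q: Sequence[str]) -> int:
    """Length of the longest common subsequence of s and q (row-by-row DP)."""
    prev = [0] * (len(q) + 1)
    for a in s:
        curr = [0]
        for j, b in enumerate(q, start=1):
            curr.append(prev[j - 1] + 1 if a == b else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def phonetic_similarity(s: Sequence[str], q: Sequence[str]) -> float:
    """similarity = LCS(s, q) / len(s), with s the candidate and q the query."""
    return lcs_length(s, q) / len(s) if len(s) else 0.0


def best_phonetic_match(
    query: Sequence[str], candidates: Dict[str, Sequence[str]], threshold: float
) -> Optional[str]:
    """Return the candidate with the largest similarity if it exceeds threshold."""
    best_text, best_score = None, 0.0
    for text, phonetics in candidates.items():
        score = phonetic_similarity(phonetics, query)
        if score > best_score:
            best_text, best_score = text, score
    return best_text if best_score > threshold else None


def sentence_vector(
    words: Sequence[str], word_vectors: Dict[str, List[float]], dim: int
) -> List[float]:
    """Sum the word vectors of a segmented sentence (unknown words add zeros)."""
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(word_vectors.get(word, [0.0] * dim)):
            total[i] += value
    return total


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical phonetic strings: the query differs from one candidate by one symbol.
candidate_phonetics = {"hello": "nihao", "bye": "zaijian"}
matched = best_phonetic_match("nihau", candidate_phonetics, threshold=0.7)
```

Because the formula divides by `len(s)`, the score is normalized by the candidate's length rather than the query's, so extra symbols in the recognized speech do not lower a candidate's score on their own.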

Assignees

Inventors

Lu Yongshuai

Classifications

G10L17/14 (Primary)
Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
G10L15/10 (Primary)
using distance or distortion measures between unknown speech and reference templates
G06F18/22
Matching criteria, e.g. proximity measures
G10L15/187
Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G06F40/20
Natural language analysis (semantic analysis of natural language G06F40/30)

Patent family

Related publications grouped by family.

View patent family 66113079
