Automatic speech recognition with detection of at least one contextual element, and application management and maintenance of aircraft

US10403274B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10403274-B2
Application number	US-201615264722-A
Country	US
Kind code	B2
Filing date	Sep 14, 2016
Priority date	Sep 15, 2015
Publication date	Sep 3, 2019
Grant date	Sep 3, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An automatic speech recognition with detection of at least one contextual element, and application to aircraft flying and maintenance are provided. The automatic speech recognition device comprises a unit for acquiring an audio signal, a device for detecting the state of at least one contextual element, and a language decoder for determining an oral instruction corresponding to the audio signal. The language decoder comprises at least one acoustic model defining an acoustic probability law and at least two syntax models each defining a syntax probability law. The language decoder also comprises an oral instruction construction algorithm implementing the acoustic model and a plurality of active syntax models taken from among the syntax models, a contextualization processor to select, based on the state of the or each contextual element detected by the detection device, at least one syntax model selected from among the plurality of active syntax models, and a processor for determining the oral instruction corresponding to the audio signal.
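The decoding flow in the abstract can be illustrated with a toy sketch. This is not the patented implementation: it assumes one phoneme per frame, bigram syntax models, and a Viterbi-style search as the construction algorithm. The names `best_sequence`, `decode`, `ACOUSTIC`, `SYNTAX_MODELS`, and `MODEL_STATES`, and all probabilities, are hypothetical and chosen only for illustration.

```python
import math

def best_sequence(acoustic, syntax, start="<s>"):
    """Return (log_score, phonemes) maximizing the product of acoustic and
    syntax probabilities. Assumption: one phoneme per frame, bigram syntax."""
    # best[p] = (log score, path) of the best partial sequence ending in p
    best = {start: (0.0, [])}
    for frame in acoustic:
        step = {}
        for phoneme, p_acoustic in frame.items():
            for prev, (score, path) in best.items():
                p_syntax = syntax.get((prev, phoneme), 0.0)
                if p_acoustic <= 0.0 or p_syntax <= 0.0:
                    continue  # impossible transition under this model
                cand = score + math.log(p_acoustic) + math.log(p_syntax)
                if phoneme not in step or cand > step[phoneme][0]:
                    step[phoneme] = (cand, path + [phoneme])
        best = step
    if not best:
        return -math.inf, []
    return max(best.values(), key=lambda item: item[0])

def decode(acoustic, syntax_models, model_states, detected_state):
    """Build one candidate per active syntax model, keep the models tied to
    the detected contextual state, and return the best remaining candidate."""
    candidates = {name: best_sequence(acoustic, syn)
                  for name, syn in syntax_models.items()}
    selected = [n for n in candidates if model_states[n] == detected_state]
    if not selected:
        return None
    winner = max(selected, key=lambda n: candidates[n][0])
    return winner, candidates[winner][1]

# Hypothetical two-frame utterance: P(frame | phoneme) for each frame.
ACOUSTIC = [{"z": 0.6, "s": 0.4}, {"u": 0.5, "e": 0.5}]

# Hypothetical bigram syntax models: P(phoneme | previous phoneme).
SYNTAX_MODELS = {
    "map": {("<s>", "z"): 0.9, ("<s>", "s"): 0.1, ("z", "u"): 0.9,
            ("z", "e"): 0.1, ("s", "u"): 0.5, ("s", "e"): 0.5},
    "radio": {("<s>", "s"): 0.9, ("<s>", "z"): 0.1, ("s", "e"): 0.9,
              ("s", "u"): 0.1, ("z", "u"): 0.5, ("z", "e"): 0.5},
}

# Hypothetical association between syntax models and contextual states.
MODEL_STATES = {"map": "gaze_on_map", "radio": "gaze_on_radio"}

print(decode(ACOUSTIC, SYNTAX_MODELS, MODEL_STATES, "gaze_on_map"))    # ('map', ['z', 'u'])
print(decode(ACOUSTIC, SYNTAX_MODELS, MODEL_STATES, "gaze_on_radio"))  # ('radio', ['s', 'e'])
```

In this toy data the "map" candidate has the higher overall score (0.243 against 0.162), yet the "radio" candidate is returned when the detected state is `gaze_on_radio`, which is the effect the contextualization step is meant to have.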

First claim

Opening claim text (preview).

What is claimed is:

1. An automatic speech recognition device comprising: an acquisition unit for acquiring an audio signal, a forming member for forming the audio signal, to divide the audio signal into frames, a detection device, and a language decoder for determining an oral instruction corresponding to the audio signal, the detection device being a gaze detector configured to detect which of a plurality states is represented by a direction of a user's gaze and/or a pointing detector configured to detect which of a plurality states is represented by a position of a pointing member, the language decoder comprising: at least one acoustic model defining an acoustic probability law for calculating, for each phoneme of a sequence of phonemes, an acoustic probability of that phoneme and a corresponding frame of the audio signal matching; at least two different syntax models, each of the syntax models being associated with a respective one of the states of the direction of the user's gaze detected by the gazed detector and/or one of the states of the position of the pointing member detected by the pointing detector or a respective combination of the states, each of the syntax models being definable as active or inactive, each of the active syntax models defining a different respective syntax probability law for calculating, for each phoneme of a sequence of phonemes analyzed using said acoustic model, a different respective syntax probability of that phoneme following the phoneme or group of phonemes preceding said phoneme in the sequence of phonemes; an oral instruction construction algorithm implementing the acoustic model and a plurality of the active syntax models from among the syntax models to build, for each active syntax model, a candidate sequence of phonemes associated with said active syntax model so that the product of the acoustic and the respective different syntax probabilities of the different phonemes making up said candidate sequence of phonemes is maximal; a contextualization processor to select at least one syntax model selected from among the plurality of active syntax models based on the state of the direction of the user's gaze detected by the gazed detector and/or the state of the position of the pointing member detected by the pointing detector; and a determination processor for determining the oral instruction corresponding to the audio signal, to define the candidate sequence of phonemes associated with the selected syntax model or, if several syntax models are selected, the sequence of phonemes, from among the candidate sequences of phonemes associated with the selected acoustic models, for which the product of the acoustic and syntax probabilities of different phonemes making up said sequence of phonemes is maximal, as constituting the oral instruction corresponding to the audio signal.

2. The automatic speech recognition device according to claim 1, wherein the contextualization processor is configured for: assigning, based on the state of the direction of the user's gaze detected by the gazed detector and/or the state of the position of the pointing member detected by the pointing detector, an order number to each active syntax model, seeking, among the active syntax models, candidate syntax models with which candidate sequences of phonemes are associated for which the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequences of phonemes is above a predetermined threshold, and selecting the candidate syntax model(s) having the highest order number.

3. The automatic speech recognition device according to claim 1, wherein the pointing member is a cursor.

4. The automatic speech recognition device as recited in claim 1 wherein the contextualization processor is configured for: assigning, based on the state of the direction of the user's gaze detected by the gazed detector and/or the state of the position of the pointing member detected by the pointing detector, an order number to each active syntax model, seeking, among the active syntax models, candidate syntax models with which candidate sequences of phonemes are associated for which the product of the acoustic and syntax probabilities of the different phonemes making up said candidate sequences of phonemes is above a predetermined threshold, and selecting the candidate syntax model(s) having the highest order number, the automatic speech recognition device further comprising a display device displaying objects, each syntax model being associated with a respective object from among the displayed objects, the contextualization processor being configured for assigning an order number thereof to each syntax model based on the distance between the direction of the user's gaze or the position of the pointer and the displayed object with which said syntax model is associated.

5. An assistance system to assist with the piloting or maintenance of an aircraft, comprising: the automatic speech recognition device according to claim 1; and a command execution unit configured to execute the oral instruction corresponding to the audio signal.

6. An automatic speech recognition method comprising: determining an oral instruction corresponding to an audio signal, the determining of the oral instruction being implemented by an automatic speech recognition device comprising: at least one acoustic model defining an acoustic probability law for calculating, for each phoneme of a sequence of phonemes, an acoustic probability of that phoneme and a corresponding frame of the audio signal matching, at least two different syntax models, each of the syntax models being associated with a respective state of a direction of a user's gaze and/or of a position of a pointing member or a respective combination of the states, each of the syntax models being definable as active or inactive, each of the active syntax models defining a different respective syntax probability law for calculating, for each phoneme of a sequence of phonemes analyzed using said acoustic model, a different respective syntax probability of that phoneme following the phoneme or group of phonemes preceding said phoneme in the sequence of phonemes, wherein the determining of the oral instruction comprises: acquiring the audio signal, detecting a detected state represented by a direction of a user's gaze and/or by a position of a pointing member, activating a plurality of syntax models forming active syntax models, forming the audio signal, said forming comprising dividing the audio signal into frames, building, for each active syntax model, using the acoustic model and said active syntax model, a candidate sequence of phonemes associated with said active syntax model so that the product of the acoustic and the respective different syntax probabilities of the different phonemes making up said candidate sequence of phonemes is maximal, selecting at least one syntax model from among the active syntax models based on the detected state of the direction of the user's gaze and/or the detected state of the position of the pointing member, and defining the candidate sequence of phonemes associated with the selected syntax model or, if several syntax models are selected, the sequence of phonemes, from among the candidate sequences of phonemes associated with the selected syntax models, for which the product of the acoustic and syntax probabilities of different phonemes making up said sequence of phonemes is maximal, as constituting the oral instruction corresponding to the audio signal.

7. The automatic speech recognition method according to claim 6, wherein the selection step comprises the following sub-steps: assigning, based on the detected state of the direction of the user's gaze and/or
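The selection rule of claims 2 and 4 (order numbers, a score threshold, then the highest order number) can be sketched as follows. This is a hedged illustration only. The claims say order numbers depend on the distance between the gaze or pointer and each displayed object, but not in which direction, so the sketch assumes that a closer object gets a higher order number. The function name `rank_and_select`, the coordinates, the scores, and the threshold are all hypothetical.

```python
import math

def rank_and_select(candidates, object_positions, gaze_xy, threshold):
    """candidates: model -> (score, phonemes), where score is the product of
    acoustic and syntax probabilities of the model's candidate sequence.
    object_positions: model -> (x, y) of the displayed object tied to it."""
    # Assumption: the closer the object is to the gaze point, the higher
    # the order number. Equal distances share the same order number.
    distances = {m: math.dist(gaze_xy, object_positions[m]) for m in candidates}
    levels = sorted(set(distances.values()), reverse=True)
    order = {m: levels.index(d) + 1 for m, d in distances.items()}

    # Keep only models whose candidate score is above the threshold.
    eligible = [m for m in candidates if candidates[m][0] > threshold]
    if not eligible:
        return None

    # Select the eligible model(s) with the highest order number; if several
    # are tied, fall back to the highest score, as in claim 1.
    top = max(order[m] for m in eligible)
    selected = [m for m in eligible if order[m] == top]
    winner = max(selected, key=lambda m: candidates[m][0])
    return winner, candidates[winner][1]

# Hypothetical candidates and on-screen object positions.
CANDIDATES = {
    "map": (0.243, ["z", "u"]),
    "radio": (0.162, ["s", "e"]),
    "fuel": (0.001, ["f"]),
}
POSITIONS = {"map": (0.0, 0.0), "radio": (10.0, 0.0), "fuel": (5.0, 0.0)}

print(rank_and_select(CANDIDATES, POSITIONS, (9.0, 1.0), 0.01))  # ('radio', ['s', 'e'])
print(rank_and_select(CANDIDATES, POSITIONS, (4.0, 0.0), 0.01))  # ('map', ['z', 'u'])
```

In the second call the "fuel" object is nearest to the gaze point, but its candidate falls below the threshold, so the next-nearest eligible model is chosen instead.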

Assignees

Dassault Aviat

Classifications

B64U2201/20
Remote controls · CPC title
G10L15/187 · Primary
Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams · CPC title
G10L2015/228
of application context · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L15/1815
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title

Patent family

Related publications grouped by family.

View patent family 55451233

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10403274B2 cover? An automatic speech recognition with detection of at least one contextual element, and application to aircraft flying and maintenance are provided. The automatic speech recognition device comprises a unit for acquiring an audio signal, a device for detecting the state of at least one contextual element, and a language decoder for determining an oral instruction corresponding to the audio signal…
Who is the assignee on this patent? Dassault Aviat
What technology area does this patent fall under? Primary CPC classification G10L15/187. Mapped technology areas include Physics.
When was this patent published? Publication date Sep 3, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).