Speech recognition system adaptation based on non-acoustic attributes

US2016140964A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016140964-A1
Application number  US-201514790142-A
Country             US
Kind code           A1
Filing date         Jul 2, 2015
Priority date       Nov 13, 2014
Publication date    May 19, 2016
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes the following steps. A vicinity from which speech input to a speech recognition system originates is determined. Non-acoustic data from the vicinity of the speech is obtained using one or more non-acoustic sensors. A subject speaker is identified as the source of the speech input from the obtained non-acoustic data. One or more non-acoustic attributes of the subject speaker is analyzed. A speech recognition system is adjusted based on the one or more analyzed non-acoustic attributes.
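
The five steps in the abstract can be read as a small pipeline. The following is a minimal sketch under stated assumptions, not the patent's implementation: the names `Observation`, `Recognizer`, `determine_vicinity`, `identify_speaker`, `adjust_recognizer`, and `adapt` are all hypothetical, the vicinity is modeled as a simple bearing range around the sound direction, and the attribute labels and model names are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Observation:
    """Non-acoustic data about one person near the device (hypothetical structure)."""
    person_id: str
    bearing_deg: float          # direction of the person as seen by a non-acoustic sensor
    attributes: Dict[str, str]  # analyzed attributes; labels here are placeholders


@dataclass
class Recognizer:
    """Stand-in for a speech recognition system with swappable models."""
    acoustic_model: str = "generic"
    language_model: str = "generic"


def determine_vicinity(sound_bearing_deg: float,
                       tolerance_deg: float = 20.0) -> Tuple[float, float]:
    """Step 1: model the vicinity as a bearing range (no wrap-around handling)."""
    return (sound_bearing_deg - tolerance_deg, sound_bearing_deg + tolerance_deg)


def identify_speaker(observations: List[Observation],
                     vicinity: Tuple[float, float]) -> Optional[Observation]:
    """Steps 2-3: among people observed in the vicinity, pick the one closest to its center."""
    low, high = vicinity
    in_range = [o for o in observations if low <= o.bearing_deg <= high]
    if not in_range:
        return None
    center = (low + high) / 2.0
    return min(in_range, key=lambda o: abs(o.bearing_deg - center))


def adjust_recognizer(recognizer: Recognizer,
                      attributes: Dict[str, str],
                      model_table: Dict[str, Tuple[str, str]]) -> Recognizer:
    """Steps 4-5: look up models for the analyzed attribute and apply them."""
    key = attributes.get("age_group", "")
    acoustic, language = model_table.get(key, ("generic", "generic"))
    recognizer.acoustic_model = acoustic
    recognizer.language_model = language
    return recognizer


def adapt(recognizer: Recognizer,
          sound_bearing_deg: float,
          observations: List[Observation],
          model_table: Dict[str, Tuple[str, str]]) -> Recognizer:
    """Run all five steps; leave the recognizer unchanged if no speaker is found."""
    vicinity = determine_vicinity(sound_bearing_deg)
    speaker = identify_speaker(observations, vicinity)
    if speaker is None:
        return recognizer
    return adjust_recognizer(recognizer, speaker.attributes, model_table)
```

Keying the lookup on a single attribute keeps the sketch short; a fuller system would presumably combine several attributes before choosing models.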

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising the steps of: determining a vicinity from which speech input to a speech recognition system originates; obtaining non-acoustic data from the vicinity of the speech using one or more non-acoustic sensors; identifying a subject speaker as the source of the speech input from the obtained non-acoustic data; analyzing one or more non-acoustic attributes of the subject speaker; and adjusting a speech recognition system based on the one or more analyzed non-acoustic attributes; wherein the steps are performed by at least one processor device coupled to a memory.

2. The method of claim 1, wherein determining the vicinity comprises locating a sound direction of the speech input.

3. The method of claim 1, wherein determining the vicinity comprises locating one or more head regions using at least one of the one or more non-acoustic sensors.

4. The method of claim 1, wherein obtaining non-acoustic data comprises capturing visual data of the vicinity of the speech input.

5. The method of claim 4, wherein identifying the subject speaker comprises segmenting one or more faces from the captured visual data.

6. The method of claim 5, wherein identifying the subject speaker further comprises: detecting mouth motion on the one or more faces; and selecting a face corresponding to the subject speaker based on the detected mouth motion.

7. The method of claim 5, wherein analyzing the one or more non-acoustic attributes comprises extracting one or more facial features of the subject speaker.

8. The method of claim 4, wherein analyzing the one or more non-acoustic attributes comprises extracting at least one of a hair color and an age.

9. The method of claim 5, wherein analyzing the one or more non-acoustic attributes comprises mapping the subject speaker to a cluster.

10. The method of claim 9, wherein the cluster comprises at least one of a gender cluster and an ethnicity cluster.

11. The method of claim 1, wherein adjusting the speech recognition system comprises selecting an acoustic model for use by the speech recognition system.

12. The method of claim 1, wherein adjusting the speech recognition system comprises selecting a language model for use by the speech recognition system.
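
Claims 5 and 6 describe choosing the speaker's face by detected mouth motion. The sketch below illustrates one possible way to do that, assuming face segmentation has already produced a per-face time series of mouth openness values (for example, a normalized lip gap per video frame). The function names, the openness measure, and the `min_score` threshold are illustrative assumptions and do not come from the patent text.

```python
from typing import Dict, List, Optional


def mouth_motion_score(openness: List[float]) -> float:
    """Mean absolute frame-to-frame change in mouth openness (0.0 if too few frames)."""
    if len(openness) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(openness, openness[1:])]
    return sum(diffs) / len(diffs)


def select_speaking_face(tracks: Dict[str, List[float]],
                         min_score: float = 0.05) -> Optional[str]:
    """Return the face id with the most mouth motion, or None if no face exceeds min_score."""
    best_id: Optional[str] = None
    best_score = min_score
    for face_id, openness in tracks.items():
        score = mouth_motion_score(openness)
        if score > best_score:
            best_id, best_score = face_id, score
    return best_id


if __name__ == "__main__":
    # Hypothetical measurements for two segmented faces over four frames.
    tracks = {
        "face_a": [0.10, 0.10, 0.10, 0.10],  # mouth stays still
        "face_b": [0.10, 0.50, 0.10, 0.60],  # mouth opens and closes
    }
    print(select_speaking_face(tracks))
```

The threshold guards against picking a face when nobody in view is moving their mouth, in which case the caller could fall back to another cue such as the sound direction from claim 2.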

Assignees

Inventors

Classifications

  • G10L15/07 · Primary

    to the speaker · CPC title

  • Speech recognition using non-acoustical features · CPC title

  • G10L15/25 · Primary

    using position of the lips, movement of the lips or face analysis · CPC title

  • of the speaker; Human-factor methodology · CPC title

  • based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016140964A1 cover?
A method includes the following steps. A vicinity from which speech input to a speech recognition system originates is determined. Non-acoustic data from the vicinity of the speech is obtained using one or more non-acoustic sensors. A subject speaker is identified as the source of the speech input from the obtained non-acoustic data. One or more non-acoustic attributes of the subject speaker is…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L15/07. Mapped technology areas include Physics.
When was this patent published?
Publication date May 19, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).