US9508019B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9508019-B2 |
| Application number | US-201414190539-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 26, 2014 |
| Priority date | Mar 1, 2013 |
| Publication date | Nov 29, 2016 |
| Grant date | Nov 29, 2016 |
Official abstract text for this publication.
An object recognition system is applicable to practical use, and utilizes image information besides speech information to improve recognition accuracy. The object recognition system comprises a speech recognition unit to determine candidates for a result of speech recognition on input speech and their likelihoods, and an image model generation unit to generate image models of a predetermined number of the candidates having the highest likelihoods. The system further comprises an image likelihood calculation unit to calculate image likelihoods of input images based on the image models, and an object recognition unit to perform object recognition using the image likelihoods. At the time of generating the image model of the candidate, the image model generation unit first searches an image model database, and, when the image model of the candidate is not found in the database, the image model generation unit generates said image model from image information on the web.
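The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `prototype_model`, `get_image_model`, `recognize`, and `fake_web_builder` are hypothetical, the toy distance-based image model stands in for whatever model the system actually uses, caching a web-built model back into the database is an assumption, and the final decision rule (pick the shortlisted candidate with the highest image likelihood) is only one possible reading of "object recognition using the image likelihoods".

```python
import math
from typing import Callable, Dict, List, Tuple

# An image model here is any function mapping image features to a likelihood.
ImageModel = Callable[[List[float]], float]


def prototype_model(prototype: List[float]) -> ImageModel:
    """Hypothetical toy model: likelihood decays with distance to a prototype."""
    def likelihood(features: List[float]) -> float:
        d2 = sum((f - p) ** 2 for f, p in zip(features, prototype))
        return math.exp(-d2)
    return likelihood


def get_image_model(name: str,
                    database: Dict[str, ImageModel],
                    build_from_web: Callable[[str], ImageModel]) -> ImageModel:
    """Search the database first; build from the web only when nothing is found."""
    model = database.get(name)
    if model is None:
        model = build_from_web(name)
        database[name] = model  # assumption: keep the new model for later lookups
    return model


def recognize(speech_candidates: List[Tuple[str, float]],
              image_features: List[float],
              database: Dict[str, ImageModel],
              build_from_web: Callable[[str], ImageModel],
              top_n: int = 3) -> str:
    # Keep only the top_n candidates with the highest speech likelihoods.
    shortlist = sorted(speech_candidates, key=lambda c: c[1], reverse=True)[:top_n]
    # Score the input image against the image model of each shortlisted candidate.
    scores = {name: get_image_model(name, database, build_from_web)(image_features)
              for name, _speech_likelihood in shortlist}
    # Assumed decision rule: the candidate with the highest image likelihood wins.
    return max(scores, key=scores.get)


# Usage with made-up data: "apple" is in the database, "maple" is not.
database = {"apple": prototype_model([1.0, 0.0])}
web_requests: List[str] = []


def fake_web_builder(name: str) -> ImageModel:
    web_requests.append(name)  # record which names needed the web fallback
    return prototype_model([0.0, 1.0])


candidates = [("maple", 0.5), ("apple", 0.4), ("chapel", 0.1)]
result = recognize(candidates, [0.9, 0.1], database, fake_web_builder, top_n=2)
print(result)
```

In this toy run the speech recognizer slightly prefers "maple", but the image evidence favors "apple", which is the kind of correction the abstract attributes to using image information alongside speech.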
Opening claim text (preview).
The invention claimed is:
1. An object recognition system comprising a processor and one or more memories, the processor configured to: determine candidates as a result of speech recognition on input speech and their speech likelihoods; get image models of a predetermined number of the candidates having the highest speech likelihoods; calculate image likelihoods of the image model that each image model corresponds to an input image; and perform object recognition using the image likelihoods, wherein, in the step of getting image models, the processor searches an image model database for the image model, and then, when the image model of the candidate is not found in the database, the processor gets said image model from image information on the web.
2. The object recognition system according to claim 1, wherein the processor performs the object recognition based on the speech likelihoods and the image likelihoods.
3. The object recognition system according to claim 2, wherein, at the time of getting the image models of the candidates from image information on the web, the processor performs clustering of feature amounts of images collected from the web, and gets an image model for each of clusters.
4. The object recognition system according to claim 1, wherein, at the time of getting the image models of the candidates from image information on the web, the processor performs clustering of feature amounts of images collected from the web, and gets an image model for each of clusters.
5. An object recognition method comprising steps of: determining candidates as a result of speech recognition on input speech and their likelihoods; getting image models of a predetermined number of the candidates having the highest likelihoods; calculating image likelihoods of the image models that each image model corresponds to an input image; and performing object recognition using the image likelihoods, wherein, in the step of getting image models, an image model database is searched for the image model, and then, when the image model of the candidate is not found in the database, said image model is gotten from image information on the web.
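Claims 3 and 4 recite clustering the feature amounts of web images and getting one image model per cluster. The sketch below illustrates that idea under loudly stated assumptions: the claims do not name a clustering algorithm, so plain k-means is swapped in here, each "image model" is reduced to a cluster centroid, and the helper names (`kmeans`, `build_cluster_models`, `image_likelihood`) and the exponential scoring function are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Vector]) -> List[float]:
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[List[float]]:
    """Plain k-means, initialised deterministically from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


def build_cluster_models(web_features: List[Vector], k: int) -> List[List[float]]:
    """One (toy) image model per cluster: here simply the cluster centroid."""
    return kmeans(web_features, k)


def image_likelihood(features: Vector, cluster_models: List[List[float]]) -> float:
    """Assumed scoring: take the best match over all cluster models."""
    return max(math.exp(-squared_distance(features, c)) for c in cluster_models)


# Made-up feature amounts for web images of one word with two visual variants.
web_features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
models = build_cluster_models(web_features, k=2)
print(models)
```

Keeping one model per cluster lets a single word cover visually distinct variants (for example, differently colored instances of the same object) without averaging them into a prototype that matches none of them.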
using a plurality of salient features, e.g. bag-of-words [BoW] representations · CPC title
Speech recognition (G10L17/00 takes precedence) · CPC title
for retrieval · CPC title
Physics · mapped topic