Automated semantic inference of visual features and scenes

US10719744B2 · US · B2

Patent metadata
Publication number: US-10719744-B2
Application number: US-201816141812-A
Country: US
Kind code: B2
Filing date: Sep 25, 2018
Priority date: Dec 28, 2017
Publication date: Jul 21, 2020
Grant date: Jul 21, 2020

Abstract

Official abstract text for this publication.

In one embodiment, an apparatus comprises a memory and a processor. The memory stores visual data captured by one or more sensors. The processor detects one or more first objects in the visual data based on a machine learning model and one or more first reference templates. The processor further determines, based on an object ontology, that the visual data is expected to contain a second object, wherein the object ontology indicates that the second object is related to the one or more first objects. The processor further detects the second object in the visual data based on the machine learning model and a second reference template. The processor further determines, based on an inference rule, that the visual data is expected to contain a third object. The processor further detects the third object in the visual data based on the machine learning model and a third reference template.
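
The three-stage flow in the abstract (detect first objects, use the ontology to expect and confirm a second object, use an inference rule to expect and confirm a third) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Detector` callable stands in for "machine learning model plus reference template", and `InferenceRule`, `detect_with_expectations`, `toy_detector`, and the example labels are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for "machine learning model + reference template":
# returns True when the object named by the template is found in the data.
Detector = Callable[[Set[str], str], bool]


@dataclass
class InferenceRule:
    scene: str          # name of the visual scene the rule recognizes
    required: Set[str]  # objects that must already be detected
    expected: str       # further object the scene is expected to contain


def detect_with_expectations(
    visual_data: Set[str],
    first_templates: List[str],
    ontology: Dict[str, List[str]],
    rules: List[InferenceRule],
    detect: Detector,
) -> List[str]:
    """Return detected object labels in the order they were confirmed."""
    # Step 1: detect the first objects with the templates already on hand.
    detected = [t for t in first_templates if detect(visual_data, t)]

    # Step 2: the ontology names related objects the data is expected to
    # contain; try to confirm each with its own reference template.
    for obj in list(detected):
        for related in ontology.get(obj, []):
            if related not in detected and detect(visual_data, related):
                detected.append(related)

    # Step 3: an inference rule whose preconditions hold names a third
    # expected object; try to confirm it as well.
    for rule in rules:
        if rule.required <= set(detected) and rule.expected not in detected:
            if detect(visual_data, rule.expected):
                detected.append(rule.expected)
    return detected


def toy_detector(visual_data: Set[str], template: str) -> bool:
    # Toy model (assumption): the "frame" is just the set of labels present.
    return template in visual_data


frame = {"wheel", "car", "driver", "tree"}
result = detect_with_expectations(
    frame,
    first_templates=["wheel"],
    ontology={"wheel": ["car"]},
    rules=[InferenceRule("occupied car", {"wheel", "car"}, "driver")],
    detect=toy_detector,
)
print(result)  # ['wheel', 'car', 'driver']
```

In this toy run the tree is never reported, because no ontology entry or rule led the pipeline to try a template for it. That is the point of expectation-driven detection: only templates the context suggests are applied.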

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a memory to store visual data captured by one or more sensors; and a processor to: detect one or more first objects in the visual data based on a machine learning model and one or more first reference templates, wherein the one or more first reference templates are for object recognition of the one or more first objects; determine, based on an object ontology, that the visual data is expected to contain a second object, wherein the object ontology indicates that the second object is related to the one or more first objects; detect the second object in the visual data based on the machine learning model and a second reference template, wherein the second reference template is for object recognition of the second object; determine, based on an inference rule, that the visual data is expected to contain a third object; and detect the third object in the visual data based on the machine learning model and a third reference template, wherein the third reference template is for object recognition of the third object.

2. The apparatus of claim 1, further comprising the one or more sensors, wherein the one or more sensors comprise a camera.

3. The apparatus of claim 1, further comprising a communication interface to: obtain, via a network, the second reference template from a reference template repository, wherein the second reference template is obtained based at least in part on determining that the visual data is expected to contain the second object; and obtain, via the network, the third reference template from the reference template repository, wherein the third reference template is obtained based at least in part on determining that the visual data is expected to contain the third object.

4. The apparatus of claim 1, wherein the object ontology comprises a representation of a hierarchy of objects at a plurality of levels of abstraction.

5. The apparatus of claim 4, wherein the processor to determine, based on the object ontology, that the visual data is expected to contain the second object is further to: determine, based on the hierarchy of objects, that the second object is a parent of the one or more first objects.

6. The apparatus of claim 5, wherein: the one or more first objects comprise a plurality of first objects; and the hierarchy of objects indicates that the second object is a common parent of the plurality of first objects.

7. The apparatus of claim 4, wherein the processor to determine, based on the object ontology, that the visual data is expected to contain the second object is further to: determine, based on the hierarchy of objects, that the second object is a child of the one or more first objects.

8. The apparatus of claim 1, wherein the inference rule comprises a plurality of conditions associated with recognizing a particular visual scene.

9. The apparatus of claim 8, wherein the plurality of conditions indicate that the particular visual scene is associated with the one or more first objects, the second object, and the third object.

10. The apparatus of claim 9, wherein the processor is further to: identify the inference rule from a plurality of inference rules, wherein the inference rule is identified based on a determination that the visual data and the particular visual scene associated with the inference rule each comprise the one or more first objects and the second object.

11. The apparatus of claim 9, wherein the processor is further to: determine that the visual data comprises the particular visual scene associated with the inference rule, wherein the visual data satisfies the plurality of conditions associated with recognizing the particular visual scene.

12. A system, comprising: a camera to capture visual data representing an environment; a memory to store one or more first reference templates associated with object recognition of one or more first objects; a communication interface to receive, over a network from a reference template repository, a second reference template associated with object recognition of a second object and a third reference template associated with object recognition of a third object; and one or more processing devices to: detect the one or more first objects in the visual data based on a machine learning model and the one or more first reference templates; determine, based on an object ontology, that the visual data is expected to contain the second object, wherein the object ontology indicates that the second object is related to the one or more first objects; detect the second object in the visual data based on the machine learning model and the second reference template; determine, based on an inference rule, that the visual data is expected to contain the third object; and detect the third object in the visual data based on the machine learning model and the third reference template.

13. The system of claim 12, wherein the one or more processing devices comprise: an object recognition processor to: detect the one or more first objects in the visual data based on the machine learning model and the one or more first reference templates; detect the second object in the visual data based on the machine learning model and the second reference template; and detect the third object in the visual data based on the machine learning model and the third reference template; a semantic processor to determine, based on the object ontology, that the visual data is expected to contain the second object; and an inference processor to determine, based on the inference rule, that the visual data is expected to contain the third object.

14. The system of claim 12, further comprising: a cache to store a plurality of reference templates, wherein the plurality of reference templates comprises: the one or more first reference templates; the second reference template; or the third reference template; and a cache warmer to: determine that the plurality of reference templates may be needed for object recognition; retrieve the plurality of reference templates from the reference template repository; and store the plurality of reference templates in the cache.

15. The system of claim 12, wherein the object ontology comprises a representation of a hierarchy of objects at a plurality of levels of abstraction.

16. The system of claim 15, wherein the one or more processing devices to determine, based on the object ontology, that the visual data is expected to contain the second object are further to: determine, based on the hierarchy of objects, that the second object is a parent of the one or more first objects; or determine, based on the hierarchy of objects, that the second object is a child of the one or more first objects.

17. The system of claim 12, wherein the inference rule comprises a plurality of conditions associated with recognizing a particular visual scene, wherein the plurality of conditions indicate that the particular visual scene comprises the one or more first objects, the second object, and the third object.

18. The system of claim 17, wherein the one or more processing devices are further to: determine that the visual data comprises the particular visual scene associated with the inference rule, wherein the visual data satisfies the plurality of conditions associated with recognizing the particular visual scene.

19. At least one machine accessible storage medium having instructions stored thereon, wherein the instructions, when executed on a machine, cause the
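
To make the dependent claims more concrete, the sketch below illustrates one possible reading of the hierarchy lookups (claims 4 to 7), rule identification and scene recognition (claims 8 to 11), and the cache warmer (claim 14). Everything here is an assumption for illustration: the claims do not specify data structures, so the child-to-parent map `PARENT`, the `SCENE_RULES` table, the function names, and the policy that `warm_cache` uses to decide which templates "may be needed" are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical ontology: child label -> parent label. Each hop moves one
# level up the hierarchy of abstraction.
PARENT: Dict[str, str] = {
    "wheel": "car",
    "windshield": "car",
    "car": "vehicle",
    "truck": "vehicle",
}

# Hypothetical inference rules: scene name -> objects (conditions) that
# together indicate the scene.
SCENE_RULES: Dict[str, Set[str]] = {
    "street traffic": {"wheel", "car", "traffic light"},
    "car wash": {"wheel", "car", "brush"},
    "kitchen": {"stove", "sink", "refrigerator"},
}


def ancestors(obj: str) -> List[str]:
    """Parents of obj, ordered from nearest to most abstract."""
    chain = []
    while obj in PARENT:
        obj = PARENT[obj]
        chain.append(obj)
    return chain


def children(obj: str) -> List[str]:
    """Direct children of obj in the hierarchy."""
    return sorted(c for c, p in PARENT.items() if p == obj)


def common_parent(objs: List[str]) -> Optional[str]:
    """Nearest ancestor shared by every object in objs, or None."""
    if not objs:
        return None
    shared = set(ancestors(objs[0]))
    for other in objs[1:]:
        shared &= set(ancestors(other))
    return next((a for a in ancestors(objs[0]) if a in shared), None)


def candidate_rules(detected: Set[str]) -> Dict[str, Set[str]]:
    """Scenes that contain everything detected so far, mapped to the
    objects each scene still expects."""
    return {
        scene: objs - detected
        for scene, objs in SCENE_RULES.items()
        if detected and detected <= objs
    }


def recognized_scenes(detected: Set[str]) -> List[str]:
    """Scenes whose conditions are all satisfied by the detections."""
    return sorted(s for s, objs in SCENE_RULES.items() if objs <= detected)


def warm_cache(detected: Set[str], repository: Dict[str, bytes],
               cache: Dict[str, bytes]) -> Dict[str, bytes]:
    """Prefetch templates that may be needed next (assumed policy:
    immediate parents, children, and objects expected by candidate rules)."""
    wanted: Set[str] = set()
    for obj in detected:
        wanted.update(ancestors(obj)[:1])
        wanted.update(children(obj))
    for missing in candidate_rules(detected).values():
        wanted |= missing
    for label in sorted(wanted - detected):
        if label in repository and label not in cache:
            cache[label] = repository[label]
    return cache


print(common_parent(["wheel", "windshield"]))         # car
print(sorted(candidate_rules({"wheel", "car"})))      # ['car wash', 'street traffic']
print(recognized_scenes({"wheel", "car", "brush"}))   # ['car wash']
```

A flat child-to-parent map keeps the parent and common-parent lookups simple. A real ontology could allow multiple parents per object, in which case a graph structure would be a better fit.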

Assignees

Intel Corp

Inventors

Classifications

  • Feature selection, e.g. selecting representative features from a multi-dimensional feature space

  • Syntactic or semantic context, e.g. balancing

  • G06V20/00 (Primary): Scenes; Scene-specific elements (control of digital cameras H04N23/60)

  • Tree-organised classifiers

  • Selection of the most significant subset of features

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06V20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 21, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.