Automated semantic inference of visual features and scenes

US11462036B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11462036-B2
Application number  US-202016947096-A
Country             US
Kind code           B2
Filing date         Jul 17, 2020
Priority date       Dec 28, 2017
Publication date    Oct 4, 2022
Grant date          Oct 4, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, an apparatus comprises a memory and a processor. The memory stores visual data captured by one or more sensors. The processor detects one or more first objects in the visual data based on a machine learning model and one or more first reference templates. The processor further determines, based on an object ontology, that the visual data is expected to contain a second object, wherein the object ontology indicates that the second object is related to the one or more first objects. The processor further detects the second object in the visual data based on the machine learning model and a second reference template. The processor further determines, based on an inference rule, that the visual data is expected to contain a third object. The processor further detects the third object in the visual data based on the machine learning model and a third reference template.
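
The flow described in the abstract (detect some objects, consult an ontology to decide what else to look for, then detect those) can be illustrated with a minimal sketch. This is not the patented implementation: the `Frame` type, the `ONTOLOGY` table, `make_detector`, and `detect_with_expansion` are all hypothetical names, and each detector is a stub that checks a set of labels rather than running a trained model on pixels.

```python
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for visual data: the set of labels truly present in a frame.
Frame = Set[str]
Detector = Callable[[Frame], bool]


def make_detector(label: str) -> Detector:
    """Return a stub detector; a real one would run a trained model on pixels."""
    return lambda frame: label in frame


# Hypothetical ontology: each object maps to related objects worth looking for next.
ONTOLOGY: Dict[str, List[str]] = {
    "wheel": ["car"],
    "windshield": ["car"],
    "car": ["road"],
}


def detect_with_expansion(frame: Frame, initial: List[str]) -> List[str]:
    """Detect the initial objects, then follow ontology links to expected objects."""
    detected: List[str] = []
    queue = list(initial)
    tried: Set[str] = set()
    while queue:
        label = queue.pop(0)
        if label in tried:
            continue
        tried.add(label)
        if make_detector(label)(frame):
            detected.append(label)
            # Only objects that were actually found trigger further expectations.
            queue.extend(ONTOLOGY.get(label, []))
    return detected


if __name__ == "__main__":
    frame = {"wheel", "windshield", "car", "road", "tree"}
    print(detect_with_expansion(frame, ["wheel", "windshield"]))
```

In this sketch the "tree" label is never reported, because nothing in the assumed ontology leads the pipeline to look for it. That mirrors the idea of running only the detectors that earlier detections make relevant.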

First claim

Opening claim text (preview).

What is claimed is:

1. A processing device to perform scene recognition, comprising: interface circuitry to receive visual data captured by one or more sensors; and processing circuitry to: detect a first set of objects in the visual data based on one or more first machine learning models, wherein the one or more first machine learning models are trained to detect the first set of objects; determine, based on detecting the first set of objects, that the visual data is expected to contain a second object, wherein the second object is related to at least a subset of the first set of objects; detect the second object in the visual data based on a second machine learning model, wherein the second machine learning model is trained to detect the second object; and recognize a scene captured in the visual data, wherein the scene is recognized based on detecting the first set of objects and the second object in the visual data.

2. The processing device of claim 1, wherein the processing circuitry to determine, based on detecting the first set of objects, that the visual data is expected to contain the second object is further to: determine that the visual data is expected to contain the second object based on an object ontology, wherein the object ontology indicates that the second object is related to at least the subset of the first set of objects.

3. The processing device of claim 2, wherein the object ontology indicates hierarchical relationships among a plurality of objects at a plurality of levels of abstraction.

4. The processing device of claim 3, wherein the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology is further to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a parent of at least the subset of the first set of objects.

5. The processing device of claim 3, wherein the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology is further to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a child of at least the subset of the first set of objects.

6. The processing device of claim 1, wherein the processing circuitry to recognize the scene captured in the visual data is further to: evaluate a scene inference rule against the visual data, wherein the scene inference rule indicates a set of criteria for recognizing the scene, and wherein the scene inference rule is evaluated based on the first set of objects and the second object detected in the visual data; determine that the visual data satisfies the set of criteria for recognizing the scene; and infer, based on determining that the visual data satisfies the set of criteria for recognizing the scene, that the scene is captured in the visual data.

7. The processing device of claim 6, wherein the set of criteria indicates expected content within the scene.

8. The processing device of claim 1, wherein the processing circuitry is further to: send, via the interface circuitry, a request to obtain the second machine learning model from a repository over a network, wherein the request is sent based at least in part on determining that the visual data is expected to contain the second object; and receive, via the interface circuitry, the second machine learning model from the repository over the network.

9. The processing device of claim 1, wherein the processing circuitry is further to: determine, based on recognizing the scene captured in the visual data, that the visual data is expected to contain a third object, wherein the third object is related to the scene; and detect the third object in the visual data based on a third machine learning model, wherein the third machine learning model is trained to detect the third object.

10. The processing device of claim 1, wherein the one or more sensors comprise a camera.

11. At least one non-transitory machine accessible storage medium having instructions stored thereon, wherein the instructions, when executed on processing circuitry, cause the processing circuitry to: receive, via interface circuitry, visual data captured by one or more sensors; detect a first set of objects in the visual data based on one or more first machine learning models, wherein the one or more first machine learning models are trained to detect the first set of objects; determine, based on detecting the first set of objects, that the visual data is expected to contain a second object, wherein the second object is related to at least a subset of the first set of objects; detect the second object in the visual data based on a second machine learning model, wherein the second machine learning model is trained to detect the second object; and recognize a scene captured in the visual data, wherein the scene is recognized based on detecting the first set of objects and the second object in the visual data.

12. The storage medium of claim 11, wherein the instructions that cause the processing circuitry to determine, based on detecting the first set of objects, that the visual data is expected to contain the second object further cause the processing circuitry to: determine that the visual data is expected to contain the second object based on an object ontology, wherein the object ontology indicates that the second object is related to at least the subset of the first set of objects.

13. The storage medium of claim 12, wherein the object ontology indicates hierarchical relationships among a plurality of objects at a plurality of levels of abstraction.

14. The storage medium of claim 13, wherein the instructions that cause the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology further cause the processing circuitry to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a parent of at least the subset of the first set of objects.

15. The storage medium of claim 13, wherein the instructions that cause the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology further cause the processing circuitry to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a child of at least the subset of the first set of objects.

16. The storage medium of claim 11, wherein the instructions that cause the processing circuitry to recognize the scene captured in the visual data further cause the processing circuitry to: evaluate a scene inference rule against the visual data, wherein the scene inference rule indicates a set of criteria for recognizing the scene, and wherein the scene inference rule is evaluated based on the first set of objects and the second object detected in the visual data; determine that the visual data satisfies the set of criteria for recognizing the scene; and infer, based on determining that the visual data satisfies the set of criteria for recognizing the scene, that the scene is captured in the visual data.

17. The storage medium of claim 16, wherein the set of criteria indicates expected content within the scene.

18. The storage medium of claim 11, wherein the instructions further cause the processing circuitry to: send, via the interface circuitry, a request to obtain the second machine learning model from a repository ov
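
Claims 3 to 7 refer to a hierarchical object ontology (parent and child relationships) and to a scene inference rule that lists criteria for recognizing a scene. The following is a rough sketch of how such structures might be represented, offered only as an illustration. The `PARENT` table, the `SceneRule` class, and the helper functions are hypothetical, and the "criteria" are simplified here to a set of required object labels, which the claims do not specify.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical hierarchy: child label -> parent label (one level of abstraction up).
PARENT: Dict[str, str] = {
    "wheel": "car",
    "headlight": "car",
    "car": "vehicle",
    "truck": "vehicle",
}


def parent_of(label: str) -> Optional[str]:
    """Return the parent of a label in the assumed ontology, if any."""
    return PARENT.get(label)


def children_of(label: str) -> List[str]:
    """Return the children of a label, sorted for a stable order."""
    return sorted(child for child, parent in PARENT.items() if parent == label)


@dataclass(frozen=True)
class SceneRule:
    """Assumed rule shape: a scene is inferred when all required objects are detected."""
    scene: str
    required: FrozenSet[str]

    def matches(self, detected: Set[str]) -> bool:
        return self.required <= detected


def recognize_scene(detected: Set[str], rules: List[SceneRule]) -> Optional[str]:
    """Return the first scene whose criteria are satisfied, or None."""
    for rule in rules:
        if rule.matches(detected):
            return rule.scene
    return None


if __name__ == "__main__":
    rules = [SceneRule("street", frozenset({"car", "road"}))]
    print(parent_of("wheel"), children_of("car"))
    print(recognize_scene({"car", "road", "tree"}, rules))
```

A subset test is the simplest possible criterion. A fuller rule could also weigh counts, spatial layout, or detection confidence, none of which is modeled here.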

Assignees

Intel Corp

Inventors

Classifications

  • Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title

  • G06V30/274 (Primary)

    Syntactic or semantic context, e.g. balancing · CPC title

  • G06V20/00 (Primary)

    Scenes; Scene-specific elements (control of digital cameras H04N23/60) · CPC title

  • Tree-organised classifiers · CPC title

  • Selection of the most significant subset of features · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11462036B2 cover?
In one embodiment, an apparatus comprises a memory and a processor. The memory stores visual data captured by one or more sensors. The processor detects one or more first objects in the visual data based on a machine learning model and one or more first reference templates. The processor further determines, based on an object ontology, that the visual data is expected to contain a second object…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06V30/274. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 4, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).