Method and apparatus for sparse associative recognition and recall for visual media reasoning
US-10176382-B1 · Jan 8, 2019 · US
US11462036B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11462036-B2 |
| Application number | US-202016947096-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 17, 2020 |
| Priority date | Dec 28, 2017 |
| Publication date | Oct 4, 2022 |
| Grant date | Oct 4, 2022 |
Official abstract text for this publication.
In one embodiment, an apparatus comprises a memory and a processor. The memory stores visual data captured by one or more sensors. The processor detects one or more first objects in the visual data based on a machine learning model and one or more first reference templates. The processor further determines, based on an object ontology, that the visual data is expected to contain a second object, wherein the object ontology indicates that the second object is related to the one or more first objects. The processor further detects the second object in the visual data based on the machine learning model and a second reference template. The processor further determines, based on an inference rule, that the visual data is expected to contain a third object. The processor further detects the third object in the visual data based on the machine learning model and a third reference template.
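
The detection flow in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the patent: the visual data is modelled as a set of labels, `make_detector` stands in for a machine learning model paired with a reference template, and `ONTOLOGY` and `INFERENCE_RULES` are hypothetical toy data.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for "machine learning model + reference template":
# a detector reports whether one labelled object is present in the visual data.
Detector = Callable[[Set[str]], bool]


def make_detector(label: str) -> Detector:
    """Build a toy detector for a single label (assumption: data is a label set)."""
    return lambda visual_data: label in visual_data


# Hypothetical object ontology: detected object -> related objects to look for.
ONTOLOGY: Dict[str, List[str]] = {
    "wheel": ["car"],
    "windshield": ["car"],
}

# Hypothetical inference rules: if all premises are detected, expect the conclusion.
INFERENCE_RULES: List[Tuple[Set[str], str]] = [
    ({"car", "wheel"}, "road"),
]


def detect_with_expectations(visual_data: Set[str], first_labels: List[str]) -> List[str]:
    """Detect first objects, then ontology-expected and rule-expected objects."""
    # Step 1: detect the first objects.
    detected = [lbl for lbl in first_labels if make_detector(lbl)(visual_data)]

    # Step 2: use the ontology to decide which related objects are expected.
    expected: List[str] = []
    for lbl in detected:
        for related in ONTOLOGY.get(lbl, []):
            if related not in detected and related not in expected:
                expected.append(related)

    # Step 3: run a detector only for the expected (second) objects.
    for lbl in expected:
        if make_detector(lbl)(visual_data):
            detected.append(lbl)

    # Step 4: apply inference rules to find further expected (third) objects.
    for premises, conclusion in INFERENCE_RULES:
        if premises <= set(detected) and conclusion not in detected:
            if make_detector(conclusion)(visual_data):
                detected.append(conclusion)
    return detected


frame = {"wheel", "windshield", "car", "road", "tree"}
result = detect_with_expectations(frame, ["wheel", "windshield"])
```

In this sketch the `tree` label is never searched for, because neither the ontology nor any rule makes it an expected object; only detectors for expected objects are run.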
Opening claim text (preview).
What is claimed is:
1. A processing device to perform scene recognition, comprising: interface circuitry to receive visual data captured by one or more sensors; and processing circuitry to: detect a first set of objects in the visual data based on one or more first machine learning models, wherein the one or more first machine learning models are trained to detect the first set of objects; determine, based on detecting the first set of objects, that the visual data is expected to contain a second object, wherein the second object is related to at least a subset of the first set of objects; detect the second object in the visual data based on a second machine learning model, wherein the second machine learning model is trained to detect the second object; and recognize a scene captured in the visual data, wherein the scene is recognized based on detecting the first set of objects and the second object in the visual data.
2. The processing device of claim 1, wherein the processing circuitry to determine, based on detecting the first set of objects, that the visual data is expected to contain the second object is further to: determine that the visual data is expected to contain the second object based on an object ontology, wherein the object ontology indicates that the second object is related to at least the subset of the first set of objects.
3. The processing device of claim 2, wherein the object ontology indicates hierarchical relationships among a plurality of objects at a plurality of levels of abstraction.
4. The processing device of claim 3, wherein the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology is further to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a parent of at least the subset of the first set of objects.
5. The processing device of claim 3, wherein the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology is further to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a child of at least the subset of the first set of objects.
6. The processing device of claim 1, wherein the processing circuitry to recognize the scene captured in the visual data is further to: evaluate a scene inference rule against the visual data, wherein the scene inference rule indicates a set of criteria for recognizing the scene, and wherein the scene inference rule is evaluated based on the first set of objects and the second object detected in the visual data; determine that the visual data satisfies the set of criteria for recognizing the scene; and infer, based on determining that the visual data satisfies the set of criteria for recognizing the scene, that the scene is captured in the visual data.
7. The processing device of claim 6, wherein the set of criteria indicates expected content within the scene.
8. The processing device of claim 1, wherein the processing circuitry is further to: send, via the interface circuitry, a request to obtain the second machine learning model from a repository over a network, wherein the request is sent based at least in part on determining that the visual data is expected to contain the second object; and receive, via the interface circuitry, the second machine learning model from the repository over the network.
9. The processing device of claim 1, wherein the processing circuitry is further to: determine, based on recognizing the scene captured in the visual data, that the visual data is expected to contain a third object, wherein the third object is related to the scene; and detect the third object in the visual data based on a third machine learning model, wherein the third machine learning model is trained to detect the third object.
10. The processing device of claim 1, wherein the one or more sensors comprise a camera.
11. At least one non-transitory machine accessible storage medium having instructions stored thereon, wherein the instructions, when executed on processing circuitry, cause the processing circuitry to: receive, via interface circuitry, visual data captured by one or more sensors; detect a first set of objects in the visual data based on one or more first machine learning models, wherein the one or more first machine learning models are trained to detect the first set of objects; determine, based on detecting the first set of objects, that the visual data is expected to contain a second object, wherein the second object is related to at least a subset of the first set of objects; detect the second object in the visual data based on a second machine learning model, wherein the second machine learning model is trained to detect the second object; and recognize a scene captured in the visual data, wherein the scene is recognized based on detecting the first set of objects and the second object in the visual data.
12. The storage medium of claim 11, wherein the instructions that cause the processing circuitry to determine, based on detecting the first set of objects, that the visual data is expected to contain the second object further cause the processing circuitry to: determine that the visual data is expected to contain the second object based on an object ontology, wherein the object ontology indicates that the second object is related to at least the subset of the first set of objects.
13. The storage medium of claim 12, wherein the object ontology indicates hierarchical relationships among a plurality of objects at a plurality of levels of abstraction.
14. The storage medium of claim 13, wherein the instructions that cause the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology further cause the processing circuitry to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a parent of at least the subset of the first set of objects.
15. The storage medium of claim 13, wherein the instructions that cause the processing circuitry to determine that the visual data is expected to contain the second object based on the object ontology further cause the processing circuitry to: determine, based on the hierarchical relationships among the plurality of objects, that the second object is a child of at least the subset of the first set of objects.
16. The storage medium of claim 11, wherein the instructions that cause the processing circuitry to recognize the scene captured in the visual data further cause the processing circuitry to: evaluate a scene inference rule against the visual data, wherein the scene inference rule indicates a set of criteria for recognizing the scene, and wherein the scene inference rule is evaluated based on the first set of objects and the second object detected in the visual data; determine that the visual data satisfies the set of criteria for recognizing the scene; and infer, based on determining that the visual data satisfies the set of criteria for recognizing the scene, that the scene is captured in the visual data.
17. The storage medium of claim 16, wherein the set of criteria indicates expected content within the scene.
18. The storage medium of claim 11, wherein the instructions further cause the processing circuitry to: send, via the interface circuitry, a request to obtain the second machine learning model from a repository ov
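
As a rough illustration of the hierarchical ontology (claims 3 to 5) and the scene inference rule (claims 6 and 7), the following sketch uses a child-to-parent map and a rule whose criteria are a set of required objects. The names `PARENT_OF`, `SceneRule`, and `recognize_scene`, along with the sample labels, are hypothetical and chosen only for this example; the patent does not prescribe this representation.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical hierarchical ontology stored as child -> parent links,
# spanning several levels of abstraction (wheel -> car -> vehicle).
PARENT_OF: Dict[str, str] = {
    "wheel": "car",
    "headlight": "car",
    "car": "vehicle",
}


def parent(label: str) -> Optional[str]:
    """Return the parent of an object in the ontology, if any."""
    return PARENT_OF.get(label)


def children(label: str) -> List[str]:
    """Return the children of an object in the ontology, sorted by name."""
    return sorted(child for child, par in PARENT_OF.items() if par == label)


@dataclass(frozen=True)
class SceneRule:
    """A scene inference rule: the scene is inferred when all criteria are met."""
    scene: str
    required: FrozenSet[str]  # expected content within the scene

    def matches(self, detected: Set[str]) -> bool:
        # The criteria are satisfied when every required object was detected.
        return self.required <= detected


def recognize_scene(detected: Set[str], rules: List[SceneRule]) -> Optional[str]:
    """Return the first scene whose rule is satisfied, or None."""
    for rule in rules:
        if rule.matches(detected):
            return rule.scene
    return None


rules = [SceneRule("street traffic", frozenset({"car", "road"}))]
scene = recognize_scene({"wheel", "car", "road"}, rules)
```

Detecting a `wheel` would make `car` an expected parent object via `parent("wheel")`, while detecting a `car` would make its parts expected child objects via `children("car")`.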
Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title
Syntactic or semantic context, e.g. balancing · CPC title
Scenes; Scene-specific elements (control of digital cameras H04N23/60) · CPC title
Tree-organised classifiers · CPC title
Selection of the most significant subset of features · CPC title