Computer-implemented method, method of training deep learning model, electronic device, and medium
US-2024394871-A1 · Nov 28, 2024 · US
US2024371148A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024371148-A1 |
| Application number | US-202118562771-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 26, 2021 |
| Priority date | May 26, 2021 |
| Publication date | Nov 7, 2024 |
| Grant date | — |
Official abstract text for this publication.
It is possible to identify a position of a target object that is difficult to recognize. A position estimation device includes: an information fusion unit that generates fusion information in which position information of a subject object that is an object corresponding to a subject, visual information of the subject object, and relationship information indicating a relationship with a target object paired with the subject object are fused; and an object position estimation unit that estimates a position of the target object by using an object position estimator learned in advance on the basis of the fusion information.
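The data flow in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the publication: fusion is modelled as plain concatenation, the toy word vectors stand in for a real word-embedding model, the bounding-box layout is `[x1, y1, x2, y2]`, and an untrained linear map stands in for the object position estimator learned in advance.

```python
import random

# Hypothetical toy word vectors standing in for a word-embedding model's output.
TOY_WORD_VECTORS = {
    "person": [0.9, 0.1, 0.0],
    "holds": [0.2, 0.8, 0.1],
    "smartphone": [0.1, 0.3, 0.9],
}


def relationship_vector(s, w, o):
    """Concatenate vectors for subject name s, relation w, and target name o."""
    return TOY_WORD_VECTORS[s] + TOY_WORD_VECTORS[w] + TOY_WORD_VECTORS[o]


def fuse(subject_box, visual_features, relation_vec):
    """Fuse the three inputs by concatenation (an assumed fusion scheme)."""
    return list(subject_box) + list(visual_features) + list(relation_vec)


def estimate_position(fused, weights, bias):
    """Linear stand-in for the estimator: one output per box coordinate."""
    return [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]


rng = random.Random(0)
subject_box = [0.20, 0.10, 0.60, 0.90]   # assumed [x1, y1, x2, y2] of the person
visual_features = [0.5, 0.1, 0.7, 0.3]   # placeholder for an image-feature tensor
fused = fuse(
    subject_box,
    visual_features,
    relationship_vector("person", "holds", "smartphone"),
)

# Random, untrained parameters: the output only demonstrates the shapes involved.
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(4)]
bias = [0.0] * 4
target_box = estimate_position(fused, weights, bias)
```

Because the parameters are random, `target_box` carries no meaning beyond its shape; a real system would load parameters obtained by training.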
Opening claim text (preview).
1. A position estimation device comprising a processor configured to execute operations comprising: generating fusion information, wherein the fusion information includes, according to fusing: position information of a subject object that is an object corresponding to a subject, visual information of the subject object, and relationship information indicating a relationship with a target object paired with the subject object; and estimating a position of the target object using an object position estimation model learned in advance on the basis of the fusion information.
2. The position estimation device according to claim 1, wherein the relationship information uses a vector represented by using a word s, a word o, and a word w, wherein the word s indicates a name of the subject object, the word o indicates a name of the target object, and the word w indicates a relationship between the word s and the word o.
3. The position estimation device according to claim 1, wherein the object position estimation model is learned to optimize the position information of the target object and estimated position information to be calculated by: calculating relative position information and the position information of the target object for learning, wherein the relative position information represents a correct answer of the position information of the subject object for learning, and updating a parameter so as to reduce a distance between the estimated position information and the relative position information.
4. A position estimation learning device comprising a processor configured to execute operations comprising: receiving, as learning data, position information of a subject object that is an object corresponding to a subject, visual information of the subject object, position information of a target object paired with the subject object, and relationship information indicating a relationship with the target object paired with the subject object; generating fusion information, wherein the fusion information includes, according to fusing, the position information of the subject object, the visual information of the subject object, and the relationship information; estimating estimated position information by using an object position estimation model on the basis of the fusion information; and calculating relative position information and the position information of the target object, wherein the relative position information represents a correct answer of the position information of the subject object; and updating a parameter of the object position estimation model to reduce a distance between the estimated position information and the relative position information so as to optimize the position information of the target object and the calculated estimated position information.
5. A position estimation method comprising: generating fusion information, wherein the fusion information includes, according to fusing, position information of a subject object that is an object corresponding to a subject, visual information of the subject object, and relationship information indicating a relationship with a target object paired with the subject object; and estimating a position of the target object using an object position estimation model learned in advance on the basis of the fusion information.
6-8. (canceled)
9. The position estimation device according to claim 1, wherein the subject object includes a person, and the target object includes a smartphone held by the person, and the smartphone is partially hidden in the visual information of the subject object.
10. The position estimation device according to claim 1, wherein the visual information of the subject object includes a tensor output of an image input of a region of the subject object.
11. The position estimation device according to claim 1, further comprising: displaying, based on the estimated position of the target object, position information of the target object in an image input, wherein the image input indicates at least a part of the subject object and at least a part of the target object.
12. The position estimation device according to claim 1, wherein the relationship information is based on a vector output of a word2vec model, and the word2vec model outputs the vector output based on a word input.
13. The position estimation device according to claim 1, wherein the object position estimation model uses a neural network, and the neural network outputs the position of the target object based on the fusion information.
14. The position estimation learning device according to claim 4, wherein the relationship information uses a vector represented by using a word s, a word o, and a word w, wherein the word s indicates a name of the subject object, the word o indicates a name of the target object, and the word w indicates a relationship between the word s and the word o.
15. The position estimation learning device according to claim 4, wherein the subject object includes a person, and the target object includes a smartphone held by the person, and the smartphone is partially hidden in the visual information of the subject object.
16. The position estimation learning device according to claim 4, wherein the visual information of the subject object includes a tensor output of an image input of a region of the subject object.
17. The position estimation learning device according to claim 4, further comprising: displaying, based on the estimated position of the target object, position information of the target object in an image input, wherein the image input indicates at least a part of the subject object and at least a part of the target object.
18. The position estimation learning device according to claim 4, wherein the object position estimation model uses a neural network, and the neural network outputs the position of the target object based on the fusion information.
19. The position estimation method according to claim 5, wherein the object position estimation model is learned to optimize the position information of the target object and estimated position information to be calculated by: calculating relative position information and the position information of the target object for learning, wherein the relative position information represents a correct answer of the position information of the subject object for learning, and updating a parameter so as to reduce a distance between the estimated position information and the relative position information.
20. The position estimation method according to claim 5, wherein the subject object includes a person, and the target object includes a smartphone held by the person, and the smartphone is partially hidden in the visual information of the subject object.
21. The position estimation method according to claim 5, further comprising: displaying, based on the estimated position of the target object, position information of the target object in an image input, wherein the image input indicates at least a part of the subject object and at least a part of the target object.
22. The position estimation method according to claim 5, wherein the visual information of the subject object includes a tensor output of an image input of a region of the subject object, and wherein the relationship information is based on a vector output of a word2vec model, and the word2vec model outputs the vector output based on a word input.
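The learning step recited in the claims (updating a parameter so as to reduce a distance between the estimated position information and the relative position information) might be sketched as follows. This is only an illustration under stated assumptions: the claims do not define how relative position is encoded, so `relative_position` here is a hypothetical choice (target box offsets normalised by the subject box size); the distance is taken to be squared Euclidean; and a linear model trained by plain gradient descent stands in for the neural network.

```python
def relative_position(subject_box, target_box):
    """Hypothetical label encoding: target corners relative to the subject box."""
    sx1, sy1, sx2, sy2 = subject_box
    width, height = sx2 - sx1, sy2 - sy1
    tx1, ty1, tx2, ty2 = target_box
    return [
        (tx1 - sx1) / width,
        (ty1 - sy1) / height,
        (tx2 - sx1) / width,
        (ty2 - sy1) / height,
    ]


def predict(fused, weights, bias):
    """Linear stand-in for the object position estimation model."""
    return [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_step(fused, label, weights, bias, lr=0.05):
    """One gradient-descent step; updates parameters in place.

    Returns the loss measured before the update.
    """
    pred = predict(fused, weights, bias)
    loss = squared_distance(pred, label)
    for i, (p, y) in enumerate(zip(pred, label)):
        grad = 2.0 * (p - y)              # d(loss)/d(pred_i)
        for j, x in enumerate(fused):
            weights[i][j] -= lr * grad * x
        bias[i] -= lr * grad
    return loss


# One made-up learning sample; boxes are assumed to be [x1, y1, x2, y2].
subject_box = [0.2, 0.1, 0.6, 0.9]
target_box = [0.3, 0.4, 0.4, 0.6]
fused = subject_box + [0.5, 0.1, 0.7, 0.3] + [0.9, 0.2, 0.1]  # placeholder fusion
label = relative_position(subject_box, target_box)

weights = [[0.0] * len(fused) for _ in range(4)]
bias = [0.0] * 4
losses = [train_step(fused, label, weights, bias) for _ in range(50)]
```

With a single sample and this learning rate the loss shrinks at every step, which is all the sketch is meant to show; real training would iterate over many subject-target pairs.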
using feature-based methods · CPC title
Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title
Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title
using neural networks · CPC title
using syntactic or structural representations of the image or video pattern, e.g. symbolic string recognition; using graph matching · CPC title