Method and system for obtaining joint positions, and method and system for motion capture
US-2022108468-A1 · Apr 7, 2022 · US
US12118741B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12118741-B2 |
| Application number | US-201917617431-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 13, 2019 |
| Priority date | Jun 13, 2019 |
| Publication date | Oct 15, 2024 |
| Grant date | Oct 15, 2024 |
Official abstract text for this publication.
The present invention provides a processing apparatus (20) including a first generation unit (22) that generates, from a plurality of time-series images, three-dimensional feature information indicating a time change of a feature in each position in each of the plurality of images, a second generation unit (23) that generates person position information indicating a position in which a person is present in each of the plurality of images, and an estimation unit (24) that estimates person behavior indicated by the plurality of images, based on the time change of the feature, indicated by the three-dimensional feature information, in the position in which the person is present as indicated by the person position information.
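The abstract does not say how the estimation unit reads the feature at the person's position. As a minimal sketch of one possible reading, the following averages the features inside the person's bounding box for each time step, giving a per-person series whose change over time could then be passed to a classifier. The `(T, H, W, C)` array layout, the box format, and the function name `person_feature_series` are assumptions made for illustration and do not come from the patent.

```python
import numpy as np

def person_feature_series(features, boxes):
    """Average the features inside one person's box at each time step.

    features: array of shape (T, H, W, C) (hypothetical layout: time,
              height, width, channels of the 3D feature information).
    boxes:    list of T boxes (x0, y0, x1, y1) in feature-map coordinates,
              exclusive on the right and bottom edges (assumed format).
    Returns an array of shape (T, C): one feature vector per time step.
    """
    T, H, W, C = features.shape
    series = np.zeros((T, C), dtype=float)
    for t, (x0, y0, x1, y1) in enumerate(boxes):
        region = features[t, y0:y1, x0:x1, :]          # crop the person region
        series[t] = region.reshape(-1, C).mean(axis=0)  # spatial average
    return series

# Usage: two frames of a 4x4 single-channel feature map, one box per frame.
features = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4, 1)
boxes = [(0, 0, 2, 2), (2, 2, 4, 4)]
series = person_feature_series(features, boxes)
```

An empty box would produce an undefined mean, so a real implementation would need to handle frames in which the person is not detected.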
Opening claim text (preview).
What is claimed is:
1. A processing apparatus comprising: at least one memory storing one or more instructions; and at least one processor configured to execute the one or more instructions to: generate, from a plurality of time-series images, three-dimensional feature information indicating a time change of a feature in each position in each of the plurality of time-series images by inputting the plurality of time-series images to a CNN (Convolutional Neural Network), the CNN receiving input of the plurality of time-series images and outputting the three-dimensional feature information; generate person position information indicating a position in which a person is present in each of the plurality of time-series images; adjust the three-dimensional feature information by changing a value in a position in which no person is present to a predetermined value based on the person position information; and estimate person behavior indicated by the plurality of time-series images, based on the adjusted three-dimensional feature information indicating the predetermined value in positions in which no person is present and indicating the time change of the feature in positions in which the person is present.
2. The processing apparatus according to claim 1, wherein the at least one processor is configured to execute the one or more instructions to: generate the three-dimensional feature information, based on a 3D convolutional neural network (CNN); and generate the person position information, based on a deep learning network of object recognition.
3. The processing apparatus according to claim 1, wherein, the at least one processor is configured to execute the one or more instructions to generate, in case in which a plurality of persons are present in the image, the person position information indicating a position in which each of the plurality of persons is present.
4. A processing method performed by a computer and comprising: generating, from a plurality of time-series images, three-dimensional feature information indicating a time change of a feature in each position in each of the plurality of time-series images by inputting the plurality of time-series images to a CNN (Convolutional Neural Network), the CNN receiving input of the plurality of time-series images and outputting the three-dimensional feature information; generating person position information indicating a position in which a person is present in each of the plurality of time-series images; adjusting the three-dimensional feature information by changing a value in a position in which no person is present to a predetermined value based on the person position information; and estimating person behavior indicated by the plurality of time-series images, based on the adjusted three-dimensional feature information indicating the predetermined value in positions in which no person is present and indicating the time change of the feature in positions in which the person is present.
5. A non-transitory storage medium storing a program causing a computer to: generate, from a plurality of time-series images, three-dimensional feature information indicating a time change of a feature in each position in each of the plurality of time-series images by inputting the plurality of time-series images to a CNN (Convolutional Neural Network), the CNN receiving input of the plurality of time-series images and outputting the three-dimensional feature information; generate person position information indicating a position in which a person is present in each of the plurality of time-series images; adjust the three-dimensional feature information by changing a value in a position in which no person is present to a predetermined value based on the person position information; and estimate person behavior indicated by the plurality of time-series images, based on the adjusted three-dimensional feature information indicating the predetermined value in positions in which no person is present and indicating the time change of the feature in positions in which the person is present.
6. The processing method according to claim 4, wherein the computer generates the three-dimensional feature information based on a 3D convolutional neural network (CNN), and generates the person position information based on a deep learning network of object recognition.
7. The processing method according to claim 4, wherein the computer generates, in case in which a plurality of persons are present in the image, the person position information indicating a position in which each of the plurality of persons is present.
8. The non-transitory storage medium according to claim 5, wherein the program causes the computer to: generate the three-dimensional feature information, based on a 3D convolutional neural network (CNN); and generate the person position information, based on a deep learning network of object recognition.
9. The non-transitory storage medium according to claim 5, wherein the program causes the computer to: generate, in case in which a plurality of persons are present in the image, the person position information indicating a position in which each of the plurality of persons is present.
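The adjusting step recited in the independent claims (changing the value in every position where no person is present to a predetermined value) can be illustrated with a short NumPy sketch. This is not the patented implementation: the `(T, H, W, C)` feature layout, the bounding-box format, the fill value of `0.0`, and the name `mask_non_person` are all assumptions chosen for illustration. The CNN that produces the features and the object-recognition network that produces the boxes are outside the sketch, so both are supplied here as plain arrays and lists.

```python
import numpy as np

def mask_non_person(features, boxes_per_frame, fill_value=0.0):
    """Set features at positions with no person to a predetermined value.

    features:        array of shape (T, H, W, C) (hypothetical layout).
    boxes_per_frame: list of length T; each entry is a list of boxes
                     (x0, y0, x1, y1), one per detected person, in
                     feature-map coordinates (assumed format).
    fill_value:      the predetermined value (0.0 is an assumption).
    Returns the adjusted features and the boolean person mask (T, H, W).
    """
    T, H, W, C = features.shape
    person_mask = np.zeros((T, H, W), dtype=bool)
    for t, boxes in enumerate(boxes_per_frame):
        for x0, y0, x1, y1 in boxes:          # supports several persons
            person_mask[t, y0:y1, x0:x1] = True
    # Keep original values where a person is present, fill elsewhere.
    adjusted = np.where(person_mask[..., None], features, fill_value)
    return adjusted, person_mask

# Usage: frame 0 contains two persons, frame 1 contains none.
features = np.ones((2, 3, 3, 2))
boxes_per_frame = [[(0, 0, 1, 1), (2, 2, 3, 3)], []]
adjusted, person_mask = mask_non_person(features, boxes_per_frame)
```

Because the mask is built per frame from a list of boxes, the same sketch covers the multi-person case of claims 3, 7, and 9. The adjusted array would then be the input to whatever behavior estimator is used.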
using neural networks · CPC title
Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title
Surveillance or monitoring of activities, e.g. for recognising suspicious objects (recognising microscopic objects G06V20/69) · CPC title
Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title
using television cameras · CPC title