Context embedding for capturing image dynamics
US-2019325273-A1 · Oct 24, 2019 · US
US11221681B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11221681-B2 |
| Application number | US-201916530190-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 2, 2019 |
| Priority date | Dec 22, 2017 |
| Publication date | Jan 11, 2022 |
| Grant date | Jan 11, 2022 |
Official abstract text for this publication.
A method for recognizing a dynamic gesture includes: positioning a dynamic gesture in a video stream to be detected to obtain a dynamic gesture box; capturing an image block corresponding to the dynamic gesture box from each of multiple image frames of the video stream; generating a detection sequence based on the captured image block; and performing dynamic gesture recognition according to the detection sequence.
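As a rough illustration of the capture step described in the abstract, the following Python sketch cuts the same gesture box out of every frame to build a detection sequence. Frames are modeled as nested lists of grayscale values. The names `crop_block` and `build_detection_sequence`, the box layout, and the toy data are assumptions made for this example, not terms from the patent, and the sketch does not cover how the box is located.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive (assumed layout)


def crop_block(frame: Frame, box: Box) -> Frame:
    """Keep only the pixels inside the gesture box; everything outside is dropped."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def build_detection_sequence(frames: List[Frame], box: Box) -> List[Frame]:
    """Crop the same box from every frame to form the detection sequence."""
    return [crop_block(frame, box) for frame in frames]


# Toy 4x4 frames: each pixel value encodes frame index, row, and column.
frames = [[[100 * t + 10 * r + c for c in range(4)] for r in range(4)] for t in range(3)]
gesture_box = (1, 1, 3, 3)  # hypothetical box covering the central 2x2 region

detection_sequence = build_detection_sequence(frames, gesture_box)
print(detection_sequence[0])  # [[11, 12], [21, 22]]
```

Using one box for all frames keeps the cropped blocks aligned, so later pixel-wise comparisons between blocks compare the same positions.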
Opening claim text (preview).
The invention claimed is:
1. A method for recognizing a dynamic gesture, comprising: positioning a dynamic gesture in a video stream to be detected to obtain a dynamic gesture box; capturing an image block corresponding to the dynamic gesture box from each of multiple image frames of the video stream, wherein respective parts of the multiple image frames, which are out of the dynamic gesture box, are removed; generating a detection sequence based on the captured image blocks, wherein the detection sequence is a sequence of images different from the multiple image frames of the video stream; and performing dynamic gesture recognition according to the detection sequence, wherein the performing dynamic gesture recognition according to the detection sequence comprises: determining multiple inter-frame image differences in the detection sequence, wherein each of the multiple inter-frame image differences is an image obtained by calculating a difference between pixels at each same position in two adjacent or non-adjacent image frames; generating an image difference sequence based on the multiple inter-frame image differences; and performing the dynamic gesture recognition according to the detection sequence and the image difference sequence, which comprises: inputting the detection sequence into a first dynamic gesture recognition model to obtain a first dynamic gesture category prediction probability output by the first dynamic gesture recognition model; inputting the image difference sequence into a second dynamic gesture recognition model to obtain a second dynamic gesture category prediction probability output by the second dynamic gesture recognition model; and determining a dynamic gesture recognition result according to the first dynamic gesture category prediction probability and the second dynamic gesture category prediction probability.
2. A control method using gesture interaction, comprising: obtaining a video stream; determining a dynamic gesture recognition result of the video stream by the method according to claim 1; and controlling a device to execute an operation corresponding to the dynamic gesture recognition result.
3. The method according to claim 2, wherein the controlling a device to execute an operation corresponding to the dynamic gesture recognition result comprises: obtaining the operation instruction corresponding to the dynamic gesture recognition result according to a predetermined correspondence between the dynamic gesture recognition result and the operation instruction; and controlling the device to execute a corresponding operation according to the operation instruction; or wherein the controlling a device to execute an operation corresponding to the dynamic gesture recognition result comprises: in response to the dynamic gesture recognition result being a predefined dynamic action, controlling a vehicle to execute an operation corresponding to the predefined dynamic action.
4. The method according to claim 3, wherein the controlling the device to execute a corresponding operation according to the operation instruction comprises: controlling a window, a door, or a vehicle-mounted system of a vehicle according to the operation instruction.
5. The method according to claim 3, wherein the predefined dynamic action comprises a dynamic gesture comprising at least one of: single-finger clockwise/counterclockwise rotation, palm left/right swing, two-finger poke, extending the thumb and pinky finger, press-down with the palm downward, lift with the palm upward, fanning to the left/right with the palm, left/right movement with the thumb extended, long slide to the left/right with the palm, changing a fist into a palm with the palm upward, changing a palm into a fist with the palm upward, changing a palm into a fist with the palm downward, changing a fist into a palm with the palm downward, single-finger slide, pinch-in with multiple fingers, single-finger double click, single-finger single click, multi-finger double click, or multi-finger single click; and the operation corresponding to the predefined dynamic action comprises at least one of: volume up/down, song switching, song pause/resume, call answering or initiation, hang-up or call rejection, air conditioning temperature increase or decrease, multi-screen interaction, sunroof opening, sunroof closing, door lock locking, door lock unlocking, drag for navigation, map zoom-out, or map zoom-in.
6. An electronic device, comprising: a memory storing processor-executable instructions; and a processor, configured to execute the stored processor-executable instructions to perform operations of the control method using gesture interaction according to claim 3.
7. The method according to claim 1, wherein the positioning a dynamic gesture in a video stream to be detected to obtain a dynamic gesture box comprises: positioning a static gesture in at least one image frame of the multiple image frames of the video stream to obtain a static gesture box of the at least one image frame; and determining the dynamic gesture box according to the static gesture box of the at least one image frame.
8. The method according to claim 7, wherein the determining the dynamic gesture box according to the static gesture box of the at least one image frame comprises: enlarging the static gesture box of the at least one image frame to obtain the dynamic gesture box.
9. The method according to claim 7, wherein the static gesture box of the at least one image frame of the multiple image frames of the video stream meets the following condition: the static gesture box is located within the dynamic gesture box, or the static gesture box is as same as the dynamic gesture box.
10. The method according to claim 1, wherein before the performing the dynamic gesture recognition according to the detection sequence and the image difference sequence, the method further comprises: establishing the first dynamic gesture recognition model by: collecting one or more sample video streams involving different categories of dynamic gestures; annotating dynamic gesture boxes of the different categories of dynamic gestures; capturing image blocks corresponding to annotation information of the dynamic gesture boxes from multiple image frames of the sample video stream to form an image sequence; and training the first dynamic gesture recognition model by using categories of the dynamic gestures as supervision data and using the image sequence as training data.
11. The method according to claim 10, wherein the training the first dynamic gesture recognition model by using categories of the dynamic gestures as supervision data and using the image sequence as training data comprises: dividing the image sequence into at least one segment; extracting a preset number of image frames from the at least one segment, and stacking the image frames to form image training data; and training the first dynamic gesture recognition model by using the categories of the dynamic gestures as the supervision data and using the image training data.
12. The method according to claim 1, wherein before the performing dynamic gesture recognition according to the detection sequence and the image difference sequence, the method further comprises: establishing the second dynamic gesture recognition model by the following means: collecting one or more sample video streams involving different categories of dynamic gestures; annotating dynamic gesture boxes of the different categories of dynamic gestures; capturing image blocks corresponding to annotation information of the dynamic gesture boxes from multiple image frames of the o
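The two-stream recognition step recited in claim 1 (inter-frame image differences, then combining the two models' category probabilities) can be sketched in Python as below. This is a minimal sketch under stated assumptions: the claim does not say whether the pixel difference is signed or absolute, nor how the two probabilities are combined, so the signed difference and the simple average used here are illustrative choices only. The function names, the `stride` parameter, and the probability vectors standing in for the two models' outputs are all hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale image block as rows of pixel values


def frame_difference(a: Frame, b: Frame) -> Frame:
    """Pixel-wise difference between two same-sized frames (b minus a, signed by assumption)."""
    return [[pb - pa for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def image_difference_sequence(sequence: List[Frame], stride: int = 1) -> List[Frame]:
    """Differences between frames `stride` apart; stride=1 compares adjacent frames."""
    return [frame_difference(sequence[i], sequence[i + stride])
            for i in range(len(sequence) - stride)]


def fuse_probabilities(p_first: List[float], p_second: List[float]) -> List[float]:
    """Average the two per-category probability vectors (an assumed fusion rule)."""
    return [(a + b) / 2 for a, b in zip(p_first, p_second)]


# Toy detection sequence of three 2x2 image blocks.
detection_sequence = [[[0, 0], [0, 0]], [[1, 2], [3, 4]], [[1, 5], [3, 9]]]
diffs = image_difference_sequence(detection_sequence)
print(diffs)  # [[[1, 2], [3, 4]], [[0, 3], [0, 5]]]

# Hypothetical outputs of the first and second recognition models for three categories.
p_first = [0.25, 0.5, 0.25]
p_second = [0.125, 0.75, 0.125]
fused = fuse_probabilities(p_first, p_second)
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)  # [0.1875, 0.625, 0.1875] 1
```

Setting `stride` above 1 corresponds to the "non-adjacent image frames" alternative in the claim; a weighted sum or a product of the two probability vectors would be equally consistent with the claim wording.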
using neural networks · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
using classification, e.g. of video objects · CPC title
Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN] · CPC title
Recognition of hand or arm movements, e.g. recognition of deaf sign language (static hand signs G06V40/113) · CPC title