Apparatus and method for recognizing whether action objective is achieved
US-2022319169-A1 · Oct 6, 2022 · US
US12548244B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12548244-B2 |
| Application number | US-202318139929-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 26, 2023 |
| Priority date | Jul 7, 2021 |
| Publication date | Feb 10, 2026 |
| Grant date | Feb 10, 2026 |
Official abstract text for this publication.
A method and apparatus for processing an action of a virtual object, and a storage medium are provided. The method specifically includes: receiving an action instruction, the action instruction including: an action identifier and time-dependent information of performing an action associated with the action identifier; determining an action video frame sequence corresponding to the action identifier; determining, from the action video frame sequence, an action state image corresponding to a preset state image of the virtual object at a target time, the target time being determined according to the time-dependent information; generating a connection video frame sequence according to the action state image, the connection video frame sequence connecting the preset state image with the action video frame sequence; and splicing the connection video frame sequence with the action video frame sequence, to obtain an action video. Embodiments of this application can improve action processing efficiency of a virtual object.
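The last two steps of the abstract (generating a connection sequence, then splicing it onto the action sequence) can be pictured with a minimal sketch. It is illustrative only and is not the patented method: frames are assumed to be flat lists of pixel intensities, and a plain linear cross-fade stands in for whatever frame generation the patent actually uses (the claims mention optical flow, texture, and deep features). The names `Frame`, `blend`, `connection_frames`, and `splice` are hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is flattened to a list of pixel intensities.
Frame = List[float]


def blend(a: Sequence[float], b: Sequence[float], t: float) -> Frame:
    """Linear blend between two frames; t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def connection_frames(preset: Frame, action_state: Frame, n: int) -> List[Frame]:
    """Return n intermediate frames strictly between preset and action_state.

    A cross-fade is a stand-in here, not the optical-flow approach of the claims.
    """
    return [blend(preset, action_state, (i + 1) / (n + 1)) for i in range(n)]


def splice(connection: List[Frame], action_sequence: List[Frame], start: int) -> List[Frame]:
    """Concatenate the connection frames with the action sequence from index start."""
    return connection + action_sequence[start:]


preset = [0.0, 0.0]                          # preset state image (toy data)
action_sequence = [[4.0, 8.0], [5.0, 9.0]]   # pre-generated action frames (toy data)
video = splice(connection_frames(preset, action_sequence[0], 3), action_sequence, 0)
```

Here `start` is the index of the action state image chosen within the action sequence, so the spliced video moves from the preset state into the action at the matched frame.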
Opening claim text (preview).
What is claimed is:

1. A method for processing an action of a virtual object performed by a computer device, the method comprising: receiving an action instruction, the action instruction comprising: an action identifier and time-dependent information of performing an action associated with the action identifier; determining, among a plurality of pre-generated action video frame sequences, each action video frame sequence corresponding to an action performed by the virtual object and having a corresponding action identifier, an action video frame sequence corresponding to the action identifier; determining, from the action video frame sequence, an action state image corresponding to a preset state image of the virtual object at a target time, the target time being determined according to the time-dependent information, further including: comparing a visual feature corresponding to the preset state image with a visual feature corresponding to a candidate action state image in the action video frame sequence to obtain a match value between the preset state image and the candidate action state image; and choosing the candidate action state image having a maximum match value with the preset state image as the action state image corresponding to the preset state image of the virtual object at the target time; obtaining a pair of images including an aligned preset state image and an aligned action state image by performing pose information alignment on the preset state image and the action state image that further improves a matching degree between the virtual object in the preset state image and the virtual object in the action state image; generating a connection video frame sequence according to the aligned preset state image and the aligned action state image, the connection video frame sequence representing a transition from a preset state corresponding to the preset state image to an action state corresponding to the action state image and connecting the preset state image with the action video frame sequence; and splicing the connection video frame sequence with the action video frame sequence, to obtain an action video.

2. The method according to claim 1, wherein the generating a connection video frame sequence according to the action state image comprises: determining optical flow features separately corresponding to the action state image; and generating the connection video frame sequence according to the optical flow features.

3. The method according to claim 1, wherein the generating a connection video frame sequence according to the action state image comprises: determining optical flow features and texture features and/or deep features separately corresponding to the action state image; and generating the connection video frame sequence according to the optical flow features and the texture features and/or the deep features.

4. The method according to claim 1, wherein the pose information includes position information and posture information of the virtual object in the preset state image and the action state image.

5. The method according to claim 1, further comprising: extracting a part preset state image from the preset state image, and determining, on the basis of three-dimensional reconstruction, a third visual feature corresponding to the part preset state image; extracting a part action state image from the action state image, and determining, on the basis of three-dimensional reconstruction, a fourth visual feature corresponding to the part action state image; generating a part connection video frame sequence according to the third visual feature and the fourth visual feature; and adding the part connection video frame sequence to the connection video frame sequence.

6. The method according to claim 1, wherein the time-dependent information comprises: text information corresponding to the action identifier.

7. A computer device, comprising a processor and a memory, the memory storing a program, the program, when executed by the processor, causing the computer device to perform a method for processing an action of a virtual object including: receiving an action instruction, the action instruction comprising: an action identifier and time-dependent information of performing an action associated with the action identifier; determining, among a plurality of pre-generated action video frame sequences, each action video frame sequence corresponding to an action performed by the virtual object and having a corresponding action identifier, an action video frame sequence corresponding to the action identifier; determining, from the action video frame sequence, an action state image corresponding to a preset state image of the virtual object at a target time, the target time being determined according to the time-dependent information, further including: comparing a visual feature corresponding to the preset state image with a visual feature corresponding to a candidate action state image in the action video frame sequence to obtain a match value between the preset state image and the candidate action state image; and choosing the candidate action state image having a maximum match value with the preset state image as the action state image corresponding to the preset state image of the virtual object at the target time; obtaining a pair of images including an aligned preset state image and an aligned action state image by performing pose information alignment on the preset state image and the action state image that further improves a matching degree between the virtual object in the preset state image and the virtual object in the action state image; generating a connection video frame sequence according to the aligned preset state image and the aligned action state image, the connection video frame sequence representing a transition from a preset state corresponding to the preset state image to an action state corresponding to the action state image and connecting the preset state image with the action video frame sequence; and splicing the connection video frame sequence with the action video frame sequence, to obtain an action video.

8. The computer device according to claim 7, wherein the generating a connection video frame sequence according to the action state image comprises: determining optical flow features separately corresponding to the action state image; and generating the connection video frame sequence according to the optical flow features.

9. The computer device according to claim 7, wherein the generating a connection video frame sequence according to the action state image comprises: determining optical flow features and texture features and/or deep features separately corresponding to the action state image; and generating the connection video frame sequence according to the optical flow features and the texture features and/or the deep features.

10. The computer device according to claim 7, wherein the pose information includes position information and posture information of the virtual object in the preset state image and the action state image.

11. The computer device according to claim 7, wherein the method further comprises: extracting a part preset state image from the preset state image, and determining, on the basis of three-dimensional reconstruction, a third visual feature corresponding to the part preset state image; extracting a part action state image from the action state image, and determining, on the basis of three-dimensional reconstruction, a fourth visual feature corresponding to the part action state image; generating a part connection video frame sequence according to th
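The frame-selection step recited in claims 1 and 7 (compare visual features, then keep the candidate with the maximum match value) can be illustrated with a short sketch. The claims do not say how the match value is computed, so cosine similarity over feature vectors is an assumption made here for illustration, and `cosine_similarity` and `best_matching_frame` are hypothetical names.

```python
import math
from typing import Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed match value: cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_matching_frame(
    preset_feature: Sequence[float],
    candidate_features: Sequence[Sequence[float]],
) -> Tuple[int, float]:
    """Return (index, match value) of the candidate with the maximum match value.

    Raises ValueError if candidate_features is empty.
    """
    scores = [cosine_similarity(preset_feature, c) for c in candidate_features]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Toy feature vectors, one per candidate action state image.
candidates = [[0.0, 1.0], [1.0, 0.1], [1.0, 1.0]]
index, score = best_matching_frame([1.0, 0.0], candidates)
```

In this toy data the second candidate points in nearly the same direction as the preset feature, so it is selected. Any other similarity measure could be swapped in without changing the arg-max structure.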
Mixing · CPC title
Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title
Aligning, centring, orientation detection or correction of the image · CPC title
Proximity, similarity or dissimilarity measures · CPC title
relating to texture · CPC title