What technology area does this patent fall under?

Primary CPC classification G06F3/0425. Mapped technology areas include Physics.

When was this patent published?

Publication date: Apr 22, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for predicting touch interaction position on large display based on binocular camera

US12282633B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12282633-B2
Application number	US-202318242040-A
Country	US
Kind code	B2
Filing date	Sep 5, 2023
Priority date	Sep 5, 2022
Publication date	Apr 22, 2025
Grant date	Apr 22, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a method and system for predicting a touch interaction position on a large display based on a binocular camera. The method includes: separately acquiring arm movement video frames of a user and facial and eye movement video frames of the user by a binocular camera; extracting a video clip of each tapping action from the arm movement video frames and the facial and eye movement video frames and obtaining a key frame by screening; marking the key frame of each tapping action with coordinates to indicate coordinates of a finger in a display screen; inputting the marked key frame to an efficient convolutional network for online video understanding (ECO)-Lite neural network for training to obtain a predictive network model; and inputting a video frame of a current operation to be predicted to the predictive network model and outputting a touch interaction position predicted for the current operation.
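The abstract pairs each marked key frame with the finger's coordinates on the display and trains a model that outputs a predicted position. The text on this page does not say which loss function is used, so the following is only a minimal sketch that assumes a mean squared error over pixel coordinates. The names `LabeledTap`, `mse_loss`, and `mean_pixel_error` are hypothetical and do not come from the patent.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class LabeledTap:
    """One tapping action with its marked target (hypothetical record layout)."""
    arm_key_frames: tuple[str, ...]    # key frames from the arm movement video
    face_key_frames: tuple[str, ...]   # key frames from the facial and eye movement video
    target_xy: tuple[float, float]     # marked finger position on the display (pixels assumed)


def _check(predicted, targets):
    if not predicted or len(predicted) != len(targets):
        raise ValueError("predicted and targets must be non-empty and the same length")


def mse_loss(predicted, targets):
    """Mean squared error over x and y. Assumed loss; the patent only says 'a loss value'."""
    _check(predicted, targets)
    total = 0.0
    for (px, py), (tx, ty) in zip(predicted, targets):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / (2 * len(predicted))


def mean_pixel_error(predicted, targets):
    """Average Euclidean distance between predicted and marked positions."""
    _check(predicted, targets)
    distances = [hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(predicted, targets)]
    return sum(distances) / len(distances)
```

A Euclidean error in pixels is often easier to interpret than the squared loss when judging how far a predicted touch lands from the marked position.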

First claim

Opening claim text (preview).

What is claimed is:

1. A method for predicting a touch interaction position on a large display based on a binocular camera, comprising the following steps:
S1, separately acquiring arm movement video frames of a user and facial and eye movement video frames of the user by a binocular camera;
S2, extracting a video clip of each tapping action from the arm movement video frames and the facial and eye movement video frames and obtaining a key frame by screening;
S3, marking the key frame of each tapping action with coordinates to indicate coordinates of a finger in a display screen;
S4, inputting the marked key frame to an efficient convolutional network for online video understanding (ECO)-Lite neural network for training to obtain a predictive network model; and
S5, inputting a video frame of a current operation to be predicted to the predictive network model and outputting a touch interaction position predicted for the current operation.

2. The method for predicting a touch interaction position on a large display based on a binocular camera according to claim 1, wherein in step S1, a camera is disposed right above a middle of a display and configured to acquire the facial and eye movement video frames of the user; and a network camera is disposed on a side of the display to acquire the arm movement video frames of the user.

3. The method for predicting a touch interaction position on a large display based on a binocular camera according to claim 1, wherein in step S2, when extracting the key frame of each tapping action, 1000 ms before completion of each tapping event is split as a tapping action, and video clips of a plurality of tapping actions are obtainable by splitting; and for each video clip, an image frame with no movement is removed from 1000 ms video frames, and the key frame of each tapping action is obtained by extraction from remaining video frames at an interval of 50 ms.

4. The method for predicting a touch interaction position on a large display based on a binocular camera according to claim 3, wherein a condition for determining the image frame with no movement is as follows: redundant information of adjacent image frames is greater than 90%.

5. The method for predicting a touch interaction position on a large display based on a binocular camera according to claim 3, wherein step S4 comprises the following steps:
S41, taking a key frame extracted from the arm movement video frames and a key frame extracted from the facial and eye movement video frames as model inputs;
S42, performing convolutional processing using a convolution pool part, extracting two-dimensional (2D) image features by a 2D network, and arranging the extracted 2D image features in an order of video frames;
S43, taking the arranged 2D image features and an arrangement relationship as inputs to a three-dimensional (3D) convolution for end-to-end fusion to acquire movement features; and
S44, merging a movement motion feature and facial and eye movement features after the 3D convolution, followed by inputting to a fully connected layer for result prediction and comparison with the marked coordinates, and calculating a loss value for parameter adjustment to obtain the predictive network model.

6. The method for predicting a touch interaction position on a large display based on a binocular camera according to claim 5, wherein in step S42, the 2D network is batch normalization (BN)-Inception.

7. The method for predicting a touch interaction position on a large display based on a binocular camera according to claim 5, wherein in step S43, the 3D convolution is 3D-Resnet18.
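Claims 3 and 4 describe the key-frame screening numerically: take the 1000 ms before each tap completes, drop frames whose redundancy with the adjacent frame exceeds 90%, and sample the remaining frames at 50 ms intervals. The sketch below is one possible reading of those steps, not the patented implementation. It assumes that "redundant information" means the fraction of unchanged pixels between consecutive frames, that the first frame of a clip is kept, and that sampling keeps a frame once at least 50 ms have passed since the last kept frame. `Frame`, `redundancy`, and `select_key_frames` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass

CLIP_MS = 1000            # window before tap completion (claim 3)
STEP_MS = 50              # key-frame sampling interval (claim 3)
REDUNDANCY_LIMIT = 0.90   # "no movement" threshold (claim 4)


@dataclass(frozen=True)
class Frame:
    t_ms: int                 # capture time in milliseconds
    pixels: tuple[int, ...]   # flattened grayscale values (assumed representation)


def redundancy(a: Frame, b: Frame, tol: int = 0) -> float:
    """Fraction of pixels that differ by at most `tol` (assumed redundancy metric)."""
    if not a.pixels or len(a.pixels) != len(b.pixels):
        raise ValueError("frames must be non-empty and the same size")
    same = sum(1 for p, q in zip(a.pixels, b.pixels) if abs(p - q) <= tol)
    return same / len(a.pixels)


def select_key_frames(frames: list[Frame], tap_end_ms: int) -> list[Frame]:
    """Screen one tapping action down to its key frames."""
    # 1. Keep only the 1000 ms leading up to the tap's completion.
    clip = [f for f in frames if tap_end_ms - CLIP_MS <= f.t_ms <= tap_end_ms]
    if not clip:
        return []
    # 2. Drop frames that are more than 90% redundant with the preceding frame.
    moving = [clip[0]]  # assumption: the first frame has no predecessor, so keep it
    for prev, cur in zip(clip, clip[1:]):
        if redundancy(prev, cur) <= REDUNDANCY_LIMIT:
            moving.append(cur)
    # 3. Sample the remaining frames so kept frames are at least 50 ms apart.
    keys: list[Frame] = []
    for f in moving:
        if not keys or f.t_ms - keys[-1].t_ms >= STEP_MS:
            keys.append(f)
    return keys
```

With a 1000 ms clip and a 50 ms step, this yields at most 21 key frames per tapping action, and fewer when still frames are removed.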

Assignees

Univ Hangzhou Dianzi

Inventors

Classifications

G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V10/82: using neural networks
G06V40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language (static hand signs G06V40/113)
G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
G06V10/7715: Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods

Patent family

Related publications grouped by family.

View patent family 84071808

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12282633B2 cover?: Disclosed is a method and system for predicting a touch interaction position on a large display based on a binocular camera. The method includes: separately acquiring arm movement video frames of a user and facial and eye movement video frames of the user by a binocular camera; extracting a video clip of each tapping action from the arm movement video frames and the facial and eye movement vide…
Who is the assignee on this patent?: Univ Hangzhou Dianzi

Related patents

Methods and systems of display edge interactions in a gesture-controlled device

Method for recognizing actions, device and storage medium

Method and system for providing remote robotic control

System and Method for Verifying Liveliness