What technology area does this patent fall under?

Primary CPC classification G06V10/764. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 12, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for learning long-distance recognition and personalization of gestures

US12387472B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12387472-B2
Application number	US-202218068091-A
Country	US
Kind code	B2
Filing date	Dec 19, 2022
Priority date	Dec 19, 2022
Publication date	Aug 12, 2025
Grant date	Aug 12, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented system and method relate to gesture recognition. A machine learning system is trained using a training dataset of sensor data that include a set of gestures. The training dataset includes at least a first subset that displays a first gesture. Loss data is generated based on a first loss function that includes a first cross entropy loss and a second cross entropy loss. Parameters of the machine learning system are updated based on the loss data. The machine learning system is outputted and configured for gesture recognition of the set of gestures. The machine learning system includes (i) a first subnetwork to generate feature data based on the sensor data, (ii) a second subnetwork to extract a selected patch of the feature data, and (iii) a third subnetwork to generate gesture data based on a classification of the corresponding feature data of the selected patch. The first cross entropy loss is based on a first performance of the second subnetwork in relation to the training dataset. The second cross entropy loss is based on a second performance of third subnetwork in relation to the training dataset.
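The two-term loss described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two cross entropy terms are combined or what the training target for the patch-selection subnetwork is, so the weighted sum, the ground-truth patch index, and every name below (`combined_loss`, `patch_weight`, `gesture_weight`) are assumptions made for illustration only, not the patented implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross entropy of a single sample against a one-hot target."""
    return -math.log(softmax(logits)[target_index])


def combined_loss(patch_logits, patch_target, gesture_logits, gesture_target,
                  patch_weight=1.0, gesture_weight=1.0):
    """Hypothetical first loss function: a weighted sum of two cross entropy terms.

    patch_logits   -- scores of the second subnetwork over candidate patches
    patch_target   -- index of the patch assumed to contain the gesture
    gesture_logits -- scores of the third subnetwork over gesture classes
    gesture_target -- index of the labeled gesture
    The weighted sum is an assumption; the source only says the loss
    "includes" both terms.
    """
    first = cross_entropy(patch_logits, patch_target)
    second = cross_entropy(gesture_logits, gesture_target)
    return patch_weight * first + gesture_weight * second


if __name__ == "__main__":
    # Two candidate patches, four gesture classes, both heads undecided.
    print(combined_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0, 0.0], 1))
```

In a real training loop the resulting scalar would drive a gradient update of all three subnetworks; that step is omitted here because the abstract gives no optimizer details.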

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for a machine learning system to learn to recognize gestures, the computer-implemented method comprising: training the machine learning system using a training dataset of sensor data that include a set of gestures, the training dataset including at least a first subset that displays a first gesture and a second subset that displays a second gesture; generating loss data based on a first loss function that includes a first cross entropy loss and a second cross entropy loss; updating parameters of the machine learning system based on the loss data; and outputting the machine learning system for gesture recognition of the set of gestures, wherein, the machine learning system includes (i) a first subnetwork to generate feature data based on the sensor data, (ii) a second subnetwork to extract a selected patch of the feature data, and (iii) a third subnetwork to generate gesture data based on a classification of the corresponding feature data of the selected patch, the first cross entropy loss is based on a first performance of the second subnetwork in relation to the training dataset, and the second cross entropy loss is based on a second performance of third subnetwork in relation to the training dataset.
2. The computer-implemented method of claim 1, wherein the first subnetwork, the second subnetwork, and the third subnetwork form an artificial neural network model that is trained end-to-end.
3. The computer-implemented method of claim 1, further comprising: receiving additional sensor data that include samples of a new gesture, the additional sensor data being received after the machine learning system has been trained on the training dataset; training the machine learning system with the samples of the new gesture; generating additional loss data via a second loss function based on the training of the machine learning system with respect to at least the samples of the new gesture, the second loss function being different from the first loss function; and updating the parameters of the machine learning system based on the additional loss data.
4. The computer-implemented method of claim 3, wherein the second loss function is optimized such that the new gesture is associated with new embeddings that form a new cluster in an embedding space.
5. The computer-implemented method of claim 4, wherein the second loss function is optimized such that the new cluster of the new gesture is spaced away from at least (i) a first cluster of embeddings of the first gesture and (ii) a second cluster of embeddings of the second gesture.
6. The computer-implemented method of claim 1, further comprising: receiving additional sensor data that include samples of the first gesture being performed by a new gesturer; training the machine learning system using the samples to adapt the machine learning system to the new gesturer; generating additional loss data via another loss function; and updating affine parameters of the machine learning system based on the additional loss data.
7. The computer-implemented method of claim 6, further comprising: generating, via the machine learning system, embeddings based on the samples; generating output by performing affine transformations on the embeddings using the affine parameters when the machine learning system is being trained with the samples; and generating the additional loss data based on the output, wherein the another loss function is a Shannon entropy loss function.
8. A system for gesture recognition comprising: a processor; and a non-transitory computer readable medium in data communication with the processor, the non-transitory computer readable medium having computer readable data including instructions stored thereon that when executed by the processor is configured to cause the processor to perform a method that comprises: training a machine learning system using a training dataset of sensor data that include a set of gestures, the training dataset including at least a first subset that displays a first gesture and a second subset that displays a second gesture; generating loss data based on a first loss function that includes a first cross entropy loss and a second cross entropy loss; updating parameters of the machine learning system based on the loss data; and outputting the machine learning system for gesture recognition of the set of gestures, wherein, the machine learning system includes (i) a first subnetwork to generate feature data based on the sensor data, (ii) a second subnetwork to extract a selected patch of the feature data, and (iii) a third subnetwork to generate gesture data based on a classification of the corresponding feature data of the selected patch, the first cross entropy loss is based on a first performance of the second subnetwork in relation to the training dataset, and the second cross entropy loss is based on a second performance of third subnetwork in relation to the training dataset.
9. The system of claim 8, wherein the first subnetwork, the second subnetwork, and the third subnetwork form an artificial neural network model that is trained end-to-end.
10. The system of claim 8, wherein the method further comprises: receiving additional sensor data that include samples of a new gesture, the additional sensor data being received after the machine learning system has been trained on the training dataset; training the machine learning system with the samples of the new gesture; generating additional loss data via a second loss function based on the training of the machine learning system with respect to at least the samples of the new gesture, the second loss function being different from the first loss function; and updating the parameters of the machine learning system based on the additional loss data.
11. The system of claim 10, wherein the second loss function is optimized such that the new gesture is associated with new embeddings that form a new cluster in an embedding space.
12. The system of claim 11, wherein the second loss function is optimized such that the new cluster of the new embeddings of the new gesture is spaced away from at least (i) a first cluster of embeddings of the first gesture and (ii) a second cluster of embeddings of the second gesture.
13. The system of claim 10, further comprising: receiving additional sensor data that include samples of the first gesture being performed by a new gesturer; training the machine learning system using the samples to adapt the machine learning system to the new gesturer; generating additional loss data via another loss function; and updating affine parameters of the machine learning system based on the additional loss data.
14. The system of claim 13, further comprising: generating, via the machine learning system, embeddings based on the samples; generating output by performing affine transformations on the embeddings using the affine parameters when the machine learning system is being trained with the samples; and generating the additional loss data based on the output, wherein the another loss function is a Shannon entropy loss function.
15. A non-transitory computer readable medium having computer readable data including instructions stored thereon that, when executed by a processor, is configured to cause the processor to perform a method that comprises: training a machine learning system using a training dataset of sensor data that include a set of gestures, the training dataset including at least a first subset that displays a first gesture; generating loss da
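Claims 6, 7, 13, and 14 describe adapting the trained system to a new gesturer by applying affine transformations to embeddings and scoring the output with a Shannon entropy loss. The sketch below shows one way such a loss could be computed. The per-dimension scale and shift, the softmax step, and the `classify` callable standing in for a frozen classification head are all hypothetical; the claims do not specify how the affine output becomes a probability distribution.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def affine(embedding, scale, shift):
    """Per-dimension affine transformation: scale * x + shift (assumed form)."""
    return [s * x + b for x, s, b in zip(embedding, scale, shift)]


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_loss(embeddings, scale, shift, classify):
    """Mean prediction entropy over unlabeled samples from the new gesturer.

    classify is a hypothetical frozen head that maps a transformed
    embedding to class logits. Only scale and shift would be updated.
    """
    total = 0.0
    for emb in embeddings:
        logits = classify(affine(emb, scale, shift))
        total += shannon_entropy(softmax(logits))
    return total / len(embeddings)


if __name__ == "__main__":
    samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    identity_head = lambda z: z  # stand-in for a real classifier head
    print(entropy_loss(samples, [1.0] * 3, [0.0] * 3, identity_head))
    print(entropy_loss(samples, [5.0] * 3, [0.0] * 3, identity_head))
```

Minimizing this quantity with respect to `scale` and `shift` alone pushes predictions for the new gesturer toward confident classes without labels; the gradient step itself is left out because the claims do not describe it.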

Assignees

Robert Bosch GmbH

Inventors

Classifications

G06V40/20
Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title
G06V10/764 · Primary
using classification, e.g. of video objects · CPC title
G06V10/82
using neural networks · CPC title
G06V10/44
Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title
G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

View patent family 91278912

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12387472B2 cover?: A computer-implemented system and method relate to gesture recognition. A machine learning system is trained using a training dataset of sensor data that include a set of gestures. The training dataset includes at least a first subset that displays a first gesture. Loss data is generated based on a first loss function that includes a first cross entropy loss and a second cross entropy loss. Par…
Who is the assignee on this patent?: Robert Bosch GmbH