Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06F3/017. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 7, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Ultrasonic based gesture recognition

US10528147B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10528147-B2
Application number	US-201715640327-A
Country	US
Kind code	B2
Filing date	Jun 30, 2017
Priority date	Mar 6, 2017
Publication date	Jan 7, 2020
Grant date	Jan 7, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An ultrasonic gesture recognition system is provided that recognizes gestures based on analysis of return signals of an ultrasonic pulse that is reflected from a gesture. The system transmits an ultrasonic chirp and samples a microphone array at sample intervals to collect a return signal for each microphone. The system then applies a beamforming technique to frequency domain representations of the return signals to generate an acoustic image with a beamformed return signal for multiple directions. The system then generates a feature image from the acoustic images to identify, for example, distance or depth from the microphone array to the gesture for each direction. The system then submits the feature image to a deep learning system to classify the gesture.
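The pipeline in the abstract (chirp, per-microphone sampling, frequency-domain beamforming, per-direction depth) can be illustrated with a minimal sketch. Everything numeric here is an assumption chosen for illustration and does not come from the patent: the sample rate, the chirp band, the four-microphone linear array, and the single noise-free point reflector. The sketch also substitutes plain delay-and-sum beamforming plus a matched filter for whatever beamformer the patent actually uses (claim 5 mentions a noise covariance matrix, which is not modeled here), and it produces a one-row "feature image" of (depth, intensity) per direction.

```python
import cmath
import math

C = 343.0                      # speed of sound in air, m/s
FS = 192_000                   # sample rate in Hz (assumed)
N = 256                        # samples collected per chirp (assumed)
F0, F1 = 30_000.0, 50_000.0    # chirp sweep in Hz (assumed)
T_CHIRP = 64 / FS              # chirp duration in seconds (assumed)
MIC_X = [-0.006, -0.002, 0.002, 0.006]  # linear array positions in m (assumed)


def chirp(t):
    """Linear chirp sweeping F0..F1 over T_CHIRP seconds, zero elsewhere."""
    if t < 0.0 or t >= T_CHIRP:
        return 0.0
    k = (F1 - F0) / T_CHIRP
    return math.sin(2.0 * math.pi * (F0 * t + 0.5 * k * t * t))


def half_dft(x):
    """DFT bins 0..N/2 of a real signal (naive O(N^2), fine for a sketch)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n // 2 + 1)]


def mic_delay(x_m, theta):
    """Extra far-field arrival delay at a microphone offset x_m for angle theta."""
    return x_m * math.sin(theta) / C


def simulate_returns(depth, theta):
    """Noise-free echo of one point reflector, sampled at every microphone."""
    round_trip = 2.0 * depth / C
    return [[chirp(i / FS - round_trip - mic_delay(x, theta)) for i in range(N)]
            for x in MIC_X]


def feature_row(returns, thetas):
    """Return (depth, intensity) per direction: a one-row feature image."""
    spectra = [half_dft(sig) for sig in returns]
    ref = half_dft([chirp(i / FS) for i in range(N)])
    row = []
    for theta in thetas:
        # Delay-and-sum: undo each microphone's steering delay, then add.
        beam = []
        for k in range(N // 2 + 1):
            f = k * FS / N
            beam.append(sum(spec[k] * cmath.exp(2j * math.pi * f * mic_delay(x, theta))
                            for spec, x in zip(spectra, MIC_X)))
        # Matched filter against the transmitted chirp, then back to time domain.
        matched = [b * r.conjugate() for b, r in zip(beam, ref)]
        envelope = [abs(sum(m * cmath.exp(2j * math.pi * k * i / N)
                            for k, m in enumerate(matched)))
                    for i in range(N)]
        peak = max(range(N), key=envelope.__getitem__)
        # Peak index is the round-trip delay in samples; halve it for depth.
        row.append((C * peak / FS / 2.0, envelope[peak]))
    return row


thetas = [math.radians(a) for a in (-30, -15, 0, 15, 30)]
row = feature_row(simulate_returns(depth=0.10, theta=math.radians(15)), thetas)
best = max(range(len(row)), key=lambda i: row[i][1])
print(round(math.degrees(thetas[best])), round(row[best][0], 3))
```

In this toy setup the strongest direction should coincide with the simulated reflector angle, and its depth estimate should land close to the simulated 0.10 m. A real system would scan a two-dimensional grid of directions and would have to cope with noise and multiple reflectors.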

First claim

Opening claim text (preview).

We claim:

1. A method performed by a computing device for recognizing a gesture, the method comprising: transmitting via a transmitter an ultrasonic chirp to receive time domain return signals from the gesture; for each of a plurality of receivers, receiving the time domain return signal by sampling that receiver at sample intervals; and converting the time domain return signal to a frequency domain return signal; generating an acoustic image with a beamformed frequency return signal for each of a plurality of directions, the generating of the acoustic image including performing a beamforming of the frequency domain return signals to generate a beamformed frequency domain return signal for each direction; generating a feature image for a feature with a feature value for each direction from the acoustic image; and submitting the feature image to a classifier to classify the gesture, wherein the classifier is a convolutional neural network (“CNN”) including multiple convolution layers, a fully connected layer, a long short-term memory (“LSTM”) layer, a softmax layer, and a mean pooling layer.

2. The method of claim 1 wherein multiple chirps are transmitted in sequence and a sequence of feature images is generated with one feature image for each chirp and wherein the submitting includes submitting the sequence of feature images to the classifier to classify a dynamic gesture.

3. The method of claim 1 wherein multiple sequences of feature images are generated for multiple features and the classifier includes a convolution layer for each feature, and wherein the fully connected layer is fully connected to the convolution layers.

4. The method of claim 1 wherein the feature is selected from a group consisting of depth and intensity.

5. The method of claim 1 wherein the beamforming uses a noise covariance matrix for the receivers that was generated prior to the transmitting of the chirp.

6. The method of claim 1 wherein the generating of the acoustic image includes performing a match filtering of the beamformed frequency domain return signals and the transmitted chirp.

7. A computing system for recognizing a gesture, the computing system comprising: one or more computer-readable storage mediums storing computer-executable instructions of: a component that accesses, for each of a plurality of ultrasonic pulses, time domain return signals that are reflections from the gesture of that transmitted ultrasonic pulse, each time domain return signal having been received by a receiver; a beamforming component that, for each ultrasonic pulse and each direction, performs beamforming to generate, from frequency domain return signals corresponding to the time domain return signals, a beamformed frequency domain return signal for that pulse and direction; a feature extraction component that, for each ultrasonic pulse, extracts from the beamformed frequency domain return signals a feature value for a feature for each direction; and a classifier component that receives the feature values for each ultrasonic pulse, recognizes the gesture based on the feature values, and outputs an indication of the recognized gesture, wherein the classifier component implements a convolutional neural network (“CNN”) including multiple convolution layers, a fully connected layer, a long short-term memory (“LSTM”) layer, a softmax layer, and a mean pooling layer; and one or more processors for executing the computer-executable instructions stored in the one or more computer-readable storage mediums.

8. The computing system of claim 7 wherein the feature extraction component extracts feature values for multiple features and the CNN includes a convolution layer for each feature, and wherein the fully connected layer is fully connected to the convolution layers.

9. The computing system of claim 7 wherein the frequency of the ultrasonic pulse varies.

10. A method performed by a computing system for recognizing a gesture, the method comprising: transmitting an ultrasonic pulse; receiving, at each of a plurality of receivers, a return signal from the ultrasonic pulse; performing beamforming based on the return signals to generate a beamformed return signal for each of a plurality of directions; generating a feature value for each beamformed return signal; and applying a classifier to the feature values to classify the gesture, wherein the classifier component implements a convolutional neural network (“CNN”) including multiple convolution layers, a fully connected layer, a long short-term memory (“LSTM”) layer, a softmax layer, and a mean pooling layer.

11. The method of claim 10 wherein the classifier is performed by a deep learning system.

12. A deep learning system for recognizing a gesture from feature images generated from return signals of ultrasonic pulses reflected from the gesture, the deep learning system comprising: a first convolution layer that inputs feature images in sequence and outputs generated first features for each feature image, the first convolution layer including a first convolution sublayer, a first rectified linear unit (“ReLU”) sublayer, and a first max pooling sublayer; a second convolution layer that inputs the first features in sequence and outputs generated second features for each feature image, the second convolution layer including a second convolution sublayer, a second ReLU sublayer, and a second max pooling layer; a fully connected layer that inputs the second features in sequence and outputs generated third features for each feature image; a long short-term memory layer that inputs the third features in sequence and outputs fourth features for each feature image; a softmax layer that inputs the fourth features in sequence and outputs probabilities of classifications for the sequence of feature images; and a max pooling layer that inputs probabilities of the classification and outputs an indication of the classification for the sequence of feature images.
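The last two stages recited in the claims, a softmax layer followed by a pooling layer over the sequence of feature images, can be sketched in a few lines. This is only an illustration of that final step, not the patented network: the gesture labels and the per-frame scores standing in for LSTM outputs are made up, and the convolution, fully connected, and LSTM layers are omitted entirely. The `mode` parameter is a hypothetical switch covering both the mean pooling named in claims 1, 7, and 10 and the max pooling named in claim 12.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's class scores."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pool_sequence(frame_probs, mode="mean"):
    """Collapse per-frame class probabilities into one score per class."""
    per_class = list(zip(*frame_probs))  # transpose to classes x frames
    if mode == "mean":
        return [sum(p) / len(p) for p in per_class]
    return [max(p) for p in per_class]


# Hypothetical per-frame scores for a 4-frame sequence and 3 gestures.
GESTURES = ["swipe", "tap", "circle"]  # made-up labels
frame_logits = [
    [0.2, 1.1, 0.1],
    [0.1, 2.0, 0.3],
    [1.5, 1.4, 0.2],
    [0.0, 2.5, 0.4],
]

frame_probs = [softmax(f) for f in frame_logits]
pooled = pool_sequence(frame_probs, mode="mean")
predicted = GESTURES[max(range(len(pooled)), key=pooled.__getitem__)]
print(predicted)
```

Pooling over the whole sequence means a single ambiguous frame (the third one here) does not decide the label on its own; the classification reflects all frames of the dynamic gesture.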

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06F3/017 (Primary)
Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title
G01S7/527 (Primary)
Extracting wanted echo signals {(Doppler systems G01S15/50)} · CPC title
G06V40/20
Movements or behaviour, e.g. gesture recognition (recognition of facial expressions G06V40/16) · CPC title
G06V10/82
using neural networks · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title

Patent family

Related publications grouped by family.

View patent family 63447899
