US10325147B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10325147-B1 |
| Application number | US-201916290868-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 2, 2019 |
| Priority date | Aug 3, 2017 |
| Publication date | Jun 18, 2019 |
| Grant date | Jun 18, 2019 |
Official abstract text for this publication.
Methods of recognizing motions of an object in a video clip or an image sequence are disclosed. A plurality of frames are selected out of a video clip or an image sequence of interest. A text category is associated with each frame by applying an image classification technique with a trained deep-learning model for a set of categories containing various poses of an object within each frame. A “super-character” is formed by embedding respective text categories of the frames as corresponding ideograms in a 2-D symbol having multiple ideograms contained therein. Particular motion of the object is recognized by obtaining the meaning of the “super-character” with image classification of the 2-D symbol via a trained convolutional neural networks model for various motions of the object derived from specific sequential combinations of text categories. Ideograms may contain imagery data instead of text categories, e.g., detailed images or reduced-size images.
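As a rough illustration of the "super-character" step described in the abstract, the sketch below tiles per-frame ideogram images into a single 2-D symbol. The names `blank_image` and `form_super_character`, the row-major placement order, and the use of plain Python lists are assumptions made for this example. The defaults of 224 pixels and a 4×4 grid follow one configuration recited in the claims. Rendering a text category into an ideogram image and classifying the finished symbol are outside the scope of the sketch.

```python
from typing import List

Image = List[List[int]]  # row-major grid of pixel values


def blank_image(size: int, value: int = 0) -> Image:
    """Return a size x size image filled with one pixel value."""
    return [[value] * size for _ in range(size)]


def form_super_character(ideograms: List[Image], n: int = 224, m: int = 4) -> Image:
    """Tile up to m*m ideograms, each (n/m) x (n/m), into one n x n symbol.

    Hypothetical helper: ideograms are placed left to right, top to bottom,
    in frame order. That ordering is an assumption, not taken from the source.
    """
    if n % m != 0:
        raise ValueError("n must be a multiple of m")
    if len(ideograms) > m * m:
        raise ValueError("too many ideograms for an m x m grid")
    tile = n // m
    symbol = blank_image(n)  # unused grid cells stay zero
    for index, ideogram in enumerate(ideograms):
        if len(ideogram) != tile or any(len(row) != tile for row in ideogram):
            raise ValueError("each ideogram must be (n/m) x (n/m) pixels")
        top = (index // m) * tile
        left = (index % m) * tile
        for r in range(tile):
            symbol[top + r][left:left + tile] = ideogram[r]
    return symbol


# Usage: 16 placeholder ideograms, one per selected frame.
frames = [blank_image(56, value=i + 1) for i in range(16)]
symbol = form_super_character(frames)
```

Because the recognized motion is derived from a specific sequential combination of categories, whichever placement order is chosen would need to be identical at training time and at inference time.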
Opening claim text (preview).
What is claimed is:

1. A method of recognizing motions of an object in a video clip or an image sequence comprising: selecting a plurality of frames out of a video clip or an image sequence of interest; associating each of the plurality of frames of the video clip or the image sequence of interest with a particular text category selected from a set of text categories for various poses of an object in the video clip or the image sequence of interest by applying an image classification technique with a trained deep-learning model; forming a super-character by embedding respective associated text categories of the plurality of frames as corresponding ideograms in a two-dimensional (2-D) symbol having multiple ideograms contained therein and the super-character representing a meaning formed from a specific combination of said multiple ideograms; and recognizing a particular motion of the object by obtaining the meaning of the super-character with image classification of the 2-D symbol via a trained convolutional neural networks model for various motions of the object derived from specific sequential combinations of associated text categories.

2. The method of claim 1, wherein the associated text category comprises corresponding text descriptions of the object's pose in said each of the frames.

3. The method of claim 1, wherein the 2-D symbol being a matrix of N×N pixels of K-bit data and the matrix being divided into M×M sub-matrices with each of the sub-matrices containing (N/M)×(N/M) pixels, said each of the sub-matrices representing one ideogram, where K, N and M are positive integers, and N is a multiple of M.

4. The method of claim 3, wherein K is 5, N is 224, M is 4, M×M is 16 and N/M is 56.

5. The method of claim 3, wherein K is 5, N is 224, M is 8, M×M is 64 and N/M is 28.

6. The method of claim 1, wherein the trained convolutional neural networks model comprises bi-valued 3×3 filter kernels in a Cellular Neural Networks or Cellular Nonlinear Networks (CNN) based integrated circuit.

7. The method of claim 6, wherein the trained convolutional neural networks model is achieved with following operations: (a) obtaining a convolutional neural networks model by training the convolutional neural networks model based on image classification of a labeled dataset, which contains a number of multi-layer 2-D symbols, the convolutional neural networks model including multiple ordered filter groups, each filter in the multiple ordered filter groups containing a standard 3×3 filter kernel; (b) modifying the convolutional neural networks model by converting the respective standard 3×3 filter kernels to corresponding bi-valued 3×3 filter kernels of a currently-processed filter group in the multiple ordered filter groups based on a set of kernel conversion schemes; (c) retraining the modified convolutional neural networks model until a desired convergence criterion is met; and (d) repeating (b)-(c) for another filter group until all of the multiple ordered filter groups are converted to the bi-valued 3×3 filter kernels.
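The staged conversion in operations (b) through (d) of claim 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the claim refers only to "a set of kernel conversion schemes", so the sign-times-mean-magnitude rule in `to_bi_valued` is an assumed example scheme, and `convert_model` with its caller-supplied `retrain` callback is a hypothetical stand-in for whatever training framework would run step (c).

```python
from typing import Callable, List

Kernel = List[List[float]]   # one 3x3 filter kernel
FilterGroup = List[Kernel]   # the kernels of one ordered filter group


def to_bi_valued(kernel: Kernel) -> Kernel:
    """Replace every coefficient by +a or -a, where a is the mean magnitude.

    Assumed conversion scheme for illustration only; zero maps to +a.
    """
    a = sum(abs(w) for row in kernel for w in row) / 9.0
    return [[a if w >= 0 else -a for w in row] for row in kernel]


def convert_model(
    groups: List[FilterGroup],
    retrain: Callable[[List[FilterGroup], int], None],
) -> List[FilterGroup]:
    """Convert one filter group at a time, retraining after each conversion.

    `retrain(groups, i)` is a hypothetical callback that is expected to train
    until its own convergence criterion is met after group i was converted.
    """
    for i in range(len(groups)):
        groups[i] = [to_bi_valued(k) for k in groups[i]]  # operation (b)
        retrain(groups, i)                                 # operation (c)
    return groups                                          # loop is operation (d)


# Usage with a no-op retraining step that only records the group order.
kernel = [[0.5, -1.0, 0.0], [2.0, -0.5, 1.0], [-1.5, 0.5, -2.0]]
visited: List[int] = []
model = convert_model([[kernel], [kernel]], lambda g, i: visited.append(i))
```

Converting group by group, rather than all at once, gives the not-yet-converted groups a chance to compensate during retraining for the precision lost in the groups already converted.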
Classification techniques · CPC title
using neural networks · CPC title
Static hand or arm · CPC title
Combinations of networks · CPC title
Learning methods · CPC title