Image processing methods, training methods, apparatuses, devices, media, and programs

US11334763B2 · US · B2

Patent metadata
Publication number: US-11334763-B2
Application number: US-201916700348-A
Country: US
Kind code: B2
Filing date: Dec 2, 2019
Priority date: Apr 25, 2018
Publication date: May 17, 2022
Grant date: May 17, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An image processing method includes: inputting a to-be-processed image into a neural network; and forming discrete feature data of the to-be-processed image via the neural network, where the neural network is trained based on guidance information, and during the training process, the neural network is taken as a student neural network; the guidance information includes: a difference between discrete feature data formed by a teacher neural network for an image sample and discrete feature data formed by the student neural network for the image sample.
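
As an illustration of the guidance information described in the abstract, the following Python sketch computes a difference between the discrete feature maps of a teacher and a student network. The mean squared difference, the nested-list representation, and the names `FeatureMap` and `feature_guidance` are assumptions made for this example; the abstract does not say how the difference is measured.

```python
from typing import List

# Hypothetical representation: a 2-D discrete feature map as nested lists of ints.
FeatureMap = List[List[int]]


def feature_guidance(teacher: FeatureMap, student: FeatureMap) -> float:
    """Mean squared difference between two same-shaped discrete feature maps.

    The squared-error form is an assumption chosen for illustration.
    """
    if len(teacher) != len(student):
        raise ValueError("feature maps must have the same number of rows")
    total = 0.0
    count = 0
    for t_row, s_row in zip(teacher, student):
        if len(t_row) != len(s_row):
            raise ValueError("feature maps must have the same row lengths")
        for t, s in zip(t_row, s_row):
            total += (t - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("feature maps must not be empty")
    return total / count


# Made-up feature maps for the same image sample.
teacher_map = [[0, 1], [2, 3]]
student_map = [[0, 1], [1, 3]]
print(feature_guidance(teacher_map, student_map))  # prints 0.25
```

A value of zero means the student reproduces the teacher's discrete features exactly for that sample; larger values indicate a larger gap for training to reduce.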

First claim

Opening claim text (preview).

The invention claimed is:

1. An image processing method, comprising: inputting a to-be-processed image into a neural network; and forming discrete feature data of the to-be-processed image via the neural network, the discrete feature data being a discrete feature map, wherein the neural network is trained based on guidance information, and during the training process, the neural network is taken as a student neural network; and the guidance information comprises: a difference between discrete feature data formed by a teacher neural network for an image sample and discrete feature data formed by the student neural network for the image sample, wherein the neural network is trained by: inputting an image sample into a student neural network and a teacher neural network, respectively; forming discrete feature data of the image sample via the student neural network and the teacher neural network, respectively; and performing supervised learning on the student neural network according to the guidance information.

2. The method according to claim 1, wherein the forming discrete feature data of the to-be-processed image via the neural network comprises: forming floating-point feature data of the to-be-processed image via the neural network, and quantizing the floating-point feature data into the discrete feature data of the to-be-processed image.

3. The method according to claim 2, wherein the forming floating-point feature data of the to-be-processed image via the neural network comprises: extracting floating-point feature data from the to-be-processed image via the neural network, and converting the extracted floating-point feature data into floating-point feature data satisfying a predetermined requirement to form the floating-point feature data of the to-be-processed image.

4. The method according to claim 3, wherein the converting the extracted floating-point feature data into floating-point feature data satisfying a predetermined requirement comprises at least one of: converting the floating-point feature data into floating-point feature data with a predetermined number of channels; or converting the floating-point feature data into floating-point feature data with a predetermined size.

5. The method according to claim 1, further comprising: performing corresponding vision task processing on the to-be-processed image via the neural network according to the discrete feature data of the to-be-processed image, wherein the guidance information further comprises: a difference between a vision task processing result output by the student neural network for the image sample and tagging information of the image sample.

6. The method according to claim 5, wherein the performing corresponding vision task processing on the to-be-processed image via the neural network according to the discrete feature data of the to-be-processed image comprises: performing classification processing on the to-be-processed image via the neural network according to the discrete feature data of the to-be-processed image; or performing object detection processing on the to-be-processed image according to the discrete feature data of the to-be-processed image, wherein the guidance information further comprises: a difference between a classification processing result output by the student neural network for the image sample and classification tagging information of the image sample; or a difference between an object detection processing result output by the student neural network for the image sample and detection box tagging information of the image sample.

7. The method according to claim 1, wherein the training process of using the neural network as a student neural network further comprises: performing vision task processing on the image sample via the student neural network according to the discrete feature data of the image sample; the performing supervised learning on the student neural network according to guidance information comprises: performing supervised learning on the student neural network by using, as guidance information, the difference between the discrete feature data formed by the teacher neural network for the image sample and the discrete feature data formed by the student neural network for the image sample and a difference between a vision task processing result output by the student neural network and tagging information of the image sample.

8. The method according to claim 1, wherein the teacher neural network comprises: a successfully trained floating-point teacher neural network configured to form floating-point feature data for an input image, and perform vision task processing on the input image according to the floating-point feature data; and a quantization auxiliary unit configured to convert the floating-point feature data formed by the floating-point teacher neural network into discrete feature data, and provide the discrete feature data to the floating-point teacher neural network, so that the floating-point teacher neural network performs vision task processing on the input image according to the discrete feature data.

9. The method according to claim 8, wherein a process of training the teacher neural network comprises: inputting an image sample into a successfully trained floating-point teacher neural network; extracting floating-point feature data of the image sample via the successfully trained floating-point teacher neural network, converting the floating-point feature data into discrete feature data via the quantization auxiliary unit, and performing vision task processing on the image sample via the successfully trained floating-point teacher neural network according to the discrete feature data; and performing network parameter adjustment on the successfully trained floating-point teacher neural network by using a difference between the vision task processing result and tagging information of the image sample as guidance information.

10. The method according to claim 8, wherein a process of training the floating-point teacher neural network comprises: inputting an image sample into a to-be-trained floating-point teacher neural network; extracting floating-point feature data of the image sample via the to-be-trained floating-point teacher neural network, and performing vision task processing on the image sample according to the floating-point feature data; and performing supervised learning on the to-be-trained floating-point teacher neural network by using a difference between the vision task processing result and tagging information of the image sample as guidance information.

11. A non-transitory computer-readable storage medium having stored thereon computer-readable instructions that, when executed by a processor, cause the processor to implement the method according to claim 1.

12. A mobile terminal implementing the method according to claim 1, wherein: the student neural network is trained by using the teacher neural network forming discrete feature data, such that the knowledge of the teacher neural network can be transferred to the student neural network, and the network parameters of the student neural network are not limited to fixed-point network parameters; the student neural network is configured to perform floating-point arithmetic, such that after the student neural network is successfully trained, the neural network is not be limited by a specific instruction set and a specific device, thereby facilitating improvement in an application range of the neural network; and the floating feature data obtained by the floating-point arithmetic are converted into discrete feature data by quantization and mainta
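
To make claims 2 and 7 more concrete, here is a minimal Python sketch of quantizing floating-point features into discrete codes and combining the two guidance terms. Everything specific in it is an assumption for illustration: the uniform quantizer over a fixed range, the squared feature difference, the cross-entropy task term, the `feature_weight` parameter, and the function names. The claims require only that floating-point feature data be quantized and that both differences serve as guidance information.

```python
import math
from typing import List, Sequence


def quantize(features: Sequence[float], levels: int = 4,
             lo: float = 0.0, hi: float = 1.0) -> List[int]:
    """Map each floating-point feature to an integer code in [0, levels - 1].

    Uniform quantization over [lo, hi] is an assumed scheme.
    """
    if levels < 2 or hi <= lo:
        raise ValueError("need levels >= 2 and hi > lo")
    step = (hi - lo) / (levels - 1)
    codes = []
    for x in features:
        clipped = min(max(x, lo), hi)  # clamp out-of-range values
        codes.append(int(math.floor((clipped - lo) / step + 0.5)))
    return codes


def combined_guidance(teacher_codes: Sequence[int],
                      student_codes: Sequence[int],
                      student_probs: Sequence[float],
                      label: int,
                      feature_weight: float = 1.0) -> float:
    """Feature-difference term plus task term, as in claim 7.

    The task term is cross-entropy against the tagged class label (assumed).
    """
    if len(teacher_codes) != len(student_codes) or not teacher_codes:
        raise ValueError("code sequences must be non-empty and equal length")
    feature_term = sum((t - s) ** 2
                       for t, s in zip(teacher_codes, student_codes))
    feature_term /= len(teacher_codes)
    task_term = -math.log(max(student_probs[label], 1e-12))
    return feature_weight * feature_term + task_term


# Made-up floating-point features for one image sample.
teacher_codes = quantize([0.0, 0.4, 0.9])
student_codes = quantize([0.1, 0.7, 0.9])
print(teacher_codes, student_codes)  # prints [0, 1, 3] [0, 2, 3]
print(combined_guidance(teacher_codes, student_codes, [0.5, 0.5], label=0))
```

The sketch only evaluates the guidance value. Rounding has zero gradient almost everywhere, so an actual training loop would need some way to propagate gradients through the quantization step, which the claim text shown here does not describe.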

Assignees

Inventors

Classifications

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • using classification, e.g. of video objects · CPC title

  • characterised by the process organisation or structure, e.g. boosting cascade · CPC title

  • G06N3/08 (Primary)

    Learning methods · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11334763B2 cover?
An image processing method includes: inputting a to-be-processed image into a neural network; and forming discrete feature data of the to-be-processed image via the neural network, where the neural network is trained based on guidance information, and during the training process, the neural network is taken as a student neural network; the guidance information includes: a difference between dis…
Who is the assignee on this patent?
Beijing Sensetime Tech Development Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 17, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).