Data normalization for handwriting recognition

US10025976B1 · US · B1

Patent metadata
Field: Value
Publication number: US-10025976-B1
Application number: US-201615393056-A
Country: US
Kind code: B1
Filing date: Dec 28, 2016
Priority date: Dec 28, 2016
Publication date: Jul 17, 2018
Grant date: Jul 17, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein is a method of optimizing data normalization by selecting the best height normalization setting from training RNN (Recurrent Neural Network) with one or more datasets comprising multiple sample images of handwriting data, which comprises estimating a few top place ratios for normalization by minimizing a cost function for any given sample image in the training dataset, and further, determining the best ratio from the top place ratios by validating the recognition results of sample images with each top place ratio.
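The two-stage selection described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`top_candidate_ratios`, `best_ratio`, `toy_cost`), the `(width, ground_truth)` sample representation, the toy cost (squared difference between a block count and the ground-truth length), and the use of mean validation error to pick the winner are all hypothetical. The recognizer is replaced by a caller-supplied `recognition_error` function, since no RNN is defined on this page.

```python
from collections import Counter


def top_candidate_ratios(samples, ratios, cost, k=3):
    """Stage 1: for every sample, find the ratio with the lowest cost,
    then return the k ratios that were optimal for the most samples."""
    wins = Counter()
    for sample in samples:
        best = min(ratios, key=lambda r: cost(sample, r))
        wins[best] += 1
    return [ratio for ratio, _ in wins.most_common(k)]


def best_ratio(samples, candidates, recognition_error):
    """Stage 2: validate each candidate on all samples and keep the one
    with the lowest mean error (assumed selection rule)."""
    def mean_error(ratio):
        return sum(recognition_error(s, ratio) for s in samples) / len(samples)
    return min(candidates, key=mean_error)


def toy_cost(sample, ratio):
    """Hypothetical cost: squared error between the number of
    ratio-wide blocks in the image and the ground-truth length."""
    width, ground_truth = sample
    blocks = width / ratio
    return (blocks - len(ground_truth)) ** 2


# Toy data: (normalized image width in pixels, ground-truth text).
samples = [(40, "abcd"), (50, "hello"), (24, "cat"), (36, "word")]
candidates = top_candidate_ratios(samples, [8, 10, 12], toy_cost, k=2)
```

In practice `recognition_error` would run the trained recognizer on each normalized sample and compare the output with its ground truth; here any callable taking `(sample, ratio)` and returning a number will do.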

First claim

Opening claim text (preview).

What is claimed is:

1. A method of optimizing normalization for handwriting recognition, comprising: obtaining an image comprising handwriting data corresponding to at least one word; pre-processing the obtained image to produce a pre-processed image comprising multiple pixels; normalizing a height of the pre-processed image to generate a normalized image by: calculating a vertical histogram for the pre-processed image, detecting a middle height of the vertical histogram, increasing a major component height from the middle height until a pre-determined count of pixels out of the multiple pixels of the pre-processed image are covered, calculating a ratio between a pre-determined fixed height and the major component height, and zooming in or out of the pre-processed image at the calculated ratio to generate the normalized image; sending the normalized image and a normalization setting to RNN (Recurrent Neural Network); and obtaining a recognition result from the RNN for the normalized image based on the normalization setting.

2. The method of claim 1, wherein the step of pre-processing the obtained image comprises: removing noises from the obtained image to generate noise-filtered image data; gray scaling the noise-filtered image data to generated gray-scaled data; and binarizing the gray-scaled data to generate the multiple pixels of the pre-processed image.

3. The method of claim 1, wherein the normalization setting is determined by training the RNN with a dataset comprising multiple sample images of handwriting data.

4. The method of claim 3, wherein training the RNN with the dataset comprises: determining a plurality of candidate ratios for normalization, each candidate ratio representing an optimal ratio for at least one of the multiple sample images of the dataset; and selecting a ratio from the plurality candidate ratios, the selected ratio representing the best ratio for normalization based on a validation of recognition results for each sample image of the dataset.

5. The method of claim 4, wherein the step of determining the plurality of candidate ratios further comprises selecting the candidate ratios from a number of unprocessed ratios, each candidate ratio representing an optimal ratio for at least one sample image in the dataset.

6. The method of claim 5, wherein selecting the candidate ratios further comprises: selecting a first ratio from the unprocessed ratios; for each sample image in the dataset, performing a CF (Cost Function) minimization process based on the first ratio to identify a first number of sample images for which the first ratio is an optimal ratio; selecting a second ratio from the number of unprocessed ratios; for each sample image in the dataset, performing the CF (Cost Function) minimization process based on the second ratio to identify a second number of sample images for which the second ratio is an optimal ratio; and based on a comparison of the first and second numbers of sample images, determining one of the first and second ratios to be the candidate ratio.

7. The method of claim 6, wherein selecting the candidate ratios further comprises: selecting a third ratio from the unprocessed ratios; for each sample image in the dataset, performing the CF (Cost Function) minimization process based on the third ratio to identify a third number of sample images for which the third ratio is an optimal ratio; and based on a comparison of the first, second and third numbers of sample images, determining one or two of the first, second and third ratios to be the candidate ratio.

8. The method of claim 6, wherein the CF minimization process comprises: obtaining the sample image and its corresponding GT (Ground Truth); pre-processing the obtained sample image to generate a pre-processed sample image; normalizing a height of the pre-processed sample image to generate a normalized sample image; calculating a number of blocks for the normalized sample image, each block having a width and height of a selected ratio; calculating a squared error of the number of blocks; and calculating a CF (Cost Function) based on the number of blocks and GT of the sample image; and minimizing the CF for the dataset.

9. The method of claim 4, wherein the step of selecting the best ratio from the plurality candidate ratios comprises: selecting a first ratio from the plurality of candidate ratios; for each sample image of the dataset, performing a validation process comprising: obtaining the sample image and its corresponding GT (Ground Truth), pre-processing the obtained sample image to generate a pre-processed sample image, normalizing a height of the pre-processed sample image to generate a normalized sample image, inputting the normalized sample image and the selected first ratio to RNN to obtain a recognition result, validating the recognition result with the GT of the sample image to generate an error, and determining that the error falls below a pre-determined threshold; and determining the first ratio to be the best ratio.

10. A computer program product comprising a computer usable non-transitory medium having a computer readable program code embedded therein for controlling a data processing apparatus, the computer readable program code configured to cause the data processing apparatus to execute a process for optimizing data normalization for handwriting recognition, the process comprising: obtaining an image comprising handwriting data corresponding to at least one word; pre-processing the obtained image to produce a pre-processed image comprising multiple pixels; normalizing a height of the pre-processed image to generate a normalized image by: calculating a vertical histogram for the pre-processed image, detecting a middle height of the vertical histogram, increasing a major component height from the middle height until a pre-determined count of pixels out of the multiple pixels of the pre-processed image are covered, calculating a ratio between a pre-determined fixed height and the major component height, and zooming in or out of the pre-processed image at the calculated ratio to generate the normalized image; sending the normalized image and a normalization setting to RNN (Recurrent Neural Network); and obtaining a recognition result from the RNN for the normalized image based on the normalization setting.

11. The computer program product of claim 10, wherein the step of pre-processing the obtained image comprises: removing noises from the obtained image to generate noise-filtered image data; gray scaling the noise-filtered image data to generated gray-scaled data; and binarizing the gray-scaled data to generate the multiple pixels of the pre-processed image.

12. The computer program product of claim 10, wherein the normalization setting is determined by training the RNN with a dataset comprising multiple sample images of handwriting data.

13. The computer program product of claim 12, wherein training the RNN with the dataset comprises: determining a plurality of candidate ratios for normalization, each candidate ratio representing an optimal ratio for at least one of the multiple sample images of the dataset; and selecting a ratio from the plurality candidate ratios, the selected ratio representing the best ratio for normalization based on a validation of recognition results for each sample image of the dataset.

14. The computer program product of claim 13, wherein the step of determining the plurality of candidate ratios further comprises selecting the candidate ratios from a number of unprocessed ratios, each candidate r
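The height-normalization steps recited in claims 1 and 10 (vertical histogram, middle height, growing the major component height, ratio, zoom) can be sketched in plain Python as below. This is an illustrative reading, not the patented implementation. Several details are assumptions because the claim text does not fix them: the "middle height" is taken to be the row where the cumulative ink count reaches half of the total, the band grows one row up and one row down in turn, the "pre-determined count" is expressed as a `coverage` fraction, and zooming uses nearest-neighbour sampling. All function and parameter names are hypothetical.

```python
from itertools import accumulate


def major_component_height(binary, coverage=0.9):
    """binary: list of rows, each a list of 0/1 pixels (1 = ink)."""
    hist = [sum(1 for px in row if px) for row in binary]  # ink per row
    total = sum(hist)
    if total == 0:
        raise ValueError("image contains no ink pixels")
    # Assumed definition of "middle height": the first row at which the
    # cumulative ink count reaches half of all ink pixels.
    middle = next(i for i, c in enumerate(accumulate(hist)) if 2 * c >= total)
    top = bottom = middle
    covered = hist[middle]
    target = coverage * total
    last = len(hist) - 1
    # Grow the band around the middle row until enough ink is covered.
    while covered < target and (top > 0 or bottom < last):
        if top > 0:
            top -= 1
            covered += hist[top]
        if covered < target and bottom < last:
            bottom += 1
            covered += hist[bottom]
    return bottom - top + 1


def zoom_nearest(image, ratio):
    """Scale both dimensions by ratio using nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    new_h = max(1, round(h * ratio))
    new_w = max(1, round(w * ratio))
    rows = [min(int(r / ratio), h - 1) for r in range(new_h)]
    cols = [min(int(c / ratio), w - 1) for c in range(new_w)]
    return [[image[r][c] for c in cols] for r in rows]


def normalize_height(binary, fixed_height=32, coverage=0.9):
    """Return (normalized image, ratio), where
    ratio = fixed_height / major component height."""
    major = major_component_height(binary, coverage)
    ratio = fixed_height / major
    return zoom_nearest(binary, ratio), ratio
```

Measuring the band that holds most of the ink, rather than the full bounding box, means a single tall ascender or stray stroke does not shrink the body of the word after scaling.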

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10025976B1 cover?
Disclosed herein is a method of optimizing data normalization by selecting the best height normalization setting from training RNN (Recurrent Neural Network) with one or more datasets comprising multiple sample images of handwriting data, which comprises estimating a few top place ratios for normalization by minimizing a cost function for any given sample image in the training dataset, and furt…
Who is the assignee on this patent?
Konica Minolta Laboratory USA Inc
What technology area does this patent fall under?
Primary CPC classification G06K9/00409. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 17, 2018 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).