Collaborative text detection and recognition

US9436883B2 · US · B2

Patent metadata
Publication number: US-9436883-B2
Application number: US-201514816943-A
Country: US
Kind code: B2
Filing date: Aug 3, 2015
Priority date: Dec 12, 2013
Publication date: Sep 6, 2016
Grant date: Sep 6, 2016

Abstract

Official abstract text for this publication.

Various embodiments provide methods and systems for identifying text in an image by applying suitable text detection parameters in text detection. The suitable text detection parameters can be determined based on parameter metric feedback from one or more text identification subtasks, such as text detection, text recognition, preprocessing, character set mapping, pattern matching and validation. In some embodiments, the image can be defined into one or more image regions by performing glyph detection on the image. Text detection parameters applying to each of the one or more image regions can be adjusted based on measured one or more parameter metrics in the respective image region.
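The feedback idea in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the patent does not enumerate specific detection parameters, metrics, or update rules, so `DetectionParams`, `RegionMetrics`, the `target` score, and the fixed `step` are hypothetical. The sketch only shows the general shape of adjusting a per-region detection parameter from metrics measured in later subtasks.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DetectionParams:
    # Hypothetical text detection parameters (not listed in the patent).
    contrast_threshold: float = 0.50
    min_glyph_height_px: int = 12


@dataclass(frozen=True)
class RegionMetrics:
    # Hypothetical parameter metrics fed back from downstream subtasks.
    recognition_confidence: float  # 0.0 .. 1.0, from text recognition
    validation_pass_rate: float    # 0.0 .. 1.0, from validation


def adjust_params(params, metrics, target=0.8, step=0.05):
    """Loosen detection in a region whose downstream metrics are poor.

    Assumed rule: if the weakest metric is below `target`, lower the
    contrast threshold by `step` (never below 0.1); otherwise keep it.
    """
    score = min(metrics.recognition_confidence, metrics.validation_pass_rate)
    if score >= target:
        return params
    new_threshold = max(0.1, params.contrast_threshold - step)
    return replace(params, contrast_threshold=round(new_threshold, 2))


def adjust_all_regions(region_params, region_metrics):
    """Apply the per-region adjustment to every image region by its id."""
    return {
        region_id: adjust_params(region_params[region_id], region_metrics[region_id])
        for region_id in region_params
    }
```

In this sketch each image region keeps its own parameter set, so a poorly recognized region can be retried with looser detection while well-recognized regions are left unchanged.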

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving image data captured by a computing device; detecting text in the image data using at least one text detection parameter; determining that the text detected in the image data is recognized text; determining one or more parameter metrics of the image data; generating at least one modified text detection parameter based at least on the determined one or more parameter metrics and sensor data received from the computing device including ambient light, camera gain, and gyro-acceleration stability of the computing device; and applying the at least one modified text detection parameter to the image data.

2. The computer-implemented method of claim 1, further comprising: generating, in response to a quantity of the recognized text being less than a threshold amount within a predetermined time, a notification to disable a text identification function on the computing device.

3. The computer-implemented method of claim 1, further comprising: causing a text identification function on the computing device to be placed in a disabled state in response to a quantity of the recognized text being less than a threshold amount within a predetermined time.

4. The computer-implemented method of claim 1, wherein the one or more parameter metrics are measured in one or more text identification subtasks, the one or more text identification subtasks including text recognition, preprocessing, character set mapping, pattern matching, or validation.

5. The computer-implemented method of claim 1, further comprising: determining one or more image regions in the image data, each of the one or more image regions covered by at least a portion of the text detected in the image data; determining the one or more parameter metrics of the image data that correspond to each of the one or more image regions; adjusting the at least one modified text detection parameter that applies to an image region of the one or more image regions, based on the determined one or more parameter metrics that correspond to the respective image region of the one or more image regions; and applying the adjusted at least one modified text detection parameter to the respective image region of the one or more image regions.

6. The computer-implemented method of claim 5, wherein the one or more image regions include at least one image region from a plurality of image regions comprising: maximally stable extremal regions (MSERs), Harris-affine regions, Hessian-affine regions, Kadir-Brady saliency (KBS) regions, edge-based regions (EBR), and intensity extrema and salient regions.

7. The computer-implemented method of claim 5, wherein the one or more image regions in the image data are determined by performing glyph detection on the image data.

8. The computer-implemented method of claim 7, further comprising: determining that one or more conditions are met, wherein the one or more conditions includes a condition that more than a threshold percentage of an image region, of the one or more image regions in the image data, is covered by a detected glyph.

9. The computer-implemented method of claim 7, further comprising: determining that one or more conditions are met; and assigning a confidence score to the recognized text, wherein the one or more conditions includes a condition that a percentage of detected glyph area covered by the recognized text that has at least a threshold confidence score is larger than a lower limit value.

10. The computer-implemented method of claim 1, further comprising: determining that one or more conditions are met; and determining dominant word height of the text detected in the image data, wherein the one or more conditions includes a condition that the dominant word height is less than a threshold value.

11. The computer-implemented method of claim 1, further comprising: applying the at least one modified text detection parameter to one of subsequent image data received after the image data.

12. A non-transitory computer-readable storage medium including instructions that, when executed by at least one processor of a computing system, cause the computing system to: receive image data captured by a computing device; detect text in the image data using at least one text detection parameter; determine that the text detected in the image data is recognized text; determine one or more parameter metrics of the image data; generate at least one modified text detection parameter based at least on the determined one or more parameter metrics and sensor data received from the computing device including ambient light, camera gain, and gyro-acceleration stability of the computing device; and apply the at least one modified text detection parameter to the image data.

13. The non-transitory computer-readable storage medium of claim 12, wherein the instructions when executed further cause the computing system to: generate, in response to a quantity of the recognized text being less than a threshold amount within a predetermined time, a notification to disable a text identification function on the computing device.

14. The non-transitory computer-readable storage medium of claim 12, wherein the instructions when executed further cause the computing system to: cause a text identification function on the computing device to be placed in a disabled state in response to a quantity of the recognized text being less than a threshold amount within a predetermined time.

15. The non-transitory computer-readable storage medium of claim 12, wherein the one or more parameter metrics are measured in one or more text identification subtasks, the one or more text identification subtasks including text recognition, preprocessing, character set mapping, pattern matching, or validation.

16. The non-transitory computer-readable storage medium of claim 12, wherein the instructions when executed further cause the computing system to: determine one or more image regions in the image data, each of the one or more image regions covered by at least a portion of the text detected in the image data; determine the one or more parameter metrics of the image data that correspond to each of the one or more image regions; adjust the at least one modified text detection parameter that applies to an image region of the one or more image regions, based on the determined one or more parameter metrics that correspond to the respective image region of the one or more image regions; and apply the adjusted at least one modified text detection parameter to the respective image region of the one or more image regions.

17. The non-transitory computer-readable storage medium of claim 16, wherein the one or more image regions in the image data are determined by performing glyph detection on the image data.

18. The non-transitory computer-readable storage medium of claim 17, wherein the instructions when executed further cause the computing system to: determine that one or more conditions are met, wherein the one or more conditions includes a condition that more than a threshold percentage of an image region, of the one or more image regions in the image data, is covered by a detected glyph.

19. A system for identifying text in an image, comprising: at least one processor; and memory including instructions that, when executed by the at least one processor, cause the system to: receive image data captured by a computing device; detect text in the image data using at least one text detec
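As a rough illustration of claims 1 to 3, the Python sketch below combines a parameter metric with the three sensor inputs named in claim 1 (ambient light, camera gain, gyro-acceleration stability) to produce a modified detection parameter, and checks the "too little recognized text within a predetermined time" condition. The claims do not specify any formula, unit, or cutoff, so the function names, the single `threshold` parameter, and every numeric value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorData:
    # Sensor inputs named in claim 1; units and scales are assumptions.
    ambient_light_lux: float
    camera_gain: float
    gyro_stability: float  # assumed scale: 0.0 (shaking) .. 1.0 (still)


def modify_threshold(base, metric_score, sensors):
    """Return a modified detection threshold (hypothetical rule set)."""
    threshold = base
    if metric_score < 0.5:                 # poor downstream metric
        threshold -= 0.10
    if sensors.ambient_light_lux < 50.0:   # dim scene, low contrast likely
        threshold -= 0.05
    if sensors.camera_gain > 4.0:          # high gain, more noise likely
        threshold += 0.05
    if sensors.gyro_stability < 0.5:       # motion blur likely
        threshold -= 0.05
    # Clamp to an assumed valid range and round away float noise.
    return round(min(0.9, max(0.1, threshold)), 2)


def should_disable(recognition_times, now, window_s=10.0, min_recognized=1):
    """True when fewer than `min_recognized` recognitions fall in the window.

    `recognition_times` holds timestamps (seconds) at which text was
    recognized. A caller could respond by notifying the user (claim 2) or
    by disabling the text identification function (claim 3).
    """
    recent = [t for t in recognition_times if now - window_s <= t <= now]
    return len(recent) < min_recognized
```

For example, `modify_threshold(0.5, 0.3, SensorData(20.0, 1.0, 0.9))` lowers the assumed threshold for a dim scene with a poor metric score, while `should_disable([1.0, 2.0], now=20.0)` reports that nothing was recognized in the last ten seconds.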

Assignees

A9 Com Inc

Inventors

Classifications

  • Feature extraction · CPC title

  • G06V30/413 · Primary

    Classification of content, e.g. text, photographs or tables · CPC title

  • of printed characters having additional code marks or containing code marks · CPC title

  • using hand-held instruments; Constructional details of the instruments · CPC title

  • Character recognition · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9436883B2 cover?
Various embodiments provide methods and systems for identifying text in an image by applying suitable text detection parameters in text detection. The suitable text detection parameters can be determined based on parameter metric feedback from one or more text identification subtasks, such as text detection, text recognition, preprocessing, character set mapping, pattern matching and validation…
Who is the assignee on this patent?
A9 Com Inc
What technology area does this patent fall under?
Primary CPC classification G06V30/413. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 6, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).