Using a front-facing camera to improve OCR with a rear-facing camera

US9269009B1 · US · B1

Patent metadata
Publication number: US-9269009-B1
Application number: US-201414283115-A
Country: US
Kind code: B1
Filing date: May 20, 2014
Priority date: May 20, 2014
Publication date: Feb 23, 2016
Grant date: Feb 23, 2016

Abstract

Official abstract text for this publication.

Various embodiments enable a computing device to incorporate frame selection or preprocessing techniques into a text recognition pipeline in an attempt to improve text recognition accuracy in various environments and situations. For example, a mobile computing device can capture images of text using a first camera, such as a rear-facing camera, while capturing images of the environment or a user with a second camera, such as a front-facing camera. Based on the images captured of the environment or user, one or more image preprocessing parameters can be determined and applied to the captured images in an attempt to improve text recognition accuracy.
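One way to picture the step "based on the images captured of the environment or user, one or more image preprocessing parameters can be determined" is the minimal sketch below. It assumes the only environmental condition of interest is overall brightness, estimated as the mean intensity of a grayscale front-camera image. The names `PreprocessParams`, `mean_intensity`, `params_for_environment`, the `bright_cutoff` value, and every numeric parameter are hypothetical and chosen for illustration; the patent does not specify them. The direction of the adjustments (a lower focus requirement and a wider threshold separation in bright conditions) follows claims 7 and 8.

```python
from dataclasses import dataclass
from typing import Sequence

# A grayscale image as rows of 0-255 intensities (illustrative representation).
Gray = Sequence[Sequence[int]]


@dataclass(frozen=True)
class PreprocessParams:
    min_focus: float           # frame selection parameter (hypothetical scale)
    background_threshold: int  # pixels below this count as background
    character_threshold: int   # pixels above this count as character


def mean_intensity(image: Gray) -> float:
    """Average pixel intensity; 0.0 for an empty image."""
    total = sum(sum(row) for row in image)
    count = sum(len(row) for row in image)
    return total / count if count else 0.0


def params_for_environment(front_image: Gray,
                           bright_cutoff: float = 128.0) -> PreprocessParams:
    """Pick parameters from the front-camera image (all values are assumptions)."""
    if mean_intensity(front_image) >= bright_cutoff:
        # Bright environment: lower focus requirement, thresholds far apart.
        return PreprocessParams(min_focus=50.0,
                                background_threshold=60,
                                character_threshold=190)
    # Dim environment: higher focus requirement, thresholds closer together.
    return PreprocessParams(min_focus=100.0,
                            background_threshold=110,
                            character_threshold=150)
```

A real device would likely combine several cues (object matching, facial expression, contrast) rather than a single brightness cutoff; this sketch isolates only the lighting case.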

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor, cause a computing device to: acquire, using a rear-facing camera of the computing device, a plurality of image frames; acquire, using a front-facing camera of the computing device, information corresponding to an environment of the computing device; determine one or more conditions of the environment of the computing device using the information acquired by the front-facing camera of the computing device; select, using at least one frame selection parameter associated with the one or more conditions, a first frame of the plurality of image frames for processing by an optical character recognition (OCR) engine in electronic communication with the computing device; determine at least one threshold for performing binarization of the first frame, wherein the at least one threshold is based on the one or more conditions; binarize, using the at least one threshold, at least a portion of the first frame; and cause the at least a portion of the binarized first frame to be processed using the OCR engine.

2. The non-transitory computer-readable storage medium of claim 1, wherein the instructions that, when executed by the at least one processor, further cause the computing device to binarize the at least a portion of the first frame by: determining a first portion of the first frame having pixel values above a character threshold value to be a character portion; and determining a second portion of the first frame having pixel values below a background threshold value to be a background portion.

3. The non-transitory computer-readable storage medium of claim 1, wherein the instructions that, when executed by the at least one processor, further cause the computing device to select the first frame by: determining a focus measure value or a contrast measure value.

4. The non-transitory computer-readable storage medium of claim 1, wherein the instructions that, when executed by the at least one processor, further cause the computing device to acquire information corresponding to an environment of the computing device by: acquiring an image with the front-facing camera; identifying an object in the image acquired by the front-facing camera; and comparing, using an object matching algorithm, the object to objects stored in a database.

5. A computer-implemented method, comprising: under the control of one or more computer systems configured with executable instructions, acquiring, using a first camera of a computing device, at least one first image; acquiring, using a second camera of the computing device, information corresponding to an environment of the computing device, wherein the second camera faces a different direction than the first camera; determine one or more conditions of the environment using the information acquired by the second camera of the computing device; determining at least one parameter associated with the one or more conditions; performing at least one preprocessing operation associated with the at least one first image, wherein the at least one preprocessing operation includes binarizing at least a portion of each of the at least one first image based upon the one or more conditions; and causing the at least one first image to be processed using an optical character recognition (OCR) engine in electronic communication with at least one of the one or more computer systems, wherein (i) the at least one parameter is used when performing the preprocessing operation or (ii) the at least one parameter is used by the OCR engine.

6. The computer-implemented method of claim 5, wherein the at least one preprocessing operation further includes: determining, based at least in part on the at least one parameter, a background threshold value or a character threshold value for text of the at least one first image, and wherein the binarizing uses the background threshold and the character threshold.

7. The computer-implemented method of claim 6, further comprising: determining that the environment is associated with at least one of high lighting conditions or a high contrast measure based at least in part on the at least one first image; causing, based on a determination that the environment is associated with at least one of high lighting conditions or a high contrast measure based at least in part on the at least one first image, the at least one parameter to include a first focus measure, wherein the first focus measure is lower than a second focus measure, the second focus measure associated with low lighting conditions or image frames having a low contrast measure; and causing, based on the determination, the background threshold to be defined at a first intensity value and the character threshold to be defined at a second intensity value, wherein the first intensity value and second intensity value are separated by a first difference, wherein the first difference is greater than a second difference, the second difference associated with low lighting conditions or image frames having low contrast measure.

8. The computer-implemented method of claim 5, further comprising: determining, based on the at least one first image, that the environment is associated with at least one of a low lighting condition or a low contrast measure; and causing, based on a determination that the environment is associated with at least one of a low lighting condition or a low contrast measure based at least in part on the at least one first image, the at least one parameter to include a first focus measure, wherein the first focus measure is higher than a second focus measure, the second focus measure associated with a high lighting condition or high contrast environment.

9. The computer-implemented method of claim 8, further comprising: causing, based on a determination that the environment is associated with at least one of a low lighting condition or a low contrast measure based at least in part on the at least one first image, a set of additional images to be processed relative to the low lighting condition or the low contrast environment.

10. The computer-implemented method of claim 5, further comprising: analyzing the information corresponding to the environment to identify objects captured by the second camera; and comparing, using an object matching algorithm, the objects captured by the second camera to objects stored in a database, wherein at least a portion of the objects stored in the database are associated with one of a plurality of environments.

11. The computer-implemented method of claim 5, wherein the information corresponding to the environment captured by the second camera is a representation of a face and the at least one parameter indicates a facial expression.

12. The computer-implemented method of claim 11, further comprising: displaying a first recognition result from the OCR engine for the at least one first image; determining, based at least in part on a first facial expression, dissatisfaction with the first recognition result; acquiring, using the first camera, at least one second image; and causing the at least one second image to be processed using the OCR engine.

13. The computer-implemented method of claim 12, further comprising: displaying a second recognition result for the at least one second image; acquiring, using the second camera, information corresponding to a second facial expression; and determining, based at least in part on the second facial expression, satisfaction with the second recognition result.

14. The computer-
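The frame selection and binarization steps recited in claims 1 through 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not say how a focus measure is computed, so the sketch substitutes a simple gradient-energy measure (mean squared difference between horizontally adjacent pixels). The function names `focus_measure`, `select_frame`, and `binarize` are hypothetical, frames are plain lists of 0-255 grayscale rows, and the threshold arguments stand in for values that would be derived from the front-camera conditions.

```python
from typing import List, Optional, Sequence

# A grayscale frame as rows of 0-255 intensities (illustrative representation).
Gray = Sequence[Sequence[int]]


def focus_measure(frame: Gray) -> float:
    """Assumed sharpness proxy: mean squared horizontal pixel difference."""
    diffs = [(row[i + 1] - row[i]) ** 2
             for row in frame
             for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs) if diffs else 0.0


def select_frame(frames: Sequence[Gray], min_focus: float) -> Optional[Gray]:
    """Return the sharpest frame that meets min_focus, or None if none does."""
    best = max(frames, key=focus_measure, default=None)
    if best is None or focus_measure(best) < min_focus:
        return None
    return best


def binarize(frame: Gray,
             background_threshold: int,
             character_threshold: int) -> List[List[Optional[int]]]:
    """Label pixels per claim 2: 1 = character, 0 = background, None = undecided."""
    def label(pixel: int) -> Optional[int]:
        if pixel > character_threshold:
            return 1   # above the character threshold
        if pixel < background_threshold:
            return 0   # below the background threshold
        return None    # between the two thresholds; left unresolved here
    return [[label(p) for p in row] for row in frame]
```

Pixels that fall between the two thresholds are left as `None` in this sketch; how such pixels are resolved is not stated in the claim text shown here.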

Assignees

Amazon Tech Inc

Inventors

Classifications

  • G06V30/127 (Primary)

    with the intervention of an operator · CPC title

  • of printed characters having additional code marks or containing code marks · CPC title

  • Character recognition · CPC title

  • G06K9/18 (Primary)

    Physics · mapped topic

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9269009B1 cover?
Various embodiments enable a computing device to incorporate frame selection or preprocessing techniques into a text recognition pipeline in an attempt to improve text recognition accuracy in various environments and situations. For example, a mobile computing device can capture images of text using a first camera, such as a rear-facing camera, while capturing images of the environment or a use…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06V30/127. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 23, 2016 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).