Apparatus, method, and computer readable medium for recognizing text on a curved surface
US-9213911-B2 · Dec 15, 2015 · US
US9911361B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9911361-B2 |
| Application number | US-201314136876-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 20, 2013 |
| Priority date | Mar 10, 2013 |
| Publication date | Mar 6, 2018 |
| Grant date | Mar 6, 2018 |
Official abstract text for this publication.
An apparatus is provided for audibly reading text retrieved from a captured image. In one implementation, the apparatus comprises an image sensor configured to capture image data from an environment of a user, and at least one processor. The processor is configured to determine an existence of a pointing trigger in the image data, the trigger being associated with a user's desire to hear text read aloud, and wherein the trigger identifies an intermediate portion of the text a distance from a level break in the text. The processor is further configured to perform a layout analysis on the text to identify a level break associated with the trigger; and cause the text to be read aloud from the level break associated with the trigger.
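The level-break lookup described in the abstract can be illustrated with a minimal sketch. It assumes the text has already been recognized as a plain string, that paragraphs are separated by blank lines, and that sentences end in `.`, `!`, or `?`. The function name `find_level_break` and its parameters are hypothetical and do not come from the patent, which does not disclose a specific implementation here.

```python
import re


def find_level_break(text: str, pointed_offset: int, level: str = "paragraph") -> int:
    """Return the offset of the closest level break at or before pointed_offset.

    Hypothetical helper: `pointed_offset` stands in for the intermediate
    portion of the text identified by the pointing trigger.
    """
    if not 0 <= pointed_offset <= len(text):
        raise ValueError("pointed_offset lies outside the text")
    if level == "paragraph":
        # Assumption: a blank line separates paragraphs.
        pattern = r"\n\s*\n"
    elif level == "sentence":
        # Assumption: terminal punctuation followed by whitespace ends a sentence.
        pattern = r"[.!?]\s+"
    else:
        raise ValueError(f"unsupported level: {level}")
    # Offset 0 is always a break; every separator match ends where a new unit starts.
    starts = [0] + [m.end() for m in re.finditer(pattern, text)]
    return max(s for s in starts if s <= pointed_offset)


sample = "First para. Still first.\n\nSecond para starts. The user points here."
pointed = sample.index("points")
paragraph_start = find_level_break(sample, pointed, "paragraph")
sentence_start = find_level_break(sample, pointed, "sentence")
```

Reading would then begin at `sample[paragraph_start:]` or `sample[sentence_start:]`, depending on which level break is selected.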
Opening claim text (preview).
What is claimed is:
1. An apparatus for audibly reading text retrieved from a captured image, the apparatus comprising: an image sensor configured to capture image data from an environment of a user; and at least one processor configured to: determine an existence of a trigger in the image data, the trigger being associated with a user's desire to hear text read aloud, and wherein the trigger identifies an intermediate portion of the text a distance from a level break in the text; perform a layout analysis on the text to identify a level break associated with the trigger; perform an optical character recognition (OCR) only on a subset of the text in the image data associated with the trigger prior to causing the subset of the text to be read aloud; cause the subset of the text to be read aloud from the level break associated with the trigger; and while the subset of the text is being read aloud, anticipate a subsequent subset of the text to be read aloud and perform an OCR of the subsequent subset of the text in advance.
2. The apparatus of claim 1, wherein the trigger includes an identification of text within a specific paragraph and wherein the level break is a beginning of a sequential paragraph associated with the specific paragraph.
3. The apparatus of claim 1, wherein the trigger includes an identification of text within a specific paragraph and wherein the level break is at least one of the following: a beginning of the specific paragraph, a beginning of a specific sentence, a beginning of a specific column, a beginning of a specific page, an end of the specific paragraph, an end of the specific sentence, an end of the specific column, and an end of the specific page.
4. The apparatus of claim 1, wherein the trigger includes an identification of at least two intermediate portions of the text each a distance from a different level break in the text, and wherein the at least one processor device is further configured to select a level break associated with the trigger and cause the text to be read aloud from the selected level break.
5. The apparatus of claim 4, wherein the at least one processor device is further configured to select a level break based on context information.
6. The apparatus of claim 1, wherein the at least one processor device is further configured to begin reading aloud the text prior to completion of a full OCR of the text, and to continue performance of the OCR while reading aloud is occurring.
7. The apparatus of claim 6, wherein the at least one processor device is further configured to begin reading aloud within less than 4 seconds of initiation of the OCR.
8. The apparatus of claim 6, wherein the at least one processor device is further configured to begin reading aloud within less than 3 seconds of initiation of the OCR.
9. The apparatus of claim 6, wherein the at least one processor device is further configured to begin reading aloud within less than 1 second of initiation of the OCR.
10. The apparatus of claim 1, wherein the image sensor is further configured to capture the image data in various resolutions.
11. The apparatus of claim 10, wherein the at least one processor device is further configured to operate in a low power consumption mode by performing the layout analysis on image data taken at a resolution lower than a resolution of the image data used for performing the OCR.
12. A method for audibly reading text retrieved from a captured image, the method comprising: capturing real time image data from an environment of a user; determining an existence of a trigger in the image data, the trigger being associated with a desire of the user to hear text read aloud, and wherein the trigger identifies an intermediate portion of the text a distance from a level break in the text; performing a layout analysis on the text to identify the level break associated with the trigger; performing an optical character recognition (OCR) only on a subset of the text in the image data associated with the trigger prior to causing the subset of the text to be read aloud; reading aloud the subset of the text beginning from the level break associated with the trigger; and while the subset of the text is being read aloud, anticipating a subsequent subset of the text to be read aloud and performing an OCR of the subsequent subset of the text in advance.
13. A software product stored on a non-transitory computer readable medium and comprising data and computer implementable instructions for carrying out the method of claim 12.
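The anticipatory OCR recited in claims 1 and 12 (recognizing the next subset of text while the current one is being read aloud) can be sketched as a simple two-stage pipeline. This is an illustrative sketch only, not the patented implementation: `read_aloud_with_prefetch`, `ocr`, and `speak` are hypothetical names, the OCR and speech engines are passed in as plain callables, and the image regions are assumed to be already ordered from the level break onward.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

Region = TypeVar("Region")


def read_aloud_with_prefetch(
    regions: Sequence[Region],
    ocr: Callable[[Region], str],
    speak: Callable[[str], None],
) -> List[str]:
    """Speak each region in order while OCR of the following region runs in the background."""
    spoken: List[str] = []
    if not regions:
        return spoken
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Only the first subset is recognized before any speech starts.
        pending = pool.submit(ocr, regions[0])
        for upcoming in list(regions[1:]) + [None]:
            text = pending.result()
            if upcoming is not None:
                # Queue OCR of the anticipated next subset before speaking,
                # so recognition overlaps with the (blocking) speech call.
                pending = pool.submit(ocr, upcoming)
            speak(text)
            spoken.append(text)
    return spoken


# Stand-in engines for demonstration; a real device would call its OCR and TTS here.
heard: List[str] = []
result = read_aloud_with_prefetch(
    ["para one", "para two", "para three"],
    ocr=lambda region: region.upper(),
    speak=heard.append,
)
```

Because only the first region must be recognized before speech begins, the delay before audio starts depends on the size of that first subset rather than on the whole page, which is the behavior claims 6 to 9 bound in seconds.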
including functional features of a camera · CPC title
using audible presentation of the information · CPC title
Devices or methods enabling eye-patients to replace direct visual perception by another kind of perception · CPC title
Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00) · CPC title
using visual presentation of the information for the partially sighted · CPC title