Object detection and image classification based optical character recognition

US11727697B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11727697-B2
Application number: US-202117364343-A
Country: US
Kind code: B2
Filing date: Jun 30, 2021
Priority date: Jan 27, 2020
Publication date: Aug 15, 2023
Grant date: Aug 15, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system performs optical character recognition (OCR) on an image displaying a portion of an object. An image classification system identifies the object in the image, based on which one or more object detection models identify labels associated with the object within the image. The system determines text of the identified labels using OCR, and analyzes the OCR resultant text for discrepancies and/or inaccuracies. In response to identifying a discrepancy, the system provides a recommendation for improving the accuracy of the OCR resultant text.
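The discrepancy analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes two kinds of checks, modeled on claims 7 and 8 further down this page: a data-type check and a check against a predetermined list of expected values. The names `ExpectedPhrase`, `find_discrepancies`, and `recommend` are hypothetical and do not come from the patent, as is the choice to treat a missing phrase as a discrepancy and to represent OCR text as a key/value dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass(frozen=True)
class ExpectedPhrase:
    """Hypothetical description of one phrase expected on a label type."""
    key: str
    parser: Callable[[str], object]          # expected data type, e.g. int or float
    allowed: Optional[Sequence[str]] = None  # optional list of expected values


def find_discrepancies(text: Dict[str, str],
                       expected: Sequence[ExpectedPhrase]) -> List[str]:
    """Return human-readable discrepancies between OCR text and expectations."""
    problems: List[str] = []
    for phrase in expected:
        value = text.get(phrase.key)
        if value is None:
            # Assumption: a phrase that OCR did not find counts as a discrepancy.
            problems.append(f"{phrase.key}: missing")
            continue
        try:
            phrase.parser(value)  # data-type check (cf. claim 7)
        except ValueError:
            type_name = getattr(phrase.parser, "__name__", "value")
            problems.append(f"{phrase.key}: {value!r} is not a valid {type_name}")
            continue
        if phrase.allowed is not None and value not in phrase.allowed:
            # expected-value check (cf. claim 8)
            problems.append(f"{phrase.key}: {value!r} is not an expected value")
    return problems


def recommend(problems: Sequence[str]) -> Optional[str]:
    """Return a recommendation only when at least one discrepancy was found."""
    return "Capture a new image of the label." if problems else None


if __name__ == "__main__":
    expected = [ExpectedPhrase("voltage", int),
                ExpectedPhrase("unit", str, allowed=("V", "kV"))]
    ocr_text = {"voltage": "24O", "unit": "V"}  # letter O misread for zero
    issues = find_discrepancies(ocr_text, expected)
    print(issues, recommend(issues))
```

Passing the parser as a callable keeps the type check generic: any function that raises `ValueError` on bad input can stand in for an expected data type.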

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method for optical character recognition, comprising: receiving an image displaying one or more labels; detecting the one or more labels by providing the image as input to one or more object detection models, wherein an object detection model is configured to identify a label of a label type in an input image; for a detected label in the image: determining a text of the detected label using optical character recognition; accessing expected phrase types for phrases of a label type of the detected label; analyzing the text based on an expected phrase type to identify discrepancies in the text; responsive to identifying a discrepancy in a text of a particular label, sending a recommendation for improving the image.

2. The computer implemented method of claim 1, further comprising: identifying a plurality of zones of the received image; for a zone from the plurality of zones, predicting a value of the text; and responsive to identifying differences in predicted values of the text, using a predicted value of the text having the highest number of occurrences as the value of the text.

3. The computer implemented method of claim 2, the method further comprising: responsive to determining that two predicted values of the text have the highest number of occurrences, combining at least two of the plurality of zones to obtain a tiebreaker zone; and determining the value of the text to be a predicted value of the text for the tiebreaker zone.

4. The computer implemented method of claim 3, the method further comprising: determining a size of a zone from the plurality of zones; creating multiple copies of the image; cropping each copy of the image to obtain cropped images such that each cropped image corresponds to a zone; and predicting text values for the zone corresponding to each cropped image.

5. The computer implemented method of claim 3, the method further comprising: responsive to determining that the value of the text is not the predicted value of the text for the tiebreaker zone, generating the recommendation for improving an accuracy of the text of the label determined using optical character recognition.

6. The computer implemented method of claim 5, wherein the recommendation comprises capturing a new image.

7. The computer implemented method of claim 1, wherein identifying a discrepancy in a text comprises determining that an expected data type of a value in the expected phrase type fails to match an actual data type of the value in the text.

8. The computer implemented method of claim 1, wherein identifying a discrepancy in a text comprises comparing a value in the text against a predetermined list of expected values for the text.

9. A computer readable non-transitory storage medium storing instructions that when executed by a computer processor, cause the computer processor to perform steps comprising: receiving an image displaying one or more labels; detecting the one or more labels by providing the image as input to one or more object detection models, wherein an object detection model is configured to identify a label of a label type in an input image; for a detected label in the image: determining a text of the detected label using optical character recognition; accessing expected phrase types for phrases of a label type of the detected label; analyzing the text based on an expected phrase type to identify discrepancies in the text; responsive to identifying a discrepancy in a text of a particular label, sending a recommendation for improving the image.

10. The computer readable non-transitory storage medium of claim 9, wherein the instructions cause the computer processor to perform steps comprising: identifying a plurality of zones of the image; for a zone from the plurality of zones, predicting a value of the text; and responsive to identifying differences in predicted values of the text, using a predicted value of the text having the highest number of occurrences as the value of the text.

11. The computer readable non-transitory storage medium of claim 10, wherein the instructions cause the computer processor to perform steps comprising: responsive to determining that two predicted values of the text have the highest number of occurrences, combining at least two of the plurality of zones to obtain a tiebreaker zone; and determining the value of the text to be a predicted value of the text for the tiebreaker zone.

12. The computer readable non-transitory storage medium of claim 11, wherein the instructions cause the computer processor to perform steps comprising: determining a size of a zone from the plurality of zones; creating multiple copies of the image; cropping each copy of the image to obtain cropped images such that each cropped image corresponds to a zone; and predicting text values for the zone corresponding to each cropped image.

13. The computer readable non-transitory storage medium of claim 11, wherein the instructions cause the computer processor to perform steps comprising: responsive to determining that the value of the text is not the predicted value of the text for the tiebreaker zone, generating the recommendation for improving an accuracy of the text of the label determined using optical character recognition.

14. The computer readable non-transitory storage medium of claim 13, wherein the recommendation comprises capturing a new image.

15. The computer readable non-transitory storage medium of claim 9, wherein identifying a discrepancy in a text comprises determining that an expected data type of a value in the expected phrase type fails to match an actual data type of the value in the text.

16. The computer readable non-transitory storage medium of claim 9, wherein identifying a discrepancy in a text comprises comparing a value in the text against a predetermined list of expected values for the text.

17. A computer system comprising: a computer processor; and a computer readable non-transitory storage medium storing instructions that when executed by the computer processor, cause the computer processor to perform steps comprising: receiving an image displaying one or more labels; detecting the one or more labels by providing the image as input to one or more object detection models, wherein an object detection model is configured to identify a label of a label type in an input image; for a detected label in the image: determining a text of the detected label using optical character recognition; accessing expected phrase types for phrases of a label type of the detected label; analyzing the text based on an expected phrase type to identify discrepancies in the text; responsive to identifying a discrepancy in a text of a particular label, sending a recommendation for improving the image.

18. The computer system of claim 17, wherein the instructions cause the computer processor to perform steps comprising: identifying a plurality of zones of the image; for a zone from the plurality of zones, predicting a value of the text; and responsive to identifying differences in predicted values of the text, using a predicted value of the text having the highest number of occurrences as the value of the text.

19. The computer system of claim 18, wherein the instructions cause the computer processor to perform steps comprising: responsive to determining that two predicted values of the text have the highest number of occurrences, combining at least two of the plurality of zones to ob
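The zone-based voting in claims 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: zones are represented as hypothetical `(left, top, right, bottom)` boxes, `predict` stands in for whatever OCR call produces text for a zone, and "combining" zones is taken to mean their joint bounding box. The claims do not say which zones to combine on a tie; this sketch picks the first zone that produced each tied value.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical zone representation: (left, top, right, bottom) in pixels.
Zone = Tuple[int, int, int, int]


def merge_zones(a: Zone, b: Zone) -> Zone:
    """Bounding box covering both zones (one possible way to combine zones)."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def vote_on_text(zones: Sequence[Zone], predict: Callable[[Zone], str]) -> str:
    """Pick the most frequent per-zone prediction; break ties with a merged zone."""
    if not zones:
        raise ValueError("at least one zone is required")
    predictions = [predict(zone) for zone in zones]
    ranked = Counter(predictions).most_common()
    top_value, top_count = ranked[0]
    tied = [value for value, count in ranked if count == top_count]
    if len(tied) == 1:
        return top_value  # clear majority (cf. claim 2)
    # Tie (cf. claim 3): combine the first zones that produced the tied values
    # into a tiebreaker zone and use its prediction.
    first = zones[predictions.index(tied[0])]
    second = zones[predictions.index(tied[1])]
    return predict(merge_zones(first, second))


if __name__ == "__main__":
    # Stand-in for an OCR call; maps each zone to a fixed prediction.
    fake_ocr = {
        (0, 0, 10, 10): "A12",
        (10, 0, 20, 10): "Al2",
        (0, 0, 20, 10): "A12",
    }
    print(vote_on_text([(0, 0, 10, 10), (10, 0, 20, 10)], fake_ocr.__getitem__))
```

A caller could compare the voted value with the tiebreaker prediction and, when they differ, issue a recommendation such as capturing a new image, in the spirit of claims 5 and 6.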

Assignees

Salesforce Com Inc, Salesforce Inc

Inventors

Classifications

  • G06V20/63 · Primary

    Scene text, e.g. street names · CPC title

  • Classification techniques · CPC title

  • Classification of content, e.g. text, photographs or tables · CPC title

  • Character recognition · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11727697B2 cover?
A system performs optical character recognition (OCR) on an image displaying a portion of an object. An image classification system identifies the object in the image, based on which one or more object detection models identify labels associated with the object within the image. The system determines text of the identified labels using OCR, and analyzes the OCR resultant text for discrepancies …
Who is the assignee on this patent?
Salesforce Com Inc, Salesforce Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/63. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 15, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).