Font recognition using text localization

US10467508B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10467508-B2
Application number: US-201815962514-A
Country: US
Kind code: B2
Filing date: Apr 25, 2018
Priority date: Oct 6, 2015
Publication date: Nov 5, 2019
Grant date: Nov 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Font recognition and similarity determination techniques and systems are described. In a first example, localization techniques are described to train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention, e.g., without specifying any of the edges of a bounding box. In a second example, a deep neural network is directly learned as an embedding function of a model that is usable to determine font similarity. In a third example, techniques are described that leverage attributes described in metadata associated with fonts as part of font recognition and similarity determinations.
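The second example in the abstract (an embedding function used to determine font similarity) can be illustrated with a minimal sketch. It assumes each font has already been mapped to a fixed-length vector by some trained network. The embedding values, the font names, the choice of cosine similarity, and the helper names `cosine_similarity` and `rank_similar_fonts` are all hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def rank_similar_fonts(query_embedding, catalog):
    """Sort catalog fonts from most to least similar to the query.

    `catalog` maps a font name to its embedding vector. In a real system
    the vectors would come from the learned embedding function; here they
    are placeholders (assumption).
    """
    scored = [
        (name, cosine_similarity(query_embedding, vector))
        for name, vector in catalog.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical two-dimensional embeddings, for illustration only.
catalog = {
    "FontA": [1.0, 0.0],
    "FontB": [0.0, 1.0],
    "FontC": [0.9, 0.1],
}
ranking = rank_similar_fonts([1.0, 0.0], catalog)
```

With these placeholder vectors the ranking lists FontA first, then FontC, then FontB. Any other distance over the embedding space (Euclidean, for instance) could be substituted without changing the structure of the sketch.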

First claim

Opening claim text (preview).

What is claimed is:

1. In a digital medium environment to improve image font recognition through use of text localization, a method implemented by one or more computing devices comprising: obtaining a model, by the one or more computing devices, that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font; predicting a bounding box, automatically and without user intervention by the one or more computing devices, for text in an image received using the obtained model by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image by the model independently, one to another, the text overlapping a first and second cropped portion of the plurality of cropped portions; and generating an indication of the predicted bounding box by the one or more computing devices based on a result of the processing of each of the plurality of cropped portions of the image by calculating an average or a median of a top and bottom line of the predicted bounding box, the indication usable to specify a region of the image that includes the text having a font to be recognized.

2. The method as described in claim 1, further comprising recognizing the font of the text in the received image by the one or more computing devices using the generated indication of the predicted bounding box.

3. The method as described in claim 1, wherein the predicting includes processing each of the plurality of cropped portions of the image by a trained convolutional network of the model independently, one to another.

4. The method as described in claim 1, wherein the predicting includes resizing the image by the one or more computing devices to correspond to an image size of the model.

5. The method as described in claim 1, further comprising training the model by the one or more computing devices using the machine learning for a plurality of iterations.

6. The method as described in claim 5, wherein the training is performed for at least one of the plurality of iterations using the plurality of training images having text rendered using the corresponding font and performed for one or more subsequent ones of the plurality of iterations in which one or more perturbations are introduced to the training images.

7. The method as described in claim 6, wherein the perturbations includes at least one of noise, rotation, scale, shading, rotation, kerning, or cropping.

8. The method as described in claim 5, wherein the machine learning is performed by the one or more computing devices using a convolutional neural network, the convolutional neural network is used as an architecture of the machine learning by the one or more computing devices and stochastic gradient decent is used as a training algorithm of the machine learning by the one or more computing devices.

9. The method as described in claim 1, wherein the font to be recognized in the image is arbitrary such that the model is trainable without using the font.

10. In a digital medium environment to improve image font recognition through use of text localization, a system comprising: a text localization module implemented at least partially in hardware of at least one computing device to obtain a model that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font; a machine learning module implemented at least partially in the hardware of the at least one computing device to predict a bounding box, automatically and without user intervention, for text in an image by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image independently, one to another, the text overlapping a first and second cropped portion of the plurality of cropped portions; and the text localization module further implemented at least partially in the hardware of the at least one computing device to generate an indication of the predicted bounding box based on a result of the processing of each of the plurality of cropped portions of the image by calculating an average or a median of a top and bottom line of the predicted bounding box, the indication is usable to specify a region of the image that includes the text having a font to be recognized.

11. The system as described in claim 10, wherein the font to be recognized in the image is arbitrary.

12. The system as described in claim 10, further comprising a font similarity and recognition module implemented at least partially in the hardware of the at least one computing device to recognize the font of the text in the image.

13. The system as described in claim 10, wherein the plurality of training images are organized as tuples to minimize a hinge loss function.

14. The system as described in claim 10, wherein at least one training image of the plurality of training images is samples with a probability distribution that includes a normalization factor.

15. In a digital medium environment to improve image font recognition through use of text localization, a system comprising: means for obtaining a model that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font; means for predicting a bounding box, automatically and without user intervention, for text in an image received using the obtained model by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image by the model independently, one to another, the text overlapping a first and second cropped portion of the plurality of cropped portions; and means for generating an indication of the predicted bounding box based on a result of the processing of each of the plurality of cropped portions of the image by calculating an average or a median of a top and bottom line of the predicted bounding box, the indication is usable to specify a region of the image that includes the text having a font to be recognized.

16. The system as described in claim 15, further comprising means for recognizing the font in the image.

17. The system as described in claim 15, wherein the font to be recognized in the image is arbitrary such that the model is trainable without using the font.

18. The system as described in claim 15, further comprising means for generating the predicted bounding box.

19. The system as described in claim 15, wherein the plurality of training images are organized as tuples to minimize a hinge loss function.

20. The system as described in claim 15, wherein at least one training image of the plurality of training images is samples with a probability distribution that includes a normalization factor.
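The localization step recited in the independent claims (form several cropped portions, process each one independently, then average or take the median of the predicted top and bottom lines) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the crops are assumed to be overlapping horizontal windows, `predict_lines` is a hypothetical stand-in for the trained model, and the names `make_crops`, `aggregate_box`, and `localize_text` are invented for this example.

```python
from statistics import mean, median


def make_crops(image_width, crop_width, stride):
    """Return (left, right) pixel ranges of overlapping horizontal crops.

    Assumption: crops slide left to right; the claims do not fix a layout.
    """
    if crop_width >= image_width:
        return [(0, image_width)]
    lefts = list(range(0, image_width - crop_width + 1, stride))
    last_left = image_width - crop_width
    if lefts[-1] != last_left:
        lefts.append(last_left)  # make sure the right edge is covered
    return [(left, left + crop_width) for left in lefts]


def aggregate_box(per_crop_lines, use_median=False):
    """Combine per-crop (top, bottom) predictions into one pair of lines."""
    reduce = median if use_median else mean
    tops = [top for top, _ in per_crop_lines]
    bottoms = [bottom for _, bottom in per_crop_lines]
    return reduce(tops), reduce(bottoms)


def localize_text(image_width, crop_width, stride, predict_lines,
                  use_median=False):
    """Run the (hypothetical) model on each crop independently, then aggregate."""
    crops = make_crops(image_width, crop_width, stride)
    per_crop = [predict_lines(crop) for crop in crops]
    return aggregate_box(per_crop, use_median)


def fake_model(crop):
    """Placeholder predictor standing in for a trained network (assumption)."""
    left, _ = crop
    return 10 + left // 30, 50


crops = make_crops(100, 40, 30)
box_mean = aggregate_box([(10, 50), (12, 52), (20, 60)])
box_median = aggregate_box([(10, 50), (12, 52), (20, 60)], use_median=True)
box_from_model = localize_text(100, 40, 30, fake_model)
```

Because each crop is scored on its own, a single outlying prediction shifts the mean but barely moves the median, which is one plausible reason the claims allow either statistic.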

Assignees

Adobe Inc

Inventors

Classifications

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • Classification techniques · CPC title

  • using neural networks · CPC title

  • Combinations of networks · CPC title

  • Distances to cluster centroïds · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06V30/245. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 5, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).