Formula recognition method and apparatus

US12511892B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12511892-B2
Application number  US-202318300031-A
Country             US
Kind code           B2
Filing date         Apr 13, 2023
Priority date       Mar 25, 2021
Publication date    Dec 30, 2025
Grant date          Dec 30, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A formula recognition method and apparatus, a computer-readable medium, and an electronic device. The formula recognition method includes acquiring a target image including a formula, processing the target image to obtain a global image feature and a local image feature, and processing the global image feature and the local image feature to obtain the formula included in the target image.
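
As a rough illustration of how one image can yield both a global and a local feature, the sketch below follows the idea in dependent claim 2: the global feature passes through all N pooling layers, while the local feature passes through only some of them and so keeps more spatial detail. This is a minimal sketch under stated assumptions, not the patented implementation. The M convolutional layers are not modeled (their output is taken as a given 2-D feature map), pooling is assumed to be 2x2 max pooling, and the names `max_pool_2x2`, `extract_features`, `n_pool`, and `n_pool_local` are hypothetical.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list (even height and width assumed)."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]), 2)
        ]
        for i in range(0, len(feature_map), 2)
    ]


def extract_features(conv_output, n_pool, n_pool_local):
    """Return (global_feature, local_feature) from one convolutional output.

    Assumption: the global feature uses all n_pool pooling layers, and the
    local feature uses only the first n_pool_local of them.
    """
    if not 1 <= n_pool_local <= n_pool:
        raise ValueError("n_pool_local must be between 1 and n_pool")
    global_feature = conv_output
    local_feature = conv_output
    for layer in range(n_pool):
        global_feature = max_pool_2x2(global_feature)
        if layer < n_pool_local:
            local_feature = max_pool_2x2(local_feature)
    return global_feature, local_feature


# Toy 4x4 "feature map" standing in for the output of the convolutional layers.
toy_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
global_feature, local_feature = extract_features(toy_map, n_pool=2, n_pool_local=1)
print(global_feature)  # [[16]]
print(local_feature)   # [[6, 8], [14, 16]]
```

With two pooling layers the global feature collapses to a single value summarizing the whole map, while the local feature, pooled once, still has one value per 2x2 region.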

First claim

Opening claim text (preview).

What is claimed is:

1. A formula recognition method, performed by an electronic device, comprising: acquiring a target image comprising a formula to be recognized; processing the target image to obtain a global image feature and a local image feature, the global image feature representing a feature of the target image overall, and the local image feature representing a feature of a portion of the target image less than the target image overall; and processing the global image feature and the local image feature to obtain the formula comprised in target image, wherein the processing the global image feature and the local image feature is implemented by a decoder, and the decoder comprises: a first gate recurrent unit (GRU) layer and a second GRU layer, wherein, at a first decoding moment, input of the first GRU layer comprises: the global image feature, the local image feature, and a hidden vector obtained by the first GRU layer at a second decoding moment, wherein the second decoding moment is a previous decoding moment of the first decoding moment, and the hidden vector obtained by the first GRU layer at the second decoding moment indicates undecoded content in the global image feature and the local image feature at the first decoding moment; and input of the second GRU layer comprises: the global image feature, the local image feature, and a hidden vector outputted by the first GRU layer at the first decoding moment; and output of the decoder is a decoding result obtained by the second GRU layer at the last decoding moment.

2. The formula recognition method according to claim 1, wherein the processing the target image comprises: using M convolutional layers and N pooling layers of a convolutional neural network to process the target image to obtain the global image feature, wherein both M and N are integers greater than or equal to 1; and using the M convolutional layers and at least one of the N pooling layers to process the target image to obtain the local image feature.

3. The formula recognition method according to claim 2, wherein the convolutional neural network is a DenseNet.

4. The formula recognition method according to claim 1, wherein at the first decoding moment, the input of the second GRU layer further comprises a second hidden vector obtained by the second GRU layer at the second decoding moment, wherein the hidden vector obtained by the second GRU layer at the second decoding moment indicates undecoded content in the global image feature and the local image feature at the first decoding moment.

5. The formula recognition method according to claim 1, wherein the decoder is a decoder in a Transformer model.

6. The formula recognition method according to claim 1, wherein the acquiring a target image comprises: acquiring an original image comprising the formula, and removing redundant information and/or noise interference in the original image to obtain the target image.

7. The formula recognition method according to claim 1, further comprising: acquiring a training image comprising a training formula; and using the training image and annotation information of the training image to obtain a formula recognition model through training, the annotation information of the training image indicating the training formula comprised in the training image and the formula recognition model is configured to recognize the formula in target image.

8. A formula recognition apparatus, comprising: at least one memory configured to store program code; at least one processor configured to read the program code and operate as instructed by the program code, the program code comprising: first acquisition code configured to cause the at least one processor to acquire a target image comprising a formula to be recognized; first processing code configured to cause the at least one processor to process the target image to obtain a global image feature and a local image feature, the global image feature representing a feature of the target image overall, and the local image feature representing a feature of a portion of the target image less than the target image overall; second processing code configured to cause the at least one processor to process the global image feature and the local image feature to obtain the formula comprised in the target image; and decoder code configured to cause the at least one processor to implement a first gate recurrent unit (GRU) layer and a second GRU layer, wherein, at a first decoding moment: input of the first GRU layer comprises: the global image feature, the local image feature, and a hidden vector obtained by the first GRU layer at a second decoding moment, wherein the second decoding moment is a previous decoding moment of the first decoding moment, and the hidden vector obtained by the first GRU layer at the second decoding moment indicates undecoded content in the global image feature and the local image feature at the first decoding moment; and input of the second GRU layer comprises: the global image feature, the local image feature, and a hidden vector outputted by the first GRU layer at the first decoding moment; and output of the decoder is a decoding result obtained by the second GRU layer at the last decoding moment.

9. The formula recognition apparatus according to claim 8, wherein the first processing code is configured to cause the at least one processor to: use M convolutional layers and N pooling layers of a convolutional neural network to process the target image to obtain the global image feature, wherein both M and N are integers greater than or equal to 1; and use the M convolutional layers and at least one of the N pooling layers to process the target image to obtain the local image feature.

10. The formula recognition apparatus according to claim 9, wherein the convolutional neural network is a DenseNet.

11. The formula recognition apparatus according to claim 8, wherein at the first decoding moment, the input of the second GRU layer further comprises: a hidden vector obtained by the second GRU layer at the second decoding moment, wherein the hidden vector obtained by the second GRU layer at the second decoding moment indicates undecoded content in the global image feature and the local image feature at the first decoding moment.

12. The formula recognition apparatus according to claim 8, wherein the decoder code is a decoder in a Transformer model.

13. The formula recognition apparatus according to claim 8, wherein the first acquisition code is further configured to cause the at least one processor to: acquire an original image comprising the formula, removing redundant information and/or noise interference in the original image to obtain the target image.

14. The formula recognition apparatus according to claim 13, wherein the program code further comprises: second acquisition code configured to cause the at least one processor to acquire a training image comprising a training formula; and training code configured to cause the at least one processor to use the training image and annotation information of the training image to obtain a formula recognition model through training, the annotation information of the training image indicating the training formula comprised in the training image and the formula recognition model is configured to recognize the formula in the target image.

15. A non-transitory computer-readable storage medium, storing computer code that when executed by at least one processor causes the at least one processor to: acquire a target image comprising a formula;
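
The data flow of the two-layer GRU decoder recited in claims 1 and 8 can be sketched in plain Python. This is a hedged illustration only, not the patented implementation. Several details are assumptions that the claims do not specify: the global and local features are combined by simple concatenation, they stay fixed across decoding moments, the weights are random and biases are omitted, and the step that maps the final hidden vector to formula symbols is left out. The names `GRUCell` and `decode` are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class GRUCell:
    """Minimal GRU cell with random weights and no biases (illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        def mat(cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(hidden_size)]
        self.Wz, self.Uz = mat(input_size), mat(hidden_size)  # update gate
        self.Wr, self.Ur = mat(input_size), mat(hidden_size)  # reset gate
        self.Wn, self.Un = mat(input_size), mat(hidden_size)  # candidate state

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.Wn, x), matvec(self.Un, gated_h))]
        # New hidden vector: blend of candidate state and previous hidden vector.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def decode(global_feature, local_feature, hidden_size, steps, seed=0):
    """Run both GRU layers for `steps` decoding moments; return the last output."""
    rng = random.Random(seed)
    x = list(global_feature) + list(local_feature)  # assumption: concatenation
    gru1 = GRUCell(len(x), hidden_size, rng)
    gru2 = GRUCell(len(x), hidden_size, rng)
    h1 = [0.0] * hidden_size
    h2 = list(h1)
    for _ in range(steps):
        # First layer: both features plus its own hidden vector from the
        # previous decoding moment.
        h1 = gru1.step(x, h1)
        # Second layer: both features plus the hidden vector the first layer
        # just produced at the current decoding moment.
        h2 = gru2.step(x, h1)
    return h2  # result of the second GRU layer at the last decoding moment


result = decode([0.2, -0.1], [0.5, 0.3, -0.4], hidden_size=4, steps=3)
print(len(result))
```

In this sketch the second layer takes the first layer's current hidden vector as its state, which matches the inputs listed in claim 1. Dependent claims 4 and 11 additionally feed the second layer its own hidden vector from the previous decoding moment, which is not modeled here.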

Assignees

Beijing Sogou Tech Dev Co

Inventors

Classifications

  • Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

  • Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation

  • based on the type of data

  • Combinations of networks

  • Extraction of image or video features

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Beijing Sogou Tech Dev Co
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 30, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).