Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
US-2019236411-A1 · Aug 1, 2019 · US
US10896357B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10896357-B1 |
| Application number | US-201715858976-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 29, 2017 |
| Priority date | Dec 29, 2017 |
| Publication date | Jan 19, 2021 |
| Grant date | Jan 19, 2021 |
Official abstract text for this publication.
Key/Value pairs, each comprising a keyword string and an associated value, are extracted automatically from a document image. Each document image has a plurality of pixels with each pixel having a plurality of bits. A first subset of the plurality of bits for each pixel represents information corresponding to the document image. The document image is processed to add information to a second subset of the plurality of bits for each pixel. The information added to the second subset alters the appearance of the document image in a manner that facilitates semantic recognition of textually encoded segments within the document image by a Deep Neural Network (DNN) trained to recognize images within image documents. The DNN detects groupings of text segments within detected spatial templates within the document image. The text segments are mapped to known string values to generate the keyword strings and associated values.
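The split of each pixel into two non-overlapping bit subsets can be illustrated with a small sketch. It is not the patent's implementation: it assumes an 8-bit pixel whose lowest bit holds the original binarized ink value (the one-bit case described in claim 14) and whose remaining 7 bits hold an added annotation, here a hypothetical per-pixel score in [0, 1]. The names `pack_pixel`, `unpack_pixel`, and `annotate_image` are illustrative only.

```python
from typing import List, Tuple

# Assumed layout of one 8-bit pixel (an illustration, not taken from the patent):
#   bit 0      -> first subset: original binarized ink value (0 or 1)
#   bits 1..7  -> second subset: added information, here a quantized score
INK_BITS = 1
EXTRA_BITS = 7
INK_MASK = (1 << INK_BITS) - 1
EXTRA_MAX = (1 << EXTRA_BITS) - 1


def pack_pixel(ink: int, score: float) -> int:
    """Combine a 1-bit ink flag and a score in [0, 1] into one 8-bit value."""
    if ink not in (0, 1):
        raise ValueError("ink must be 0 or 1")
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    extra = round(score * EXTRA_MAX)  # quantize the score to 7 bits
    return (extra << INK_BITS) | ink  # the two subsets never overlap


def unpack_pixel(value: int) -> Tuple[int, int]:
    """Recover (ink flag, quantized score) from a packed pixel."""
    return value & INK_MASK, value >> INK_BITS


def annotate_image(ink_rows: List[List[int]],
                   score_rows: List[List[float]]) -> List[List[int]]:
    """Pack every pixel of a small image given as nested lists."""
    return [
        [pack_pixel(ink, score) for ink, score in zip(ink_row, score_row)]
        for ink_row, score_row in zip(ink_rows, score_rows)
    ]
```

Because the ink bit is left untouched, the original binarized image can always be recovered from the packed one by masking with `INK_MASK`, while a downstream detector sees the extra bits as visible intensity variation.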
Opening claim text (preview).
What is claimed is:

1. A computerized method for identifying keyword strings and associated values from a document image, comprising: receiving the document image from a document storage, wherein the document image comprises a plurality of pixels and wherein each pixel within the document image is represented by a plurality of bits contained in a computer system storage; modifying the document image, from a first version of the document image to a second version of the document image; allocating for each pixel of the plurality of pixels, a first subset of bits representing information corresponding to the second version of the document image; and allocating for each pixel, a second subset of bits wherein the second subset of bits does not overlap with the first subset of bits, and setting the value of each bit within the second subset of bits for each pixel, to represent the second version of the document image in a manner selected to facilitate recognition of textually encoded segments within the document image by a deep neural network trained to recognize objects within an image; detecting by the deep neural network, groupings of text segments in the document image, wherein each grouping of text segments in the document image is associated with a spatial template; and mapping the text segments in the groupings of text segments in the document image to known string values to identify the keyword strings and associated values, wherein the keyword string is representative of semantic meaning of a grouping of text segments.
2. The computerized method of claim 1 wherein the known string values correspond to a known domain associated with the document images.
3. The computerized method of claim 1 wherein modifying the document image from a first version of the document image to a second version of the document image further comprises: processing the document image to recognize textually encoded segments; and annotating the textually encoded segments with a probability value indicative of probability of the text segments representing a known keyword string; and wherein setting the value of each bit within the second subset of bits for each pixel corresponds to the annotated probability value indicative of the probability of the text segments representing a known keyword string.
4. The computerized method of claim 1 wherein processing the document image to modify the bits for each pixel, to allocate for each pixel, a second subset of bits wherein the second subset of bits does not overlap with the first subset of bits, and to add information to the second subset of bits for each pixel, to alter the appearance of the document image in a manner selected to facilitate semantic recognition of textually encoded segments within the document image by a deep neural network trained to recognize objects within an image comprises: blurring the document image by joining neighboring characters in a line and across successive lines in the document image, and converting the document image to mimic a natural image.
5. The computerized method of claim 1 wherein processing the document image to modify the bits for each pixel, to allocate for each pixel, a second subset of bits wherein the second subset of bits does not overlap with the first subset of bits, and to add information to the second subset of bits for each pixel, to alter the appearance of the document image in a manner selected to facilitate semantic recognition of textually encoded segments within the document image by a deep neural network trained to recognize objects within an image comprises: removing backgrounds, patterns, and lines from the document image to generate a noise free, uniform font rendering of the document image.
6. The computerized method of claim 1 wherein processing the document image to modify the bits for each pixel, to allocate for each pixel, a second subset of bits wherein the second subset of bits does not overlap with the first subset of bits, and to add information to the second subset of bits for each pixel, to alter the appearance of the document image in a manner selected to facilitate semantic recognition of textually encoded segments within the document image by a deep neural network trained to recognize objects within an image comprises: adding regular gaussian distributed noise to the document image.
7. The computerized method of claim 1 wherein processing the document image to modify the bits for each pixel, to allocate for each pixel, a second subset of bits wherein the second subset of bits does not overlap with the first subset of bits, and to add information to the second subset of bits for each pixel, to alter the appearance of the document image in a manner selected to facilitate semantic recognition of textually encoded segments within the document image by a deep neural network trained to recognize objects within an image comprises: processing the document image to recognize textually encoded segments; and processing the textually encoded segments in accordance with a list of keyword strings, each of which has associated therewith an occurrence frequency indicative of occurrence frequency of the keyword string within a domain associated with the document image.
8. The computerized method of claim 1 further comprising: splitting the document image into a plurality of overlapping sub-images after processing the document image to add information to a second subset of the plurality of bits for each pixel; and wherein detecting by the deep neural network, groupings of text segments within detected spatial templates within the document image, is performed separately, for each of the sub-images; and wherein the detected groupings of text segments and associated spatial templates in each of the sub-images are joined before mapping the text segments to known string values to generate the keyword strings and associated values.
9. The computerized method of claim 1 wherein the deep neural network detects a plurality of spatial templates for certain of the groupings of text segments, the method further comprising merging the spatial templates for each grouping of text segments to generate a single merged spatial template for each grouping of text segments.
10. The computerized method of claim 9 wherein merging the spatial templates is performed in accordance with a non-maximum suppression algorithm.
11. The computerized method of claim 9 further comprising removing spatial templates characterized by low-confidence.
12. The computerized method of claim 1 wherein mapping the text segments in the groupings of text segments to known string values to generate the keyword strings and associated values, comprises accessing a mapping of known key string values to a semantic key value to associate each keyword string with a value associated with the keyword string.
13. The computerized method of claim 12 further comprising receiving user selection of semantic key values.
14. The computerized method of claim 1 wherein the first subset of bits for each pixel comprises a single bit in the plurality of bits for a pixel and wherein the second subset of pixels for each pixel comprises any additional bits in the plurality of bits for the pixel.
15. A document processing system comprising: data storage for storing a plurality of document images, wherein the document image comprises a plurality of pixels and wherein each pixel within the document image is comprised of a plurality of bits; and a processor operatively coupled to the data storage and configured to execute instructions that when executed cause the processor to: process th
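Claims 9 to 11 mention merging overlapping spatial templates with a non-maximum suppression algorithm and removing low-confidence templates. The sketch below shows the common greedy form of non-maximum suppression over axis-aligned boxes. It is an illustration under stated assumptions, not the patented procedure: templates are assumed to be `(x1, y1, x2, y2)` rectangles with a confidence score, and the names and the thresholds `iou_threshold` and `min_score` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections: List[Detection],
                        iou_threshold: float = 0.5,
                        min_score: float = 0.3) -> List[Detection]:
    """Greedy NMS: drop low-confidence boxes, then keep the best of each overlap group."""
    # Discard low-confidence detections, then visit the rest from best to worst.
    candidates = sorted(
        (d for d in detections if d[1] >= min_score),
        key=lambda d: d[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

In this sketch, two heavily overlapping boxes around the same text grouping collapse to the higher-scoring one, and a box scoring below `min_score` is dropped before suppression. Greedy NMS selects one existing box rather than averaging coordinates; whether the patent's "merged" template is selected or computed is not stated in the claim text shown here.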
Validation; Performance evaluation · CPC title
Syntactic or semantic context, e.g. balancing · CPC title
Validation; Performance evaluation; Active pattern learning techniques · CPC title
Character recognition · CPC title
Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text · CPC title