Method and apparatus for extracting areas

US10255261B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10255261-B2
Application number: US-201715424495-A
Country: US
Kind code: B2
Filing date: Feb 3, 2017
Priority date: Mar 31, 2016
Publication date: Apr 9, 2019
Grant date: Apr 9, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor obtains a table that contains numerical values or character strings in its cells. The processor then replaces each numerical value with a first constant value, and each character string with a second constant value. The two constant values have opposite signs. The processor generates area datasets each including first to third rectangular areas. The right side of the second rectangular area coincides with the left side of the first rectangular area. The bottom side of the third rectangular area coincides with the top side of the first rectangular area. With respect to each generated area dataset, the processor compares a sum of first and second constant values in the first rectangular area with a sum of first and second constant values in the second and third rectangular areas. The processor outputs at least one of the area datasets according to the comparison result.
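
The encoding and comparison steps in the abstract can be sketched in Python. This is a minimal illustration, not the patented implementation. The constants +1 and -1, the half-open `Rect` type, and the function names `encode`, `area_sum`, and `score` are assumptions made for the example; the abstract only requires that the two constants have opposite signs. Mapping empty cells to 0 and relying on Python types to tell numbers from strings are also assumptions.

```python
from typing import List, NamedTuple, Optional, Union

Cell = Optional[Union[int, float, str]]

NUM = 1    # assumed "first constant value" for numerical cells
TXT = -1   # assumed "second constant value" (opposite sign) for character strings


class Rect(NamedTuple):
    """Half-open cell range: rows [top, bottom), columns [left, right)."""
    top: int
    left: int
    bottom: int
    right: int


def encode(table: List[List[Cell]]) -> List[List[int]]:
    """Replace numbers with NUM and non-empty strings with TXT.

    Empty cells (None or blank strings) become 0, which is an assumption.
    """
    def value(cell: Cell) -> int:
        if isinstance(cell, (int, float)):
            return NUM
        if isinstance(cell, str) and cell.strip():
            return TXT
        return 0

    return [[value(cell) for cell in row] for row in table]


def area_sum(grid: List[List[int]], r: Rect) -> int:
    """Sum of the constant values inside rectangle r."""
    return sum(sum(row[r.left:r.right]) for row in grid[r.top:r.bottom])


def score(grid: List[List[int]], first: Rect, second: Rect, third: Rect) -> int:
    """Sum over the first area minus the sum over the second and third areas."""
    return area_sum(grid, first) - (area_sum(grid, second) + area_sum(grid, third))


table = [
    ["",       "Q1", "Q2"],
    ["Apples", 10,   12],
    ["Pears",  7,    9],
]
grid = encode(table)            # [[0, -1, -1], [-1, 1, 1], [-1, 1, 1]]

first = Rect(1, 1, 3, 3)        # candidate numerical area
second = Rect(1, 0, 3, 1)       # its right side coincides with first's left side
third = Rect(0, 1, 1, 3)        # its bottom side coincides with first's top side

result = score(grid, first, second, third)
print(result)                   # → 8
```

With a positive constant for numbers, a large positive score indicates that the first area holds mostly numbers while the areas to its left and above hold mostly strings.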

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable storage medium storing a program, wherein the program causes processor circuitry in a computer to perform a procedure comprising: obtaining a table dataset representing a table and storing the obtained table dataset in memory circuitry in the computer, the table having a plurality of cells arranged in rectangular form, at least some of the plurality of cells having numerical values or character strings; replacing each of the numerical values in the table dataset with a first constant value, and each of the character strings in the table dataset with a second constant value whose sign is opposite to a sign of the first constant value; generating a plurality of area datasets each including first to third rectangular areas mapped on the plurality of cells, the first rectangular area having a top side and a left side, the second rectangular area having a right side that coincides with the left side of the first rectangular area, the third rectangular area having a bottom side that coincides with the top side of the first rectangular area, the plurality of area datasets respectively including different first rectangular areas; calculating a difference between a first sum and a second sum with respect to each of the generated area datasets, the first sum being a sum of first constant values and second constant values contained in cells of the first rectangular area, the second sum being a sum of first constant values and second constant values contained in cells of the second rectangular area and in cells of the third rectangular area; identifying a difference-maximizing area dataset among the plurality of area datasets, the difference-maximizing area dataset having a largest difference between the first sum and the second sum; and outputting the first rectangular area of the difference-maximizing area dataset as a numerical area of the table, the second rectangular area of the difference-maximizing area dataset as a column header area of the table, and the third rectangular area of the difference-maximizing area dataset as a column header area of the table.

2. The non-transitory computer-readable storage medium according to claim 1, wherein the generating includes: calculating, with respect to individual candidates for a second rectangular area associated with a determined first rectangular area, a sum of first constant values and second constant values in cells, and selecting a candidate for a second rectangular area whose sum of first and second constant values has a same sign as the second constant value and exhibits a largest absolute value of all the candidates for a second rectangular area; calculating, with respect to individual candidates for a third rectangular area associated with a determined first rectangular area, a sum of first constant values and second constant values in cells thereof, and selecting a candidate for a third rectangular area whose sum of first and second constant values has a same sign as the second constant value and exhibits a largest absolute value of all the candidates for a third rectangular area; and forming an area dataset from the determined first rectangular area, the selected candidate for a second rectangular area, and the selected candidate for a third rectangular area.

3. The non-transitory computer-readable storage medium according to claim 1, wherein: the replacing includes assigning a third constant value to empty cells in the table; and the calculating includes calculating a difference between a sum of first, second, and third constant values in cells of the first rectangular area and a sum of first, second, and third constant values in cells of the second rectangular area and in cells of the third rectangular area.

4. A method for extracting areas, comprising: obtaining a table dataset representing a table and storing the obtained table dataset in memory circuitry in a computer, the table having a plurality of cells arranged in rectangular form, at least some of the plurality of cells having numerical values or character strings; replacing, by processor circuitry in the computer, each of the numerical values in the table with a first constant value, and each of the character strings in the table with a second constant value whose sign is opposite to a sign of the first constant value; generating, by the processor circuitry, a plurality of area datasets each including first to third rectangular areas mapped on the plurality of cells, the first rectangular area having a top side and a left side, the second rectangular area having a right side that coincides with the left side of the first rectangular area, the third rectangular area having a bottom side that coincides with the top side of the first rectangular area, the plurality of area datasets respectively including different first rectangular areas; calculating, by the processor circuitry, a difference between a first sum and a second sum with respect to each of the generated area datasets, the first sum being a sum of first constant values and second constant values contained in cells of the first rectangular area, the second sum being a sum of first constant values and second constant values contained in cells of the second rectangular area and in cells of the third rectangular area; identifying, by the processor circuitry, a difference-maximizing area dataset among the plurality of area datasets, the difference-maximizing area dataset having a largest difference between the first sum and the second sum; and outputting, by the processor circuitry, the first rectangular area of the difference-maximizing area dataset as a numerical area of the table, the second rectangular area of the difference-maximizing area dataset as a column header area of the table, and the third rectangular area of the difference-maximizing area dataset as a column header area of the table.

5. An apparatus for extracting areas, comprising: memory circuitry that stores a table dataset representing a table, the table having a plurality of cells arranged in rectangular form, at least some of the plurality of cells having numerical values or character strings; and processor circuitry coupled to the memory circuitry and configured to perform a procedure including: reading the table dataset out of the memory circuitry; replacing each of the numerical values in the table with a first constant value, and each of the character strings in the table with a second constant value whose sign is opposite to a sign of the first constant value; generating a plurality of area datasets each including first to third rectangular areas mapped on the plurality of cells, the first rectangular area having a top side and a left side, the second rectangular area having a right side that coincides with the left side of the first rectangular area, the third rectangular area having a bottom side that coincides with the top side of the first rectangular area, the plurality of area datasets respectively including different first rectangular areas; calculating a difference between a first sum and a second sum with respect to each of the generated area datasets, the first sum being a sum of first constant values and second constant values contained in cells of the first rectangular area, the second sum being a sum of first constant values and second constant values contained in cells of the second rectangular area and in cells of the third rectangular area; identifying a difference-maximizing area dataset among the plurality of area datasets, the difference-maximizing area dataset having a largest difference between the first sum and the second sum; and outputting the first rectangular area of the difference-maximizing area dataset as a numerical area of the table, the second rectangular area of the differe…
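
The search for a difference-maximizing area dataset in claim 1, combined with the header selection of claim 2, might look like the following Python sketch. It is one possible reading under stated assumptions, not the patent's actual procedure. It takes a grid that is already encoded with +1 for numbers, -1 for strings, and 0 for empty cells (0 standing in for the third constant value of claim 3). The exhaustive enumeration, the prefix-sum table, the rule that candidate header areas span exactly the same rows or columns as the first area, and the choice to skip a first area when no header candidate has a negative sum are all assumptions. The names `prefix_sums`, `area_sum`, `best_header`, and `extract_areas` are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (top, left, bottom, right), half-open


def prefix_sums(grid: List[List[int]]) -> List[List[int]]:
    """2D prefix sums so that any rectangle sum costs O(1)."""
    rows, cols = len(grid), len(grid[0])
    p = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            p[r + 1][c + 1] = grid[r][c] + p[r][c + 1] + p[r + 1][c] - p[r][c]
    return p


def area_sum(p: List[List[int]], rect: Rect) -> int:
    top, left, bottom, right = rect
    return p[bottom][right] - p[top][right] - p[bottom][left] + p[top][left]


def best_header(p: List[List[int]], candidates: List[Rect]) -> Optional[Rect]:
    """Pick the candidate with the most negative sum, i.e. the sum that has the
    same sign as the assumed string constant (-1) and the largest absolute value.
    Returns None when no candidate sum is negative (an assumption)."""
    best = min(candidates, key=lambda rect: area_sum(p, rect), default=None)
    if best is None or area_sum(p, best) >= 0:
        return None
    return best


def extract_areas(grid: List[List[int]]) -> Optional[Tuple[int, Rect, Rect, Rect]]:
    """Return (difference, first, second, third) with the largest difference."""
    rows, cols = len(grid), len(grid[0])
    p = prefix_sums(grid)
    best = None
    # top >= 1 and left >= 1 leave room for an area above and to the left.
    for top, bottom in combinations(range(1, rows + 1), 2):
        for left, right in combinations(range(1, cols + 1), 2):
            first = (top, left, bottom, right)
            # Second area: right side fixed at first's left side, left side varies.
            second = best_header(p, [(top, l, bottom, left) for l in range(left)])
            # Third area: bottom side fixed at first's top side, top side varies.
            third = best_header(p, [(t, left, top, right) for t in range(top)])
            if second is None or third is None:
                continue
            diff = area_sum(p, first) - (area_sum(p, second) + area_sum(p, third))
            if best is None or diff > best[0]:
                best = (diff, first, second, third)
    return best


encoded = [
    [0, -1, -1, -1],   # empty corner cell, then three string headers
    [-1, 1, 1, 1],     # one string label, then three numbers
    [-1, 1, 1, 1],
]
found = extract_areas(encoded)
print(found)  # → (11, (1, 1, 3, 4), (1, 0, 3, 1), (0, 1, 1, 4))
```

In this example the winning first area covers the six numeric cells, the second area is the label column to its left, and the third area is the header row above it. The search here is brute force over all first areas, which is adequate for small tables but would need pruning for large ones.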

Assignees

Fujitsu Ltd

Inventors

Classifications

  • of spreadsheets (form-filling G06F40/174) · CPC title

  • G06F40/177 · Primary

    of tables; using ruled lines · CPC title

  • Region-based segmentation · CPC title

  • Tabulation, i.e. one-dimensional [1D] positioning · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10255261B2 cover?
A processor obtains a table that contains numerical values or character strings in its cells. The processor then replaces each numerical value with a first constant value, and each character string with a second constant value. The two constant values have opposite signs. The processor generates area datasets each including first to third rectangular areas. The right side of the second rectangu…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/177. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 9, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).