What technology area does this patent fall under?

Primary CPC classification G06F16/162. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 11, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Data management method and data analysis system

US11221986B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11221986-B2
Application number	US-201716332775-A
Country	US
Kind code	B2
Filing date	May 31, 2017
Priority date	May 31, 2017
Publication date	Jan 11, 2022
Grant date	Jan 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a data management method capable of deleting intermediate data at an appropriate timing. The data management method in a data analysis system that performs analysis by combining a plurality of input data based on an analysis execution request from a computer includes: a first step, in which a request analysis unit analyzes the analysis execution request from the computer to identify a task, identifies intermediate data generated after execution of each identified task, and generates constraint information that determines whether to delete the identified intermediate data; a second step, in which a task management unit determines whether to delete the intermediate data based on the constraint information for each identified task; and a third step, in which a task execution unit executes the identified task and deletes the intermediate data of the task based on a determination result of the second step.
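The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `Task`, `analyze_request`, `decide_deletions`, and `execute`, the boolean `sensitive` flag standing in for the constraint analysis, and the in-memory `store` dictionary are all assumptions made for this example. It also assumes each task hands its output directly to the next task, so deleting the stored copy does not block later tasks.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set, Tuple


@dataclass
class Task:
    name: str                    # also used as the key of its intermediate data
    run: Callable[[Any], Any]    # takes the previous output, returns new data
    sensitive: bool = False      # hypothetical stand-in for the constraint analysis


def analyze_request(tasks: List[Task]) -> Dict[str, bool]:
    """First step: identify the tasks and generate constraint information."""
    return {t.name: t.sensitive for t in tasks}


def decide_deletions(constraints: Dict[str, bool]) -> Set[str]:
    """Second step: decide, per task, whether its intermediate data is deleted."""
    return {name for name, delete in constraints.items() if delete}


def execute(tasks: List[Task], to_delete: Set[str],
            request_input: Any) -> Tuple[Any, Dict[str, Any]]:
    """Third step: run each task and delete its intermediate data if decided."""
    store: Dict[str, Any] = {}   # assumed storage for intermediate data
    value = request_input
    for t in tasks:
        value = t.run(value)
        store[t.name] = value
        if t.name in to_delete:
            del store[t.name]    # deletion right after the producing task
    return value, store


tasks = [
    Task("combine", lambda sources: [r for src in sources for r in src], sensitive=True),
    Task("count", lambda rows: len(rows)),
]
constraints = analyze_request(tasks)
to_delete = decide_deletions(constraints)
result, remaining = execute(tasks, to_delete, [[1, 2], [3]])
print(result, remaining)
```

In this sketch the combined rows are removed from the store as soon as the "combine" task finishes, while the final count is kept and returned.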

First claim

Opening claim text (preview).

The invention claimed is:

1. A data management method in a data analysis system that performs analysis by combining a plurality of input data based on an analysis execution request from a computer, comprising: a first step, in which a request analysis unit: analyzes the analysis execution request from the computer to identify a plurality of tasks, identifies intermediate data generated by execution of each identified task, and identifies attribute information included in each of the identified intermediate data, and records the identified plurality of tasks, identified intermediate data and identified attribute information in a task management data structure, wherein each of the identified plurality of task are associate with a respective identified intermediate data and respective attribute information; identifies deletion target information data structure which defines blacklist information to be deleted; compares the attribute information for each identified intermediate data of the task management data structure with the blacklist information of the deletion target data structure, and determines whether a number of pieces of attribute information of each of the identified intermediate data is equal to or greater than a threshold number of pieces of the blacklist information; generates constraint information that determines whether to delete each of the identified intermediate data based on the comparison, the constraint information comprising countermeasure information for each of the identified intermediate data, wherein, for each of the identified intermediate data: the countermeasure information is set to delete in response to the number of pieces of attribute information of the respective identified intermediate data being equal to or greater than the threshold number of pieces of the blacklist information; and the countermeasure information is set to leave in response to the number of pieces of attribute information of the respective identified intermediate data being less than the threshold number of pieces of the blacklist information; a second step, in which a task management unit determines whether to delete or leave each of the identified intermediate data from the constraint information for each identified task; and a third step, in which a task execution unit executes each of the identified tasks and deletes or leaves intermediate data of each task based on a determination result of the second step.

2. The data management method according to claim 1, wherein the task management unit generates a flow for the task execution unit to execute the plurality of tasks, the flow being generated in a following way, for each task of the plurality of tasks: a task is added to the flow, and then when it is determined, based on the constraint information, that intermediate data generated by execution of the added task includes a number of pieces of the attribute information equal to or greater than the threshold number of pieces of attribute information, a deletion task, for deleting the intermediate data, is added to the flow to be sequentially performed following the added task, and the task execution unit executes the plurality of tasks in accordance with the generated flow.

3. The data management method according to claim 1, wherein the request analysis unit includes, in the constraint information, information indicating a task in which identified intermediate data are finally used, and the task management unit determines to delete the intermediate data when it is determined that the intermediate data generated after execution of an identified task are finally used based on the constraint information.

4. The data management method according to claim 3, wherein the task management unit generates a flow for the task execution unit to execute the plurality of tasks, the flow being generated in a following way, for each of the plurality of tasks: a task is added to the flow, and then when it is determined, based on the constraint information, that intermediate data generated after execution of the added task are finally used, a deletion task, for deleting the intermediate data generated by execution of the added task, is added to the flow to be sequentially performed following the added task, and the task execution unit executes the tasks in accordance with the generated flow.

5. The data management method according to claim 1, wherein the request analysis unit includes, in the constraint information, an analysis result of whether a generation time associated with identified intermediate data is shorter than a predetermined threshold value, and the task management unit determines to delete the intermediate data, when it is determined, based on the constraint information, that the generation time associated with the identified intermediate data is shorter than the threshold value.

6. The data management method according to claim 5, wherein the task management unit generates a flow for the task execution unit to execute the plurality of tasks, the flow being generated in a following way, for each of the plurality of tasks: a task is added to the flow, and then when it is determined, based on the constraint information, that the generation time associated with intermediate data generated after execution of the added task is shorter than the threshold value, a deletion task, for deleting the intermediate data generated by execution of the added task, is added to the flow to be sequentially performed following the added task, and the task execution unit executes the tasks in accordance with the flow.

7. The data management method according to claim 1, further comprising: a fourth step, in which the task execution unit deletes all intermediate data which are not deleted after execution of all the tasks of the analysis execution request; a fifth step, in which the task execution unit transmits a result of executing all the tasks of the analysis execution request to the computer as an execution result of the analysis execution request; and a sixth step, in which the computer outputs the received execution result of the analysis execution request.

8. A data analysis system that performs analysis by combining a plurality of input data based on an analysis execution request from a computer, comprising: a storage device configured to store a program; and a central processing unit (CPU) configured to execute the program stored in the storage device to: analyze the analysis execution request from the computer to identify a plurality of tasks; identify intermediate data generated after execution of each identified task; identify attribute information included in each of the identified intermediate data; record the identified plurality of tasks, identified intermediate data and identified attribute information in a task management data structure, wherein each of the identified plurality of task are associate with a respective identified intermediate data and respective attribute information; identify deletion target information which defines blacklist information to be deleted, compare the attribute information for each identified intermediate data of the task management data structure with the blacklist information of the deletion target data structure, and determine whether a number of pieces of attribute information of each of the identified intermediate data is equal to or greater than a threshold number of pieces of the blacklist information; generate constraint information that determines whether to delete each of the identified intermediate data based on the comparison, the constrain information comprising countermeasure information for each of the identified intermediate data; wherein, for each of the identified interm
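
The blacklist comparison in claim 1 and the flow generation in claim 2 can be illustrated with a short sketch. It follows one possible reading of the claim language, in which the number of attributes of an intermediate data item that also appear in the blacklist is compared against a threshold. The names `TaskRecord`, `countermeasure`, `build_constraints`, and `build_flow`, as well as the sample tasks and attributes, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class TaskRecord:
    """One row of an assumed task management data structure."""
    task: str
    intermediate: str
    attributes: FrozenSet[str]


def countermeasure(attributes: FrozenSet[str], blacklist: Set[str], threshold: int) -> str:
    """Return "delete" when enough attributes match the blacklist, else "leave"."""
    matches = len(attributes & blacklist)
    return "delete" if matches >= threshold else "leave"


def build_constraints(records: List[TaskRecord], blacklist: Set[str],
                      threshold: int) -> Dict[str, str]:
    """Constraint information: one countermeasure per intermediate data item."""
    return {r.intermediate: countermeasure(r.attributes, blacklist, threshold)
            for r in records}


def build_flow(records: List[TaskRecord],
               constraints: Dict[str, str]) -> List[Tuple[str, str]]:
    """Add each task to the flow, followed by a deletion task when required."""
    flow: List[Tuple[str, str]] = []
    for r in records:
        flow.append(("run", r.task))
        if constraints[r.intermediate] == "delete":
            flow.append(("delete", r.intermediate))
    return flow


# Illustrative data only; the patent does not list concrete attributes.
records = [
    TaskRecord("join_customers", "tmp_join", frozenset({"name", "address", "purchase"})),
    TaskRecord("aggregate", "tmp_agg", frozenset({"region", "total"})),
]
blacklist = {"name", "address", "phone"}
constraints = build_constraints(records, blacklist, threshold=2)
flow = build_flow(records, constraints)
print(constraints)
print(flow)
```

With a threshold of 2, the joined data carries two blacklisted attributes and is scheduled for deletion immediately after the task that produces it, while the aggregated data is left in place.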

Assignees

Hitachi Ltd

Inventors

Classifications

G06F21/60
Protecting data · CPC title
G06F16/162 (Primary)
Delete operations (erasing in storage systems G06F3/0652) · CPC title
G06F16/221
Column-oriented storage; Management thereof · CPC title
G06F9/4843
by program, e.g. task dispatcher, supervisor, operating system · CPC title
G06F21/62
Protecting access to data via a platform, e.g. using keys or access control rules · CPC title

Patent family

Related publications grouped by family.

View patent family 64454543

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11221986B2 cover?: Provided is a data management method capable of deleting intermediate data at an appropriate timing. The data management method in a data analysis system that performs analysis by combining a plurality of input data based on an analysis execution request from a computer includes: a first step, in which a request analysis unit analyzes the analysis execution request from the computer to identify…
Who is the assignee on this patent?: Hitachi Ltd