What technology area does this patent fall under?

Primary CPC classification G06F17/30117. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 28, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method, device, node and system for managing file in distributed data warehouse

US9830327B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9830327-B2
Application number	US-201415033853-A
Country	US
Kind code	B2
Filing date	Nov 26, 2014
Priority date	Nov 29, 2013
Publication date	Nov 28, 2017
Grant date	Nov 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, a device, a node and a system for managing file in distributed data warehouse are provided. The method includes: acquiring, by a data node, a deleting instruction carrying a data block identifier, wherein the deleting instruction is sent by a management node; suspending, by the data node, the deleting instruction; and deleting, by the data node, a data block corresponding to the data block identifier after a condition is met, thereby resolving the technical issue that an accidentally deleted file can not be recovered by setting a trash in the management node in some cases and ensuring the data security of the Hadoop system.
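
The delayed deletion described in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the patented implementation: the class `DataNode`, its method names, the in-memory dictionaries, and the one-hour default threshold are all hypothetical. The patent text only states that the data node suspends the deleting instruction and deletes the block after a condition is met.

```python
import time
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy data node that delays block deletion (all names are hypothetical)."""
    delay_seconds: float = 3600.0                     # assumed time threshold
    blocks: dict = field(default_factory=dict)        # block_id -> block data
    delay_queue: dict = field(default_factory=dict)   # block_id -> enqueue time

    def on_delete_instruction(self, block_id, now=None):
        # Suspend the deletion: record the identifier, keep the data in place.
        if block_id in self.blocks:
            self.delay_queue[block_id] = time.time() if now is None else now

    def purge_expired(self, now=None):
        # Condition A: the identifier has waited at least delay_seconds.
        now = time.time() if now is None else now
        expired = [b for b, queued_at in self.delay_queue.items()
                   if now - queued_at >= self.delay_seconds]
        for block_id in expired:
            self._delete(block_id)
        return expired

    def empty_queue(self):
        # Condition B: an emptying instruction deletes every queued block.
        emptied = list(self.delay_queue)
        for block_id in emptied:
            self._delete(block_id)
        return emptied

    def on_recover_instruction(self, block_id):
        # Assumption: recovery cancels the pending deletion, then the node
        # reports every block it still stores so mappings can be rebuilt.
        self.delay_queue.pop(block_id, None)
        return sorted(self.blocks)

    def _delete(self, block_id):
        self.delay_queue.pop(block_id, None)
        self.blocks.pop(block_id, None)
```

Because the block data stays on disk while its identifier sits in the queue, a recovery request that arrives before either condition fires only needs to restore metadata. The `now` parameter exists purely so the timing logic can be exercised without waiting.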

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for managing file in distributed data warehouse, comprising: acquiring, by a data node, a deleting instruction carrying a data block identifier, wherein the deleting instruction is sent by a management node; suspending, by the data node, the deleting instruction; and deleting, by the data node, a data block corresponding to the data block identifier after a condition is met; wherein the process of suspending, by the data node, the deleting instruction comprises storing the data block identifier into a delay queue; wherein the process of deleting, by the data node, the data block corresponding to the data block identifier after the condition is met comprises: deleting, by the data node, data blocks corresponding to all the data block identifiers in the delay queue in response to an emptying instruction sent by a client for emptying the data blocks corresponding to all the data block identifiers in the delay queue wherein before deleting the data blocks corresponding to all the data block identifiers in the delay queue in response to an emptying instruction sent by a client for emptying the data blocks corresponding to all the data block identifiers in the delay queue, the method further comprises: determining data blocks in the data node corresponding to all the data block identifiers in the delay queue; calculating a parameter of occupation of the determined data blocks in the data node; and sending the parameter of occupation to the management node, wherein the client determines after checking the parameter of occupation whether to send to the data node the emptying instruction for emptying the data blocks corresponding to all the data block identifiers in the delay queue; wherein the parameter of occupation comprises a delay deleting storage space and a delay deleting percentage, the process of calculating a parameter of occupation of the determined data blocks in the data node comprises: calculating a storage space occupied by the determined data blocks in the data node as the delay deleting storage space; and calculating a percentage of an entire storage space of the data node occupied by the delay deleting storage space as the delay deleting percentage.

2. The method according to claim 1, wherein the process of deleting, by the data node, a data block corresponding to the data block identifier after a condition is met comprises: deleting, by the data node, the data block corresponding to the data block identifier in a case that a period since the data block identifier is stored into the delay queue reaches a predetermined time threshold.

3. The method according to claim 1, wherein after storing the data block identifier into the delay queue, the method further comprises: receiving a recovering instruction sent by the management node for recovering the data block corresponding to the data block identifier stored in the delay queue; and sending to the management node a report carrying data block identifiers of all the data blocks stored in the data node, so that the management node creates a mapping from the data block identifier to the data node based on the data block identifiers in the received report.

4. The method according to claim 2, wherein the method further comprises: receiving a time configuration instruction carrying a specified time length sent by the client, wherein the time configuration instruction is utilized in dynamic configuration of the predetermined time threshold; and updating the predetermined time threshold to the specified time length based on the time configuration instruction.

5. The method according to claim 3, wherein the method further comprises: receiving a time configuration instruction carrying a specified time length sent by the client, wherein the time configuration instruction is utilized in dynamic configuration of the predetermined time threshold; and updating the predetermined time threshold to the specified time length based on the time configuration instruction.

6. A method for managing file in distributed data warehouse, comprising: receiving from a client an instruction for deleting a specified file; determining, by a management node, a data block which belongs to the specified file and is stored in a data node; sending to the data node, by the management node, a deleting instruction carrying a data block identifier of the data block, wherein the deleting instruction is suspended by the data node, until the data node deletes the data block corresponding to the data block identifier after a condition is met; receiving a file recovering instruction sent by the client for recovering the specified file; recovering an eligible first correspondence relation, wherein the eligible first correspondence relation is a first correspondence relation which is backed-up before the deleting instruction is sent and a time point for backup is closest to a time point of sending the deleting instruction, and the first correspondence relation comprises a relation between the specified file and a data block identifier of a data block in the specified file; and recovering a second correspondence relation, wherein the second correspondence relation is a mapping from the data block identifier of the data block to the data node storing the data block; wherein the method further comprises: sending to the data node an emptying instruction for emptying the data blocks corresponding to all the data block identifiers in the delay queue; and receiving a parameter of occupation sent by the data node, wherein the parameter of occupation comprises a delay deleting storage space and a delay deleting percentage, wherein the delay deleting storage space is a storage space occupied by the data blocks corresponding to all the data block identifiers in the delay queue of the data node, and the delay deleting percentage is a percentage of the entire storage space of the data node occupied by the delay deleting storage space, such that the client determines after checking the parameter of occupation whether to send to the data node an emptying instruction for emptying the data blocks corresponding to all the data block identifiers in the delay queue.

7. The method according to claim 6, wherein the data block identifier is stored into a delay queue by the data node, wherein the process of recovering the second correspondence relation comprises: sending to the data node a recovering instruction for recovering the data block corresponding to the data block identifier stored in the delay queue, wherein the data node sends to the management node a report carrying data block identifiers of all the data blocks stored in the data node after receiving the recovering instruction; receiving the report sent by the data node; and creating a mapping from the data block identifier to the data node based on the data block identifiers in the received report.

8. A method for managing file in distributed data warehouse, wherein the method comprises: sending to a management node an instruction for deleting a specified file, wherein the instruction for deleting the specified file is utilized by the management node to determine a data block which belongs to the specified file and is stored in a data node, and the management node sends to the data node a deleting instruction carrying a data block identifier of the data block, wherein the deleting instruction is suspended by the data node, until the data node deletes the data block corresponding to the data block identifier after a condition is met; wherein the data block identifier is stored into a delay queue by the data node, the method further comprises: checking a parameter of occupation sent to the management node by each data node, wherei
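
The parameter of occupation recited in claims 1 and 6 is a pair of numbers: the storage held by blocks awaiting deletion, and that amount as a percentage of the data node's entire storage space. A minimal sketch of the arithmetic follows. The function name, the byte-based units, and the choice to skip queued identifiers that no longer map to a stored block are assumptions made for illustration; the claims do not specify them.

```python
def occupation_parameters(block_sizes, delay_queue, capacity_bytes):
    """Return (delay_deleting_bytes, delay_deleting_percentage).

    block_sizes:    mapping of block identifier -> size in bytes (assumed unit)
    delay_queue:    iterable of block identifiers awaiting deletion
    capacity_bytes: entire storage space of the data node, in bytes
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    # Determine the stored blocks that the queued identifiers refer to.
    delayed_bytes = sum(block_sizes[b] for b in delay_queue if b in block_sizes)
    # Express that space as a share of the node's entire storage space.
    percentage = 100.0 * delayed_bytes / capacity_bytes
    return delayed_bytes, percentage


sizes = {"b1": 128, "b2": 64, "b3": 256}
space, pct = occupation_parameters(sizes, ["b1", "b2", "missing"], 1024)
print(space, pct)  # prints "192 18.75"
```

In the claimed flow the data node sends both values to the management node, and the client checks them before deciding whether to issue the emptying instruction. A high percentage would suggest that emptying the delay queue frees meaningful space.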

Assignees

Tencent Tech Shenzhen Co Ltd

Inventors

Classifications

G06F11/14
Error detection or correction of the data by redundancy in operations (error detection or correction of the data by redundancy in hardware G06F11/16) · CPC title
G06F17/30117 · Primary
Physics · mapped topic
G06F17/30138
Physics · mapped topic
G06F11/1435
using file system or storage system metadata · CPC title
G06F16/1727
Details of free space management performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title

Patent family

Related publications grouped by family.

View patent family 53198370

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9830327B2 cover?: A method, a device, a node and a system for managing file in distributed data warehouse are provided. The method includes: acquiring, by a data node, a deleting instruction carrying a data block identifier, wherein the deleting instruction is sent by a management node; suspending, by the data node, the deleting instruction; and deleting, by the data node, a data block corresponding to the data …
Who is the assignee on this patent?: Tencent Tech Shenzhen Co Ltd