What technology area does this patent fall under?

Primary CPC classification G06F12/0253. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue May 09 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Garbage collection predictions

US9646067B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9646067-B2
Application number	US-201414120368-A
Country	US
Kind code	B2
Filing date	May 14, 2014
Priority date	May 14, 2013
Publication date	May 9, 2017
Grant date	May 9, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described herein are systems and methods for garbage collection prediction. A temporal graph is received, the temporal graph including nodes, the nodes including hash references to objects. An accumulated difference count is updated when a node is added to the temporal graph, the accumulated difference count including a number of hash differences between a parent node and its children nodes in the temporal graph. A divested difference count is updated when a node is removed from the temporal graph, the divested difference count including a number of hash differences referenced by the removed node but not by either a parent node of the removed node or any child nodes of the removed node. The outcome of the garbage collection is predicted based on at least one of the accumulated difference count and the divested difference count.

First claim

Opening claim text (preview).

What is claimed is: 1. A computerized method of maintaining running information of ingestion and deletion of file system data for a deduplicated data store in which to determine an expected outcome of a garbage collection operation on the deduplicated data store without performing the garbage collection operation, the method comprising: maintaining, by a computer device, a temporal graph, the temporal graph including nodes, the nodes including hash references to objects; updating, by the computer device, an accumulated difference count when a node is added to the temporal graph, the accumulated difference count including a number of hash differences between a parent node and its children nodes in the temporal graph; updating, by the computer device, a divested difference count when a node is removed from the temporal graph, the divested difference count including a number of hash differences referenced by the removed node but not referenced by either a parent node of the removed node or any child nodes of the removed node; and determining, by the computer device, an expected amount of data being reclaimable during a garbage collection operation based on both of the accumulated difference count and the divested difference count, thereby estimating an expected outcome of the garbage collection operation without performing the garbage collection operation, wherein determining the expected amount of data being reclaimable during the garbage collection includes determining an amount of data storage used by the deduplicated data store but not referenced by an object. 2. The method of claim 1 , wherein determining an amount of data storage used by the deduplicated data store but not referenced by an object includes determining a ratio of the accumulated difference count to the divested difference count. 3. The method of claim 1 , further comprising performing the garbage collection, by the computer device, when the amount of data storage used by the deduplicated data store but not referenced by an object exceeds a threshold value. 4. A computerized system for maintaining running information of ingestion and deletion of file system data for a deduplicated data store in which to determine an expected outcome of a garbage collection operation on the deduplicated data store without performing the garbage collection operation, comprising a processor configured to run a module stored in memory that is configured to cause the processor to: maintain a temporal graph, the temporal graph including nodes, the nodes including hash references to objects; update an accumulated difference count when a node is added to the temporal graph, the accumulated difference count including a number of hash differences between a parent node and its children nodes in the temporal graph; update a divested difference count when a node is removed from the temporal graph, the divested difference count including a number of hash differences referenced by the removed node but not by either a parent node of the removed node or any child nodes of the removed node; and determine an expected amount of data being reclaimable during a garbage collection operation based on both of the accumulated difference count and the divested difference count, thereby estimating an expected outcome of the garbage collection operation without performing the garbage collection operation, wherein determining the expected amount of data being reclaimable during the garbage collection includes determining an amount of data storage used by the deduplicated data store but not referenced by an object. 5. The computerized system of claim 4 , wherein determining an amount of data storage used by the deduplicated data store but not referenced by an object includes determining a ratio of the accumulated difference count to the divested difference count. 6. The method of claim 4 , wherein the module stored in memory is further configured to cause the processor to perform the garbage collection when the amount of data storage used by the deduplicated data store but not referenced by an object exceeds a threshold value. 7. A non-transitory computer readable medium having executable instructions operable to cause an apparatus to: maintain a temporal graph, the temporal graph including nodes, the nodes including hash references to objects; update an accumulated difference count when a node is added to the temporal graph, the accumulated difference count including a number of hash differences between a parent node and its children nodes in the temporal graph; update a divested difference count when a node is removed from the temporal graph, the divested difference count including a number of hash differences referenced by the removed node but not by either a parent node of the removed node or any child nodes of the removed node; and determine an expected amount of data being reclaimable during a garbage collection operation based on both of the accumulated difference count and the divested difference count, thereby estimating an expected outcome of the garbage collection operation without performing the garbage collection operation, wherein determining the expected amount of data being reclaimable during the garbage collection includes determining an amount of data storage used by the deduplicated data store but not referenced by an object. 8. The non-transitory computer readable medium of claim 7 , wherein determining an amount of data storage used by the deduplicated data store but not referenced by an object includes determining a ratio of the accumulated difference count to the divested difference count. 9. The non-transitory computer readable medium of claim 7 , wherein the executable instructions are further operable to cause an apparatus to perform the garbage collection when the amount of data storage used by the deduplicated data store but not referenced by an object exceeds a threshold value.

Assignees

Actifio Inc

Inventors

Classifications

G06F11/1448
Management of the data involved in backup or backup restore · CPC title
G06F16/1727
Details of free space management performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title
G06F12/0253Primary
Garbage collection, i.e. reclamation of unreferenced memory · CPC title
G06F16/1752
based on file chunks · CPC title
G06F16/137
Hash-based (content-based indexing of textual data G06F16/31) · CPC title

Patent family

Related publications grouped by family.

View patent family 50977039

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9646067B2 cover?: Described herein are systems and methods for garbage collection prediction. A temporal graph is received, the temporal graph including nodes, the nodes including hash references to objects. An accumulated difference count is updated when a node is added to the temporal graph, the accumulated difference count including a number of hash differences between a parent node and its children nodes in …
Who is the assignee on this patent?: Actifio Inc
What technology area does this patent fall under?: Primary CPC classification G06F12/0253. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue May 09 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).