US11120147B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11120147-B2 |
| Application number | US-201816127503-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 11, 2018 |
| Priority date | Sep 11, 2018 |
| Publication date | Sep 14, 2021 |
| Grant date | Sep 14, 2021 |
Official abstract text for this publication.
A computerized operating system begins a garbage-collection operation by collecting a set of “garbage” data objects to be deleted. Certain of these objects are identified, either by an embedded identifier or by an entry in a sensitive-objects data structure, as containing sensitive data. When the garbage collector moves or deletes a sensitive object during the garbage-collection procedure, the collector zeroes out any residual data left at the object's original location in memory or secondary storage. If the collector determines that the object no longer has any connection to other software entities, the collector zeroes out the storage locations of all identified instances of the object. The collector then updates the data structure to indicate the current location of sensitive objects that have been moved or copied, and deletes entries for zeroed out instances of deleted sensitive objects.
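The behavior summarized in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `bytearray` standing in for memory, the `Obj` record, the `reachable` flag standing in for the collector's liveness analysis, and the `sensitive` dictionary standing in for the sensitive-objects data structure. The sketch assumes objects never overlap.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Obj:
    obj_id: str
    offset: int
    size: int
    reachable: bool  # stand-in for the collector's real liveness analysis


class SensitivityAwareCollector:
    """Toy compacting collector over a bytearray 'heap' (illustrative only)."""

    def __init__(self, heap_size: int) -> None:
        self.heap = bytearray(heap_size)
        self.objects: dict[str, Obj] = {}
        # Hypothetical sensitive-objects structure: object id -> current offset.
        self.sensitive: dict[str, int] = {}

    def allocate(self, obj_id: str, offset: int, data: bytes,
                 sensitive: bool = False, reachable: bool = True) -> None:
        self.heap[offset:offset + len(data)] = data
        self.objects[obj_id] = Obj(obj_id, offset, len(data), reachable)
        if sensitive:
            self.sensitive[obj_id] = offset

    def _zero(self, start: int, end: int) -> None:
        if end > start:
            self.heap[start:end] = bytes(end - start)

    def collect(self) -> None:
        # Phase 1: delete garbage; zero the location only if flagged sensitive,
        # then drop the entry for the zeroed-out instance.
        for obj in [o for o in self.objects.values() if not o.reachable]:
            if obj.obj_id in self.sensitive:
                self._zero(obj.offset, obj.offset + obj.size)
                del self.sensitive[obj.obj_id]
            del self.objects[obj.obj_id]

        # Phase 2: slide survivors toward offset 0.
        cursor = 0
        for obj in sorted(self.objects.values(), key=lambda o: o.offset):
            old = obj.offset
            if old != cursor:
                data = bytes(self.heap[old:old + obj.size])
                self.heap[cursor:cursor + obj.size] = data
                obj.offset = cursor
                if obj.obj_id in self.sensitive:
                    # Zero only the residue not covered by the new copy,
                    # then record the object's new location.
                    self._zero(max(old, cursor + obj.size), old + obj.size)
                    self.sensitive[obj.obj_id] = cursor
            cursor += obj.size
```

In this sketch, non-sensitive garbage is simply dropped and its bytes stay in place until reused, while a sensitive object's old bytes are overwritten with zeros as soon as the collector deletes or moves it.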
Opening claim text (preview).
What is claimed is:
1. A computerized operating system comprising a processor, a memory coupled to the processor, and a computer-readable hardware storage device coupled to the processor, the storage device containing program code configured to be run by the processor via the memory to implement a method for garbage collection with integrated clearing of sensitive data, the method comprising: directing, by the processor, a garbage-collection component of the operating system to initiate a garbage-collection (GC) operation, where the garbage-collection component comprises: a set of data structures that each identify that a corresponding stored instance of a data object contains sensitive data, a sensitivity-monitoring module that automatically updates the set of data structures when the operating system determines that a data object accessible by the operating system has begun to store sensitive data, where the updating revises the set of data structures to identify that each instance of the accessible data object stores sensitive data, and a sensitivity-aware garbage-collector module that, during a GC operation: removes, from locations on non-transitory storage devices managed by the operating system, unneeded instances of data objects, where an instance is deemed to be unneeded if the instance is in use by neither the operating system nor by any application managed by the operating system, and automatically sanitizes locations of any of the unneeded instances that are identified by the set of data structures as storing sensitive data, where the sanitizing a first location of a first unneeded instance comprises overwriting, by the processor, all data comprised by the first unneeded instance and stored at the first location, such that the overwritten data can no longer be accessed by the operating system, where a first data structure of the set of data structures comprises a first object identifier of a first instance of a first data object, where the first instance is accessible by the operating system, where the first data object contains sensitive data and a first storage identifier that identifies a location at which the first instance is stored, where the first data structure is a tree structure, where a first node of a first branch of the tree structure contains the first object identifier and the first storage identifier, where each descendant node of the first node contains object and storage identifiers of an instance of an object that is referenced by the first data object, where each ancestor node of the first node contains object and storage identifiers of an instance of an object that references the first data object, and where the sanitizing the first instance further comprises sanitizing, by the processor, all storage locations identified by any descendant node of the first node or by any ancestor node of the first node.
2. The system of claim 1, where the GC operation further comprises: determining, by the processor, that the GC operation has moved a first unneeded object identified by the set of data structures as containing sensitive data, from an original location to a new location; sanitizing, by the processor, the original location; and updating the set of data structures, by the processor, to change a storage identifier of the first unneeded object from the original location to the new location.
3. The system of claim 1, where the GC operation further comprises: determining, by the processor, that the GC operation has deleted from a first location an unneeded instance identified by the set of data structures as containing sensitive data; sanitizing, by the processor, the first location; and deleting from the set of data structures, by the processor, a data structure that associates the unneeded instance with the first location.
4. The system of claim 1, where the GC operation further comprises: determining, by the processor, that a first unneeded object identified by the set of data structures is no longer referenced by any software entity managed by the operating system and no longer references any software entity managed by the operating system; sanitizing, by the processor, all storage locations identified by the set of data structures as storing an instance of the first unneeded object; and deleting from the set of data structures, by the processor, all data structures that identify an instance of the first unneeded object.
5. The system of claim 1, where a first data structure of the set of data structures comprises: a first sensitivity indicator embedded into metadata of a first instance of a first data object, where the first sensitivity indicator indicates to the operating system that the first instance contains sensitive data.
6. The system of claim 1, where the sanitizing a first location of a first unneeded instance comprises overwriting, by the processor, all data comprised by the first unneeded instance and stored at the first location, such that the overwritten data can no longer be accessed by the operating system.
7. The system of claim 1, where an instance of a data object is stored in the memory.
8. The system of claim 1, where an instance of a data object is stored on a non-transitory secondary storage device that is managed by the operating system.
9. The system of claim 1, where a data object is deemed to be sensitive if the data object comprises data that falls into any of a set of predefined categories deemed by the operating system to require heightened security measures.
10. A method for garbage collection with integrated clearing of sensitive data, the method comprising: a computerized operating system directing a garbage-collection component of the operating system to initiate a garbage-collection (GC) operation, where the garbage-collection component comprises a set of data structures that each identify that a corresponding stored instance of a data object contains sensitive data, where a first data structure of the set of data structures comprises a first object identifier that identifies a first instance of a first data object that contains sensitive data, and a first storage identifier that identifies a location at which the first instance is stored, a sensitivity-monitoring module that automatically updates the set of data structures when the operating system determines that a data object accessible by the operating system has begun to store sensitive data, where the updating revises the set of data structures to identify that each instance of the accessible data object stores sensitive data, and a sensitivity-aware garbage-collector module that, during a GC operation: removes, from locations on non-transitory storage devices managed by the operating system, unneeded instances of data objects, where an instance is deemed to be unneeded if the instance is in use by neither the operating system nor by any application managed by the operating system, and automatically sanitizes locations of any of the unneeded instances that are identified by the set of data structures as storing sensitive data, where the sanitizing a first location of a first unneeded instance comprises overwriting, by the processor, all data comprised by the first unneeded instance and stored at the first location, such that the overwritten data can no longer be accessed by the operating system, where a first data structure of the set of data structures comprises a first object identifier of a first instance of a first data object, where the first instance is accessible by the operating system, where the first data object contains sensitive data and a first storage identifier that identifies a location at which the first instance is sto
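The tree structure recited in claim 1 can be pictured with a rough sketch: each node holds an object identifier and a storage identifier, and sanitizing one instance also sanitizes every location named by that node's ancestors and descendants. The `Node` class, the `size` field, the helper names, and the use of a `bytearray` offset as the storage identifier are all hypothetical choices made for this illustration, not details taken from the patent.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    object_id: str
    storage_id: int  # assumption: an offset into a bytearray-backed store
    size: int        # assumption: byte length of the stored instance
    parent: Node | None = field(default=None, repr=False, compare=False)
    children: list[Node] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child: Node) -> Node:
        """Attach an instance that this node's object references."""
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node: Node) -> Iterator[Node]:
    """Yield nodes for objects that (transitively) reference this one."""
    cur = node.parent
    while cur is not None:
        yield cur
        cur = cur.parent


def descendants(node: Node) -> Iterator[Node]:
    """Yield nodes for objects that this one (transitively) references."""
    stack = list(node.children)
    while stack:
        cur = stack.pop()
        yield cur
        stack.extend(cur.children)


def sanitize_with_relatives(store: bytearray, node: Node) -> list[str]:
    """Overwrite the node's location plus all ancestor/descendant locations."""
    touched = []
    for n in (node, *ancestors(node), *descendants(node)):
        store[n.storage_id:n.storage_id + n.size] = bytes(n.size)
        touched.append(n.object_id)
    return touched
```

A sibling branch is left alone in this sketch because it is neither an ancestor nor a descendant of the sanitized node, which matches the wording of the claim as previewed above.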
by anonymising data, e.g. decorrelating personal data from the owner's identification · CPC title
Space efficiency improvement · CPC title
for a range · CPC title
Protecting personal data, e.g. for financial or medical purposes · CPC title
Trees, e.g. B+trees · CPC title