US9703796B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9703796-B2 |
| Application number | US-201213677957-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 15, 2012 |
| Priority date | Dec 6, 2011 |
| Publication date | Jul 11, 2017 |
| Grant date | Jul 11, 2017 |
Official abstract text for this publication.
In one embodiment, a system and method for managing a network deduplication dictionary is disclosed. According to the method, the dictionary is divided between available deduplication engines (DDE) in deduplication devices that support shared dictionaries. The fingerprints are distributed to different DDEs based on a hash function. The hash function takes the fingerprint and hashes it and based on the hash result, it selects one of the DDEs. The hash function could select a few bits from the fingerprint and use those bits to select a DDE.
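The owner selection described in the abstract can be illustrated with a minimal sketch. The abstract only says that a hash of the fingerprint, or a few bits taken from it, picks a DDE. The use of SHA-1 for fingerprints, SHA-256 as the selecting hash, and the function names `fingerprint`, `owner_by_mask`, and `owner_by_hash` are assumptions made for illustration and are not taken from the patent.

```python
import hashlib


def fingerprint(segment: bytes) -> bytes:
    """Compute a fingerprint for a segment (SHA-1 is an assumption here)."""
    return hashlib.sha1(segment).digest()


def owner_by_mask(fp: bytes, num_ddes: int) -> int:
    """Pick a DDE index from the low-order bits of the fingerprint.

    Assumes num_ddes is a power of two so that a simple bit mask works.
    """
    if num_ddes <= 0 or num_ddes & (num_ddes - 1):
        raise ValueError("num_ddes must be a power of two for mask selection")
    mask = num_ddes - 1
    return int.from_bytes(fp, "big") & mask


def owner_by_hash(fp: bytes, num_ddes: int) -> int:
    """Pick a DDE index by hashing the fingerprint and reducing modulo num_ddes."""
    if num_ddes <= 0:
        raise ValueError("num_ddes must be positive")
    digest = hashlib.sha256(fp).digest()
    return int.from_bytes(digest[:8], "big") % num_ddes
```

Either function maps every fingerprint to exactly one DDE, so any engine can compute the owner locally without consulting the others. The mask variant is cheaper but only suits a power-of-two number of engines; the modulo variant accepts any count.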
Opening claim text (preview).
What is claimed is:
1. A data deduplication engine (DDE), comprising: a plurality of local area network (LAN) ports for coupling to a LAN, the LAN including a data device and a second DDE; at least one wide area network (WAN) port for coupling to a WAN, the WAN including a remote DDE; and an engine coupled to the plurality of LAN ports and to the at least one WAN port for receiving an input data stream containing duplicates from the data device, providing an output data stream containing duplicates to the data device, providing references and segments to the remote DDE and receiving references and segments from the remote DDE; wherein upon receiving the input data stream from the data device, the engine performs segmentation and fingerprint processing on the input data stream to create a first segment and a first fingerprint for the first segment; and wherein the engine determines if the engine or the second DDE owns the first fingerprint.
2. The data deduplication engine of claim 1, wherein if the engine owns the first fingerprint, the engine determines whether the first fingerprint is for a new segment or an old segment.
3. The data deduplication engine of claim 2, wherein if the engine owns the first fingerprint and the first segment is the new segment, the engine stores the first fingerprint and the first segment, determines a first reference and provides the first reference and the first segment to the remote DDE.
4. The data deduplication engine of claim 2, wherein if the engine owns the first fingerprint and the first segment is the old segment, the engine looks up a reference for the first segment and provides the reference to the remote DDE.
5. The data deduplication engine of claim 1, wherein if the second DDE owns the first fingerprint, the engine sends a message comprising the first fingerprint to the second DDE.
6. The data deduplication engine of claim 5, wherein if the second DDE owns the first fingerprint and the first segment is an old segment, the engine receives a reference associated with the first fingerprint from the second DDE and provides the reference to the remote DDE.
7. The data deduplication engine of claim 5, wherein if the second DDE owns the first fingerprint and the first segment is a new segment, the engine receives an indication of being the new segment from the second DDE and the engine sends the first segment to the second DDE.
8. The data deduplication engine of claim 7, wherein the engine receives a reference associated with the first segment from the second DDE and the engine provides the reference to the remote DDE.
9. The data deduplication engine of claim 1, wherein if the engine receives a message comprising a second fingerprint of a second segment from the second DDE, the engine provides a reference associated with the second fingerprint to the second DDE.
10. The data deduplication engine of claim 9, wherein the engine determines if the second fingerprint is for a new segment or an old segment.
11. The data deduplication engine of claim 10, wherein if the second fingerprint is for the new segment, the engine stores the second fingerprint and provides an indication of being the new segment to the second DDE.
12. The data deduplication engine of claim 11, wherein the engine receives a second segment from the second DDE in response to the indication and stores the second segment.
13. The data deduplication engine of claim 10, wherein if the second fingerprint is for the old segment, the engine looks up the reference associated with the second fingerprint to provide the reference to the second DDE.
14. The data deduplication engine of claim 1, wherein when the engine receives a remote reference from the remote DDE, the engine determines if the engine or the second DDE owns the remote reference.
15. The data deduplication engine of claim 14, wherein if the engine owns the remote reference, the engine stores the remote reference.
16. The data deduplication engine of claim 15, wherein if the engine owns the remote reference and the engine receives a remote segment, the engine provides the remote segment as the output data stream to the data device.
17. The data deduplication engine of claim 15, wherein if the engine does not receive a remote segment associated with the remote reference from the remote DDE and the engine owns the remote reference, the engine reconstructs a segment using the remote reference and provides the reconstructed segment as the output data stream to the data device.
18. The data deduplication engine of claim 14, wherein if the second DDE owns the remote reference, the engine provides the remote reference to the second DDE.
19. The data deduplication engine of claim 18, wherein if the second DDE owns the remote reference and the engine receives a remote segment, the engine provides the remote segment to the second DDE and provides the remote segment as the output data stream to the data device.
20. The data deduplication engine of claim 18, wherein if the engine does not receive a remote segment associated with the remote reference from the remote DDE and the second DDE owns the remote reference, the engine receives a reconstructed segment from the second DDE and provides the reconstructed segment as the output data stream to the data device.
21. The data deduplication engine of claim 1, wherein the engine is configured to apply a DDE mask to the first fingerprint to determine which deduplication engine owns the first fingerprint.
22. The data deduplication engine of claim 1, wherein the engine is configured to apply a hash function to the first fingerprint to determine which deduplication engine owns the first fingerprint.
23. A method for inline data deduplication, comprising: receiving an input data stream containing duplicates from a data device, the input data stream being received by a first data deduplication engine (DDE); providing an output data stream containing duplicates from the first DDE to the data device; providing references and segments to a remote DDE and receiving references and segments at the first DDE from the remote DDE; performing segmentation and fingerprint processing by the first DDE on the input data stream to create a first segment and a first fingerprint for the first segment; and determining by the first DDE if the first DDE or a second DDE owns the first fingerprint.
24. The method of claim 23, wherein if the first DDE owns the first fingerprint, the method further comprising the first DDE determining whether the first fingerprint is for a new segment or an old segment.
25. The method of claim 23, wherein if the first DDE owns the first fingerprint and the first segment is the new segment, the method further comprising the first DDE storing the first fingerprint and the first segment.
26. The method of claim 25, further comprising the first DDE determining a reference and providing the reference and the first segment to the remote DDE.
27. The method of claim 24, wherein if the first DDE owns the first fingerprint and the first segment is the old segment, the method further comprising the first DDE looking up a reference for the first segment and providing the reference to the remote DDE.
28. The method of claim 23, wherein if the second DDE owns the first fingerprint, the method further comprising the first DDE sending a message comprisin
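
The send-side flow recited in the claims (fingerprint a segment, find the owning DDE, then treat the segment as new or old) can be sketched as follows. This is a simplified in-process illustration, not the patented implementation: the class `DDE`, the helper `make_cluster`, the SHA-1 fingerprint, the modulo owner rule, and the integer reference format are all assumptions, and the fingerprint messages exchanged between local DDEs are modeled as direct method calls.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


class DDE:
    """Toy deduplication engine holding one slice of a shared dictionary."""

    def __init__(self, index: int, cluster: List["DDE"]):
        self.index = index
        self.cluster = cluster              # all local DDEs, including this one
        self.refs: Dict[bytes, int] = {}    # fingerprint -> reference
        self.segments: Dict[int, bytes] = {}  # reference -> segment data

    def owner_of(self, fp: bytes) -> "DDE":
        # Assumed owner rule: fingerprint value modulo the number of DDEs.
        return self.cluster[int.from_bytes(fp, "big") % len(self.cluster)]

    def lookup(self, fp: bytes) -> Optional[int]:
        """Return the stored reference, or None if the segment is new."""
        return self.refs.get(fp)

    def store(self, fp: bytes, segment: bytes) -> int:
        """Store a new fingerprint and segment and assign a reference."""
        ref = (self.index << 32) | len(self.refs)  # hypothetical reference format
        self.refs[fp] = ref
        self.segments[ref] = segment
        return ref

    def process(self, segment: bytes) -> Tuple[int, Optional[bytes]]:
        """Return what would go to the remote DDE: (reference, segment or None)."""
        fp = hashlib.sha1(segment).digest()
        owner = self.owner_of(fp)       # may be this engine or a peer
        ref = owner.lookup(fp)          # stands in for the fingerprint message
        if ref is None:                 # new segment: owner stores it
            ref = owner.store(fp, segment)
            return ref, segment         # send reference and segment
        return ref, None                # old segment: send reference only


def make_cluster(count: int) -> List[DDE]:
    """Build `count` engines that share one membership list."""
    cluster: List[DDE] = []
    cluster.extend(DDE(i, cluster) for i in range(count))
    return cluster
```

With `cluster = make_cluster(2)`, calling `cluster[0].process(data)` and then `cluster[1].process(data)` on the same bytes yields the same reference both times, with the segment included only on the first call, because both engines resolve the fingerprint to the same owner.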
Physics · mapped topic
Hash-based (content-based indexing of textual data G06F16/31) · CPC title
De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title