Inline Wire Speed Deduplication System
US-2016306853-A1 · Oct 20, 2016 · US
US2016283165A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016283165-A1 |
| Application number | US-201415033265-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 30, 2014 |
| Priority date | Nov 8, 2013 |
| Publication date | Sep 29, 2016 |
| Grant date | — |
Official abstract text for this publication.
Ingest data for virtual volumes (V) is split into segments (B1, B2, B3, B4) of a size that can be buffered in main memory. Data deduplication processing then occurs directly on the segments (B1, B2, B3, B4) in main memory, without the need for disk I/O.
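The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the segment size, the names `split_into_segments`, `InMemoryDedupStore`, and `ingest_volume`, and the use of whole-segment SHA-256 digests as deduplication keys are all assumptions made for illustration. The abstract does not say how the deduplication engine detects duplicates.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List

SEGMENT_SIZE = 8  # hypothetical, tiny on purpose; a real segment would be far larger


def split_into_segments(stream: Iterable[bytes],
                        segment_size: int = SEGMENT_SIZE) -> Iterator[bytes]:
    """Yield segments of at most segment_size bytes from an incoming stream."""
    buffer = bytearray()  # the in-memory buffer that holds the current segment
    for chunk in stream:
        buffer.extend(chunk)
        while len(buffer) >= segment_size:
            yield bytes(buffer[:segment_size])
            del buffer[:segment_size]
    if buffer:  # stream closed: emit the final, possibly shorter, segment
        yield bytes(buffer)


class InMemoryDedupStore:
    """Toy store that keeps one copy per distinct segment (assumed SHA-256 keys)."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, segment: bytes) -> str:
        tag = hashlib.sha256(segment).hexdigest()
        self.blobs.setdefault(tag, segment)  # store only if not seen before
        return tag

    def get(self, tag: str) -> bytes:
        return self.blobs[tag]


def ingest_volume(stream: Iterable[bytes], store: InMemoryDedupStore,
                  segment_size: int = SEGMENT_SIZE) -> List[str]:
    """Deduplicate each segment straight from memory; return the ordered tags."""
    return [store.put(seg) for seg in split_into_segments(stream, segment_size)]
```

The ordered list of tags is enough to rebuild the volume, while identical segments occupy storage only once. Each segment is handed to the store directly from the in-memory buffer, so no intermediate copy of the volume is written to disk before deduplication.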
Opening claim text (preview).
1.-15. (canceled)
16. A storage appliance comprising: an interface that provides access to at least one virtual tape drive for at least one host computer; a first database that stores metadata about virtual tape volumes received via the interface; at least one volatile main memory; and a deduplication engine that deduplicates storage objects; wherein each virtual tape volume is represented within the storage appliance as an ordered set of segments of data and the storage appliance is configured to perform the following steps: providing, from the volatile memory, at least one buffer memory to store a segment of data; receiving a data stream representing a virtual tape volume of the at least one virtual tape drive at the first interface; filling the buffer memory with data from the received data stream until the received data stream is closed, and a synchronization point of the virtual tape volume is identified in the received data stream or a predefined amount of data has been stored in the buffer memory; deduplicating with the deduplication engine the segment of data stored in the at least one buffer memory; and storing the deduplicated segment of data in a non-volatile storage device.
17. The storage appliance according to claim 16, wherein the metadata stored in the first database comprises mappings of segments of each virtual tape volume to tags of a binary large object, BLOB, corresponding to the deduplicated segment of data.
18. The storage appliance according to claim 16, wherein data received from the interface is appended to an existing virtual tape volume by: creating additional segments for the data to be appended; deduplicating and storing the additional segments, and adding indices of the additional deduplicated segments to the end of the ordered set.
19. The storage appliance according to claim 16, wherein data of an existing virtual tape volume is modified by: identifying an index of a first segment of data to be modified; reading and re-duplicating the deduplicated first segment from the deduplication engine into a memory buffer; invalidating all segments of data having an index equal to or greater than the identified index and deleting the corresponding indices from the ordered set; creating a new segment of data based on the buffered first segment and a modification request received from the interface; deduplicating and storing the new segment, and adding the index of the new segment to the end of the ordered set.
20. The storage appliance according to claim 16, wherein data of an existing virtual tape volume is read from a predetermined position of the virtual tape volume by: identifying an index of a first segment of data to be read; reading and re-duplicating the first segment by the deduplication engine into a memory buffer without previously re-duplicating any segment having a lower index than the first segment; and providing the data from the predetermined position from the buffer memory via the interface.
21. The storage appliance according to claim 16, wherein the virtual tape volume of the at least one virtual tape drive has a capacity exceeding the predefined amount of data.
22. The storage appliance according to claim 21, wherein the virtual tape volume of the at least one virtual tape drive has a capacity of at least 200 GB and the predefined amount of data is significantly smaller than 200 GB.
23. The storage appliance according to claim 16, wherein the storage appliance is configured to process a plurality of virtual tape volumes of the at least one virtual tape received in parallel, and, for each one of the plurality of virtual tape volumes received in parallel, a separate buffer memory that stores a segment of data of the respective virtual tape volume is provided.
24. The storage appliance according to claim 16, wherein the storage appliance is configured to provide, from the volatile memory, a plurality of buffer memories, each buffer memory storing a segment of data and, wherein, while deduplication and/or storing of a segment of data received from the data stream and stored in a first buffer memory is performed, at least one second buffer memory is filled with data of a subsequent segment from the received data stream.
25. The storage appliance according to claim 24, wherein a received file mark is only acknowledged to the application after all segments of data preceding the file mark have been successfully processed and stored by the deduplication engine.
26. The storage appliance according to claim 16, wherein, if a first segment of data is requested via the interface, the first segment of data is read and re-duplicated by the deduplication engine and provided to the interface and, before a subsequent request for a subsequent second segment of data is received, the second segment of data is read and re-duplicated by the deduplication engine and stored in the at least one memory buffer.
27. The storage appliance according to claim 26, wherein, on receipt of the subsequent request for the second segment of data, the second segment of data is provided from the at least one buffer memory to the interface.
28. The storage appliance according to claim 16, further comprising at least one Integrated Channel Processor that provides access to the at least one virtual tape drive for the at least one host computer, wherein the Integrated Channel Processor and the deduplication engine exchange data through at least one shared memory buffer.
29. The storage appliance according to claim 28, wherein, on mounting a virtual tape volume, the storage appliance determines, based on configuration information, whether the virtual tape volume comprises deduplicated segments of data, and, if the virtual tape volume comprises deduplicated segments of data, the deduplication engine is assigned to the Integrated Channel Processor to handle input/output requests.
30. The storage appliance according to claim 16, further comprising at least one Integrated Channel Processor that runs a de-duplication client and at least one Integrated Device Processor that runs a deduplication server.
31. A method of segmenting a virtual tape volume provided by an application via an interface to a storage virtualization appliance, the storage virtualization appliance comprising a deduplication engine and at least one memory buffer that stores a segment of the virtual tape volume, wherein the segmentation of the virtual tape volume is based on a tape semantic of the interface of the storage virtualization appliance and is performed by repeating the following steps: filling the memory buffer with data of the virtual tape volume received from the application until a segment of the virtual tape volume is considered complete, wherein the segment is considered complete when the memory buffer has been filled to a predefined maximum segment size, when the application writes a file mark serving as a synchronization point, or when the virtual tape volume is closed by the application; passing a segment of the virtual tape volume for storage to the deduplication engine when the segment is considered complete, and processing, by the deduplication engine, the passed segment of the virtual tape volume directly from the memory buffer for deduplicated storage.
32. The method according to claim 31, wherein the storage virtualization appliance comprises a first database for storing metadata about virtual tape volumes received via the interface, and wherein each virtual tape volume is represented within the s
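The segment handling described in the claims above (completing a segment at a maximum size, a file mark, or volume close; appending; modifying from a position; and reading from a position without restoring earlier segments) can be sketched roughly as follows. This is an illustrative model only. `DedupEngine`, `VirtualTapeVolume`, every method name, the 8-byte segment limit, and the SHA-256 tags are hypothetical, and the sketch does not record the file mark itself in the data.

```python
import hashlib
from typing import Dict, List, Tuple


class DedupEngine:
    """Toy engine: one stored copy per distinct segment, keyed by SHA-256 (assumed)."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.load_count = 0  # how many segments were read back ("re-duplicated")

    def store(self, data: bytes) -> str:
        tag = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(tag, data)
        return tag

    def load(self, tag: str) -> bytes:
        self.load_count += 1
        return self.blobs[tag]


class VirtualTapeVolume:
    def __init__(self, engine: DedupEngine, max_segment_size: int = 8) -> None:
        self.engine = engine
        self.max_segment_size = max_segment_size
        self.segments: List[Tuple[str, int]] = []  # ordered set of (tag, length)
        self.buffer = bytearray()

    def _flush(self) -> None:
        """Pass the buffered segment to the engine and record it at the end."""
        if self.buffer:
            self.segments.append((self.engine.store(bytes(self.buffer)),
                                  len(self.buffer)))
            self.buffer.clear()

    def write(self, data: bytes) -> None:
        pos = 0
        while pos < len(data):
            room = self.max_segment_size - len(self.buffer)
            take = data[pos:pos + room]
            self.buffer.extend(take)
            pos += len(take)
            if len(self.buffer) == self.max_segment_size:
                self._flush()  # segment complete: maximum size reached

    def write_file_mark(self) -> None:
        self._flush()  # segment complete: synchronization point

    def close(self) -> None:
        self._flush()  # segment complete: volume closed

    def read_at(self, position: int, length: int) -> bytes:
        """Read from a byte position, loading only the segments that are needed."""
        out, start = bytearray(), 0
        for tag, seg_len in self.segments:
            if start + seg_len > position:
                lo = max(position - start, 0)
                out.extend(self.engine.load(tag)[lo:lo + length - len(out)])
                if len(out) >= length:
                    break
            start += seg_len
        return bytes(out)

    def modify_from(self, position: int, new_data: bytes) -> None:
        """Overwrite from position: later segments are invalidated (tape semantics)."""
        start = 0
        for index, (tag, seg_len) in enumerate(self.segments):
            if start + seg_len > position:
                break
            start += seg_len
        else:
            raise ValueError("position beyond end of volume")
        kept = self.engine.load(tag)[:position - start]
        del self.segments[index:]  # drop this index and all higher ones
        self.buffer[:] = kept      # rebuild the segment in the memory buffer
        self.write(new_data)
        self._flush()
```

Appending in this model is simply `write` followed by `close` on an existing volume, because new segments always go to the end of the ordered list. Storing each segment's length next to its tag is one possible way to find the segment for a byte position without loading any lower-indexed segment.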
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
Data buffering arrangements · CPC title
at device level, e.g. emulation of a storage device or system · CPC title
Improving I/O performance · CPC title
Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title