US2016019254A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016019254-A1 |
| Application number | US-201414333391-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 16, 2014 |
| Priority date | Jul 16, 2014 |
| Publication date | Jan 21, 2016 |
| Grant date | — |
Official abstract text for this publication.
The disclosure is directed to storing data in different tiers of a database based on the access pattern of the data. Immutable data, e.g., data that does not change or changes less often than a specified threshold, is stored in a first storage tier of the database, and mutable data, e.g., data that changes more often than immutable data, is stored in a second storage tier of the database. The second storage tier of the database is more performant than the first storage tier, e.g., the second storage tier has a higher write endurance and a lower write latency than the first storage tier. All writes to the database are performed at the second storage tier, while reads are performed on both storage tiers. The storage tiers are synchronized, e.g., the set of data is copied from the second storage tier to the first storage tier based on a trigger such as a specified schedule.
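The write, read, and synchronization paths described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: both tiers are plain in-memory dictionaries, and the names `TieredStore`, `slow_tier`, `fast_tier`, `merge`, and `sync` are hypothetical. The sketch also assumes that synchronization empties the fast tier after copying, which the abstract does not state.

```python
from typing import Callable, Dict, Optional


class TieredStore:
    """Toy two-tier key-value store; each tier is a dict standing in for real storage."""

    def __init__(self, merge: Callable[[str, str], str] = lambda old, new: new):
        # "First storage tier" of the abstract: rarely changing data.
        self.slow_tier: Dict[str, str] = {}
        # "Second storage tier" of the abstract: every write lands here.
        self.fast_tier: Dict[str, str] = {}
        # Hypothetical merge function; the default lets the newer value win.
        self.merge = merge

    def write(self, key: str, value: str) -> None:
        """All writes go to the fast tier only."""
        self.fast_tier[key] = value

    def read(self, key: str) -> Optional[str]:
        """Reads consult both tiers and merge the two versions if both exist."""
        new = self.fast_tier.get(key)
        old = self.slow_tier.get(key)
        if new is not None and old is not None:
            return self.merge(old, new)
        return new if new is not None else old

    def sync(self) -> int:
        """Copy pending data from the fast tier to the slow tier.

        Assumption: the fast tier is cleared afterwards. Returns the
        number of items moved.
        """
        moved = len(self.fast_tier)
        for key, value in self.fast_tier.items():
            old = self.slow_tier.get(key)
            self.slow_tier[key] = value if old is None else self.merge(old, value)
        self.fast_tier.clear()
        return moved
```

In a real deployment `sync` would be driven by the trigger the abstract mentions, for example a scheduler. Note that the claims number the layers the other way around: there the first persistent storage layer is the low-latency one.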
Opening claim text (preview).
I/We claim:
1. A method performed by a computing system, comprising: identifying a first set of multiple data items in a database whose update frequency exceeds a specified threshold; identifying a second set of the data items in the database whose update frequency is below the specified threshold; storing the first set in a first persistent storage layer of multiple persistent storage layers; and storing the second set in a second persistent storage layer of the persistent storage layers.
2. The method of claim 1 further comprising: receiving a database access request at the computing system; executing the database access request at the first persistent storage layer if the database access request is a write request; and executing the database access request at the first persistent storage layer and the second persistent storage layer if the database access request is a read request.
3. The method of claim 1 further comprising: receiving a write request at the computing system to write a group of data items to the database, the group of data items including at least one of (a) one or more data items not stored at the database or (b) updates to a subset of the data items stored at the database; and storing, in response to the write request, the group of data items at the first persistent storage layer.
4. The method of claim 3 further comprising: flushing, in response to a trigger condition, the group of data items from the first persistent storage layer to the second persistent storage layer.
5. The method of claim 4, wherein the flushing the group of data items to the second persistent storage layer includes at least one of: writing the one or more data items at the second persistent storage layer, or merging the updates with the corresponding subset of the data items stored at the second persistent storage layer.
6. The method of claim 1 further comprising: receiving a read request at the computing system to obtain a data item from the database; executing the read request at the first persistent storage layer and the second persistent storage layer; obtaining the data item from at least one of the first persistent storage layer and the second persistent storage layer; and transmitting the data item in response to the read request.
7. The method of claim 6, wherein obtaining the data item from at least one of the first persistent storage layer and the second persistent storage layer includes: obtaining a first version of the data item from the first persistent storage layer, obtaining a second version of the data item from the second persistent storage layer, the first version being an update to the second version of the data item, and merging, based on a specified function, the first version of the data item and the second version of the data item to generate the data item.
8. The method of claim 6, wherein the read request is executed at the first persistent storage layer and the second persistent storage layer in parallel.
9. The method of claim 1, wherein identifying the first set of the data items and the second set of the data items includes analyzing an access pattern of the data items stored at the database to determine an update frequency of each of the data items.
10. The method of claim 1, wherein the first persistent storage layer includes a first set of storage devices and the second persistent storage layer includes a second set of storage devices, the first set of storage devices having at least one of a write latency lesser than that of the second set of storage devices or a higher write endurance than that of the second set of storage devices.
11. The method of claim 1, wherein identifying the first set of the data items whose update frequency exceeds the specified threshold includes: determining multiple update frequency levels, wherein each of the update frequency levels exceeds the specified threshold, and assigning each of the first set of the data items to an update frequency level of the update frequency levels based on the update frequency of the corresponding data item.
12. The method of claim 11, wherein the first persistent storage layer is further categorized into multiple sub-layers, each of the sub-layers corresponding to one of the update frequency levels.
13. The method of claim 12, wherein the sub-layers include a first sub-layer corresponding to a highest update frequency level of the update frequency levels, and wherein the sub-layers include a last sub-layer corresponding to a lowest update frequency level of the update frequency levels.
14. The method of claim 13, wherein the first sub-layer includes a first set of storage devices and the last sub-layer includes a second set of storage devices, the first set of storage devices having a write latency lesser than that of the second set of storage devices.
15. A computer-readable storage medium storing computer-executable instructions, comprising: instructions for analyzing an access pattern of multiple data items stored at a database to determine an update frequency of each of the data items; instructions for assigning each of the data items to one of multiple update frequency levels based on the update frequency of the corresponding data item; instructions for assigning a first set of the update frequency levels to a first persistent storage layer of multiple persistent storage layers, the assigning including: storing a first set of the data items whose update frequency is within the first set of the update frequency levels at the first persistent storage layer; and instructions for assigning a second set of the update frequency levels to a second persistent storage layer of the persistent storage layers, the instructions for assigning including: instructions for storing a second set of the data items whose update frequency is within the second set of the update frequency levels at the second persistent storage layer, the first persistent storage layer having a lower write latency than the second persistent storage layer.
16. The computer-readable storage medium of claim 15, wherein the first set of the update frequency levels represent an update frequency of a specified data item that exceeds a specified threshold.
17. The computer-readable storage medium of claim 15, wherein the first persistent storage layer is further categorized into multiple sub-layers, each of the sub-layers corresponding to one or more of the first set of the update frequency levels and storing a subset of the first set of the data items corresponding to the one or more of the first set of the update frequency levels.
18. A system, comprising: a processor; a first module configured to determine a first set of multiple data items whose update frequency exceeds a specified threshold and a second set of the data items whose update frequency is below the specified threshold, the data items stored in a database of the system, the database stored across multiple persistent storage layers of the system, the update frequency of a specified data item being a frequency at which the specified data item is modified; a second module configured to store the first set of the data items at a first persistent storage layer of the persistent storage layers and the second set of the data items at a second persistent storage layer of the persistent storage layers, the first persistent storage layer having a higher write endurance and a lower write latency than the second persistent storage layer; a third mod
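The classification step in claims 1, 9, and 11 to 13 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: `update_frequency`, `assign_layers`, the `level_bounds` parameter, and the layer labels are hypothetical; update frequency is assumed to mean updates per unit of an observation window; and an item whose frequency exactly equals the threshold is assumed to go to the second layer, a case the claims leave open.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence


def update_frequency(update_times: Sequence[float], window: float) -> float:
    """Updates per unit time over an observation window (assumed definition)."""
    return len(update_times) / window


def assign_layers(
    freqs: Dict[str, float],
    threshold: float,
    level_bounds: List[float],
) -> Dict[str, str]:
    """Place each item in a storage layer based on its update frequency.

    Items above `threshold` go to the first (low write latency) layer,
    which is split into sub-layers. `level_bounds` holds ascending
    boundaries, all above `threshold`, that separate the update frequency
    levels. Sub-layer 0 holds the highest level, as in claim 13.
    All other items go to the second layer.
    """
    placement: Dict[str, str] = {}
    for item, freq in freqs.items():
        if freq > threshold:
            level = bisect_right(level_bounds, freq)   # 0 = lowest level
            sub_layer = len(level_bounds) - level      # 0 = highest level
            placement[item] = f"first/sub{sub_layer}"
        else:
            placement[item] = "second"
    return placement
```

For example, with a threshold of 1.0 and a single boundary at 10.0, an item updated 50 times per unit time lands in `first/sub0`, an item updated 5 times lands in `first/sub1`, and an item updated 0.1 times lands in `second`.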
Update request formulation · CPC title
Indexing; Web crawling techniques · CPC title
Clustering or classification · CPC title
Physics · mapped topic