Maintaining a deduplication database
US-2015261792-A1 · Sep 17, 2015 · US
US9697247B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9697247-B2 |
| Application number | US-201414333391-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 16, 2014 |
| Priority date | Jul 16, 2014 |
| Publication date | Jul 4, 2017 |
| Grant date | Jul 4, 2017 |
Official abstract text for this publication.
The disclosure is directed to storing data in different tiers of a database based on the access pattern of the data. Immutable data, e.g., data that does not change or changes less often than a specified threshold, is stored in a first storage tier of the database, and mutable data, e.g., data that changes more often than immutable data, is stored in a second storage tier of the database. The second storage tier of the database is more performant than the first storage tier, e.g., the second storage tier has a higher write endurance and a lower write latency than the first storage tier. All writes to the database are performed at the second storage tier and reads on both storage tiers. The storage tiers are synchronized, e.g., the set of data is copied from the second to the first storage tier based on a trigger, e.g., a specified schedule.
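The write, read, and synchronization behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `TieredStore`, its attributes `fast_tier` and `slow_tier`, and the method names are hypothetical, and in-memory dictionaries stand in for real storage devices. The `merge` parameter follows the merge function mentioned in the claim text; the default (a pending update simply replaces the stored value) and the choice to combine repeated pending updates with the same function are assumptions.

```python
from typing import Any, Callable, Dict, Optional


def replace_with_update(stored: Any, update: Any) -> Any:
    """Default merge (an assumption): the pending update supersedes the stored value."""
    return update


class TieredStore:
    """Toy two-tier key-value store; plain dicts stand in for storage devices."""

    def __init__(self, merge: Callable[[Any, Any], Any] = replace_with_update) -> None:
        self.fast_tier: Dict[str, Any] = {}  # more performant tier: receives all writes
        self.slow_tier: Dict[str, Any] = {}  # tier for data that rarely changes
        self.merge = merge

    def write(self, key: str, value: Any) -> None:
        # Every write lands in the fast tier only. If an update for this key
        # is already pending, combine the two with the merge function.
        if key in self.fast_tier:
            self.fast_tier[key] = self.merge(self.fast_tier[key], value)
        else:
            self.fast_tier[key] = value

    def read(self, key: str) -> Optional[Any]:
        # Reads consult both tiers and merge a pending update into the stored item.
        in_fast = key in self.fast_tier
        in_slow = key in self.slow_tier
        if in_fast and in_slow:
            return self.merge(self.slow_tier[key], self.fast_tier[key])
        if in_fast:
            return self.fast_tier[key]
        if in_slow:
            return self.slow_tier[key]
        return None

    def flush(self) -> int:
        # Synchronize the tiers: fold pending fast-tier entries into the slow
        # tier, then clear the fast tier. Returns the number of entries flushed.
        flushed = len(self.fast_tier)
        for key, update in self.fast_tier.items():
            if key in self.slow_tier:
                self.slow_tier[key] = self.merge(self.slow_tier[key], update)
            else:
                self.slow_tier[key] = update
        self.fast_tier.clear()
        return flushed
```

In this sketch the caller decides when to invoke `flush()`; a scheduler firing on a fixed interval would be one way to model the "specified schedule" trigger mentioned in the abstract.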
Opening claim text (preview).
We claim:
1. A method performed by a computing system, comprising: identifying a first set of multiple data items in a database whose update frequency exceeds a specified threshold, the update frequency of a data item of the multiple data items being a frequency at which the data item is modified; identifying a second set of the multiple data items in the database whose update frequency is below the specified threshold; storing the first set in a first persistent storage layer of multiple persistent storage layers; storing the second set in a second persistent storage layer of the multiple persistent storage layers, the first persistent storage layer having a higher write endurance and lower write latency than the second persistent storage layer; receiving a read request to obtain a specified data item from the database; and executing the read request at both the first persistent storage layer and the second persistent storage layer to obtain the specified data item, wherein executing the read request includes: obtaining the specified data item from the second persistent storage layer, and an update that is to be applied to the specified data item from the first persistent storage layer, merging, based on a merge function, the update with the specified data item to generate an updated specified data item, and returning the updated specified data item as the specified data item.
2. The method of claim 1 further comprising: receiving a database access request at the computing system; executing the database access request at the first persistent storage layer in an event the database access request is a write request; and executing the database access request at the first persistent storage layer and the second persistent storage layer in an event the database access request is a read request.
3. The method of claim 1 further comprising: receiving a write request at the computing system to write a group of data items to the database, the group of data items including at least one of (a) one or more data items not stored at the database or (b) updates to a subset of the multiple data items stored at the database; and storing, in response to the write request, the group of data items at the first persistent storage layer.
4. The method of claim 3 further comprising: flushing, in response to a trigger condition, the group of data items from the first persistent storage layer to the second persistent storage layer.
5. The method of claim 4, wherein the flushing the group of data items to the second persistent storage layer includes at least one of: writing the one or more data items at the second persistent storage layer, or merging the updates with the corresponding subset of the multiple data items stored at the second persistent storage layer.
6. The method of claim 1 further comprising: transmitting the specified data item in response to the read request.
7. The method of claim 1, wherein the read request is executed at the first persistent storage layer and the second persistent storage layer in parallel.
8. The method of claim 1, wherein identifying the first set of the multiple data items and the second set of the multiple data items includes analyzing an access pattern of the multiple data items stored at the database to determine an update frequency of each of the multiple data items.
9. The method of claim 1, wherein the first persistent storage layer includes a first set of storage devices and the second persistent storage layer includes a second set of storage devices, the first set of storage devices having at least one of a write latency lesser than that of the second set of storage devices or a higher write endurance than that of the second set of storage devices.
10. The method of claim 1, wherein identifying the first set of the multiple data items whose update frequency exceeds the specified threshold includes: determining multiple update frequency levels, wherein each of the update frequency levels exceeds the specified threshold, and assigning each of the first set of the multiple data items to an update frequency level of the update frequency levels based on the update frequency of the corresponding data item.
11. The method of claim 10, wherein the first persistent storage layer is further categorized into multiple sub-layers, each of the sub-layers corresponding to one of the update frequency levels.
12. The method of claim 11, wherein the sub-layers include a first sub-layer corresponding to a highest update frequency levels of the update frequency levels, and wherein the sub-layers include a last sub-layer corresponding to a lowest update frequency levels of the update frequency levels.
13. The method of claim 12, wherein the first sub-layer includes a first set of storage devices and the last sub-layer includes a second set of storage devices, the first set of storage devices having a write latency lesser than that of the second set of storage devices.
14. A non-transitory computer-readable storage medium storing computer-executable instructions, which, when executed by a computer, performs a method comprising: analyzing an access pattern of multiple data items stored at a database to determine an update frequency of each of the multiple data items, the update frequency of a data item of the multiple data items being a frequency at which the data item is modified; assigning each of the multiple data items to one of multiple update frequency levels based on the update frequency of the corresponding data item; assigning a first set of the update frequency levels to a first persistent storage layer of multiple persistent storage layers, the assigning including: storing a first set of the multiple data items whose update frequency is within the first set of the update frequency levels at the first persistent storage layer; assigning a second set of the update frequency levels to a second persistent storage layer of the multiple persistent storage layers, the assigning including: storing a second set of the multiple data items whose update frequency is within the second set of the update frequency levels at the second persistent storage layer, the first persistent storage layer having a lower write latency than the second persistent storage layer; receiving a read request to obtain a specified data item from the database; and executing the read request at both the first persistent storage layer and the second persistent storage layer to obtain the specified data item, wherein executing the read request includes: obtaining the specified data item from the second persistent storage layer, and an update that is to be applied to the specified data item from the first persistent storage layer, merging, based on a merge function, the update with the specified data item to generate an updated specified data item, and returning the updated specified data item as the specified data item.
15. The computer-readable storage medium of claim 14, wherein the first set of the update frequency levels represent an update frequency that exceeds the second set of the update frequency levels.
16. The computer-readable storage medium of claim 14, wherein the first persistent storage layer is further categorized into multiple sub-layers, each of the sub-layers corresponding to one or more of the first set of the update frequency levels and stores a subset of the first set of the multiple data items corresponding to the one or more of the first set of the update frequency levels.
17. A system, comprising: a processor; a memory;
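The classification of data items by update frequency, as recited in the claims above, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the updates-per-day unit, the `level_bounds` list, and the layer labels `slow_layer` and `fast_sublayer_N` are all hypothetical and do not come from the patent text. The first entry of `level_bounds` plays the role of the specified threshold, and items whose frequency does not exceed it are placed in the slower layer.

```python
from bisect import bisect_left
from typing import Dict, List, Sequence


def assign_level(frequency: float, level_bounds: Sequence[float]) -> int:
    """Return the update frequency level for one data item.

    level_bounds must be sorted ascending; level_bounds[0] acts as the
    specified threshold. The result is the number of bounds strictly below
    the frequency: 0 means the threshold is not exceeded, and higher
    numbers mean more frequent updates.
    """
    return bisect_left(level_bounds, frequency)


def place_items(
    update_counts: Dict[str, int],
    observed_days: float,
    level_bounds: Sequence[float],
) -> Dict[str, List[str]]:
    """Group item keys by the storage layer or sub-layer they would go to."""
    placement: Dict[str, List[str]] = {"slow_layer": []}
    for level in range(1, len(level_bounds) + 1):
        placement[f"fast_sublayer_{level}"] = []

    for key, count in update_counts.items():
        # Update frequency derived from the observed access pattern.
        frequency = count / observed_days
        level = assign_level(frequency, level_bounds)
        target = "slow_layer" if level == 0 else f"fast_sublayer_{level}"
        placement[target].append(key)
    return placement


# Hypothetical data: update counts observed over a 10-day window.
counts = {"profile": 2, "edge": 10, "session": 50, "counter": 5000}
layout = place_items(counts, observed_days=10, level_bounds=[1.0, 10.0, 100.0])
```

Here the highest-numbered sub-layer holds the most frequently updated items. The claims number the sub-layers the other way around (the first sub-layer corresponds to the highest update frequency level), so the numbering in this sketch is purely a labeling choice.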
Update request formulation · CPC title
Clustering or classification · CPC title
Indexing; Web crawling techniques · CPC title
Physics · mapped topic