Identifying and resolving differences between datastores

US2022188288A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2022188288-A1
Application numberUS-202017117900-A
CountryUS
Kind codeA1
Filing dateDec 10, 2020
Priority dateDec 10, 2020
Publication dateJun 16, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for efficiently maintaining consistency of data items across storage partitions are disclosed using a hierarchical multi-level hash tree. Copies of a data item may be associated with corresponding attributes that are used to generate hash values for the data item. Hash values of the attributes may then be used to label nodes in a multi-level hash tree. Differences between the replicated copies of a data item may be quickly identified by comparing hash values associated with successively lower peer nodes in corresponding hash trees. Once identified, systems may update versions of a data item that are no longer current.

First claim

Opening claim text (preview).

What is claimed is: 1 . One or more non-transitory computer-readable media storing instructions, which when executed by one or more hardware processors, cause performance of operations comprising: identifying a plurality of attributes of a first data item of a plurality of data items, wherein the plurality of attributes comprise at least one of: an account identifier identifying an account associated with the first data item; a time associated with a data transaction event executed on the first data item; a table identifier identifying a data structure associated with the first data item; or a data storage tier identifier identifying storage of the first data item in an active tier or inactive tier; applying a hash function to the plurality of attribute values to compute a first hash value; mapping the first hash value to a first leaf node of a first plurality of leaf nodes of a first multi-level hash tree; storing an association between the first data item and the first leaf node; updating a node hash value corresponding to the first leaf node based on the first data item; and updating node hash values corresponding to nodes of the first leaf node in the first multi-level hash tree that are directly or indirectly connected to the first leaf node, wherein updating each node hash value includes updating a hash value for at least one node at each level of the multi-level hash tree. 2 . The one or more media of claim 1 , further comprising: comparing node hash values corresponding to the first multi-level hash tree to node hash values corresponding to a second multi-level hash tree until a difference is identified between (a) the node hash value corresponding to the first leaf node of the first plurality of leaf nodes of the first multi-level hash tree and (b) a second leaf node in a second plurality of leaf nodes of the second multi-level hash tree that is a peer of the first leaf node; responsive to identifying the difference: identifying the first data item based on the association between the first data item and the first leaf node; determining that the first data item, associated with the first leaf node, is an updated version of a second data item associated with the second leaf node; and updating the second data item to match the first data item. 3 . The one or more media of claim 2 , wherein the comparing operation comprises: selecting, for comparison, a target node of the first multi-level hash tree and a peer target node of the second multi-level hash tree; responsive to determining that a first node hash value of the target node of the first multi-level hash tree and a second node hash value of the peer node of the second multi-level hash tree are different, executing a descending comparison of hash values of child nodes of the target node to hash values of peer child nodes of the second multi-level hash tree; and wherein the child nodes of the first multi-level hash tree are selected more frequently in a first descending direction for comparison to the peer child nodes of the second multi-level hash tree than in a second descending direction. 4 . The one or more media of claim 3 , wherein child nodes of the first multi-level hash tree in the first descending direction are selected from two times to four times more often than child nodes in the second descending direction. 5 . The one or more media of claim 1 , wherein the operations further comprise determining a storage location for storing the first data item based on a second hash value corresponding to the plurality of attributes. 6 . The one or more media of claim 1 , wherein data items corresponding to first leaf node are stored in a particular memory region. 7 . The one or more media of claim 6 , wherein the particular memory region is identified in the first leaf node using a data item identifier. 8 . The one or more media of claim 1 , wherein the plurality of attributes comprises the account identifier. 9 . The one or more media of claim 1 , wherein the plurality of attributes comprises the time associated with a data transaction event. 10 . The one or more media of claim 1 , wherein the plurality of attributes comprises the table identifier. 11 . The one or more media of claim 1 , wherein the plurality of attributes comprises the data storage tier identifier. 12 . A method comprising: identifying a plurality of attributes of a first data item of a plurality of data items, wherein the plurality of attributes comprise at least one of: an account identifier identifying an account associated with the first data item; a time associated with a data transaction event executed on the first data item; a table identifier identifying a data structure associated with the first data item; or a data storage tier identifier identifying storage of the first data item in an active tier or inactive tier; applying a hash function to the plurality of attribute values to compute a first hash value; mapping the first hash value to a first leaf node of a first plurality of leaf nodes of a first multi-level hash tree; storing an association between the first data item and the first leaf node; updating a node hash value corresponding to the first leaf node based on the first data item; and updating node hash values corresponding to nodes of the first leaf node in the first multi-level hash tree that are directly or indirectly connected to the first leaf node, wherein updating each node hash value includes updating a hash value for at least one node at each level of the multi-level hash tree. 13 . The one or more media of claim 12 , further comprising: comparing node hash values corresponding to the first multi-level hash tree to node hash values corresponding to a second multi-level hash tree until a difference is identified between (a) the node hash value corresponding to the first leaf node of the first plurality of leaf nodes of the first multi-level hash tree and (b) a second leaf node in a second plurality of leaf nodes of the second multi-level hash tree that is a peer of the first leaf node; responsive to identifying the difference: identifying the first data item based on the association between the first data item and the first leaf node; determining that the first data item, associated with the first leaf node, is an updated version of a second data item associated with the second leaf node; and updating the second data item to match the first data item. 14 . The one or more media of claim 13 , wherein the comparing operation comprises: selecting, for comparison, a target node of the first multi-level hash tree and a peer target node of the second multi-level hash tree; responsive to determining that a first node hash value of the target node of the first multi-level hash tree and a second node hash value of the peer node of the second multi-level hash tree are different, executing a descending comparison of hash values of child nodes of the target node to hash values of peer child nodes of the second multi-level hash tree; and wherein the child nodes of the first multi-level hash tree are selected more frequently in a first descending direction for comparison to the peer child nodes of the second multi-level hash tree than in a second descending direction. 15 . The one or more media of claim 14 , wherein child nodes of the first multi-level hash tree in the first descending direction are selected from two times to four times more often than child nodes in the second descending direction. 16 . The one or more media of claim 12 , wherein the operations furth

Assignees

Inventors

Classifications

  • Updates performed during online database operations; commit processing · CPC title

  • Hash tables · CPC title

  • Ensuring data consistency and integrity · CPC title

  • Trees, e.g. B+trees · CPC title

  • Techniques for file synchronisation in file systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022188288A1 cover?
Techniques for efficiently maintaining consistency of data items across storage partitions are disclosed using a hierarchical multi-level hash tree. Copies of a data item may be associated with corresponding attributes that are used to generate hash values for the data item. Hash values of the attributes may then be used to label nodes in a multi-level hash tree. Differences between the replica…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F16/2365. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Jun 16 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).