System and method for transaction continuity across failures in a scale-out database

US11983170B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11983170-B2
Application number  US-202318117810-A
Country             US
Kind code           B2
Filing date         Mar 6, 2023
Priority date       Oct 14, 2020
Publication date    May 14, 2024
Grant date          May 14, 2024

Abstract

Official abstract text for this publication.

One or more engine instances are executed on each host to form an engine cluster. A plurality of control instances are executed on a first set of hosts to form a control cluster and comprise a control instance leader and one or more control instance followers. In response to a first host indicating a failure of a neighbor host, a pair-wise focused investigation is initiated to check peer-to-peer connections between the first host and the neighbor host. In response to one or more additional hosts indicating failures of neighbor hosts while the pair-wise focused investigation is being performed, a wide investigation is performed to check connections between the control cluster and the plurality of hosts. One or more hosts are added to an eviction list and an eviction protocol is performed to evict the one or more hosts from the engine cluster using the eviction list.
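
The escalation rule in the abstract (start with a focused check of one host pair, and widen the investigation if other hosts report failures in the meantime) can be sketched as below. This is only an illustrative sketch, not the patented implementation: the class `InvestigationCoordinator`, its methods, and the `focused` and `wide` callbacks are hypothetical names introduced here, and each callback is assumed to return the set of hosts it decided to evict.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple

Report = Tuple[str, str]  # (reporting host, suspected neighbor host)


@dataclass
class InvestigationCoordinator:
    """Hypothetical coordinator; all names here are assumptions for illustration."""

    # Callbacks standing in for the two investigation types.
    focused: Callable[[str, str], Set[str]]
    wide: Callable[[], Set[str]]
    eviction_list: Set[str] = field(default_factory=set)
    active: Optional[Report] = None
    escalated: bool = False

    def report_failure(self, reporter: str, neighbor: str) -> str:
        """Record a failure report and return the investigation mode in effect."""
        if self.active is None:
            # First report: begin a pair-wise focused investigation.
            self.active = (reporter, neighbor)
        elif reporter != self.active[0]:
            # An additional host reported while the focused investigation
            # is still running: escalate to a wide investigation.
            self.escalated = True
        return "wide" if self.escalated else "focused"

    def finish(self) -> List[str]:
        """Run the selected investigation and return the sorted eviction list."""
        if self.active is None:
            return sorted(self.eviction_list)
        reporter, neighbor = self.active
        victims = self.wide() if self.escalated else self.focused(reporter, neighbor)
        self.eviction_list |= victims
        self.active, self.escalated = None, False
        return sorted(self.eviction_list)
```

In this sketch the eviction protocol itself is left out; `finish` only produces the eviction list that such a protocol would consume.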

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: on each host of a plurality of hosts, executing one or more engine instances of a plurality of engine instances to form an engine cluster, wherein each host of the plurality of hosts is a computing device; establishing neighbor relationships among the plurality of hosts; executing a plurality of control instances on a first set of hosts of the plurality of hosts to form a control cluster, wherein: each control instance executes on a distinct host of the first set of hosts, the plurality of control instances maintain data indicating the neighbor relationships between the plurality of hosts, and the plurality of control instances comprise a control instance leader and one or more control instance followers; in response to a first host within the plurality of hosts indicating a failure of a neighbor host, initiating a pair-wise focused investigation to check peer-to-peer connections between the first host and the neighbor host; in response to one or more additional hosts within the plurality of hosts indicating failures of neighbor hosts while the pair-wise focused investigation is being performed, performing a wide investigation to check connections between the control cluster and the plurality of hosts; adding one or more hosts to an eviction list based on results of the pair-wise focused investigation or the wide investigation; and performing an eviction protocol to evict the one or more hosts from the engine cluster using the eviction list.

2. The method of claim 1, wherein the pair-wise focused investigation comprises: determining whether the first host can reach the control instance leader; and in response to a determination that the first host cannot reach the control instance leader, adding the first host to the eviction list.

3. The method of claim 2, wherein the first host ceases to process client requests for data until recovery is performed to add the first host back into the engine cluster.

4. The method of claim 2, wherein the pair-wise focused investigation comprises: in response to a determination that the first host can reach the control instance leader, attempting contact, by the control instance leader, with the neighbor host; and in response to a determination that the control instance leader cannot communicate with the neighbor host, adding the neighbor host to the eviction list.

5. The method of claim 4, wherein the pair-wise focused investigation comprises: in response to a determination that the control instance leader can communicate with the neighbor host, determining whether the first host and the neighbor host have access to the same network; in response to a determination that the first host and the neighbor host do not have access to the same network, using a tie-breaking heuristic to determine which of the first host and the neighbor host is to be evicted; and adding the first host or the neighbor host to the eviction list based on results of the tie-breaking heuristic.

6. The method of claim 5, wherein the pair-wise focused investigation comprises: in response to a determination that the first host and the neighbor host have access to the same network, performing, by at least one control instance follower from the one or more control instance followers, a probing operation on the first host and the neighbor host; determining a connectivity score for each of the first host and the neighbor host based on results of the probing operation; and adding the first host or the neighbor host to the eviction list based the connectivity scores of the first host and the neighbor host.

7. The method of claim 6, wherein the connectivity score is based on a number of members of the control cluster can communicate with the first host or the neighbor host.

8. The method of claim 6, wherein the pair-wise focused investigation comprises: in response to a determination that the connectivity score of the first host and the connectivity score of the neighbor host are not equal, adding the first host or the neighbor host having a lower connectivity score to the eviction list; and in response to a determination that the connectivity score of the first host and the connectivity score of the neighbor host are equal, using the tie-breaking heuristic to determine which of the first host and the neighbor host is to be evicted; and adding the first host or the neighbor host to the eviction list based on results of the tie-breaking heuristic.

9. The method of claim 1, wherein the wide investigation comprises: classifying connections between the control instance leader and each host of the plurality of hosts; and adding a subset of the plurality of hosts to the eviction list based on results of classifying.

10. The method of claim 9, wherein evictions of the subset of the plurality of hosts are batched.

11. The method of claim 9, wherein classifying the connections between the control instance leader and each host of the plurality of hosts comprises classifying hosts that are unreachable into a subset of unreachable hosts, wherein the subset of the plurality of hosts includes the subset of unreachable hosts.

12. The method of claim 9, wherein classifying the connections between the control instance leader and each host of the plurality of hosts comprises: classifying one or more hosts to which the control instance leader has indirect access through a single network within a plurality of networks as a subset of fringe hosts; identifying, by the control instance leader, at least one host within the subset of fringe hosts to be evicted; and adding the identified at least one host to the eviction list.

13. The method of claim 12, wherein identifying the at least one host within the subset of fringe hosts to be evicted comprises identifying at least one host that does not include a control instance.

14. The method of claim 12, wherein: the first set of hosts in the control cluster are connected using a first network, the plurality of hosts in the engine cluster are connected using a second network, and identifying the at least one host within the subset of fringe hosts to be evicted comprises identifying at least one host connected to the control instance leader using the first network.

15. The method of claim 14, wherein classifying the connections between the control instance leader and each host of the plurality of hosts comprises classifying at least one host connected to the control instance leader using the first network and the second network as a subset of fully connected hosts.

16. One or more non-transitory storage media storing instructions which, when executed by one or more computing devices, cause performance of a method comprising: on each host of a plurality of hosts, executing one or more engine instances of a plurality of engine instances to form an engine cluster, wherein each host of the plurality of hosts is a computing device; establishing neighbor relationships among the plurality of hosts; executing a plurality of control instances on a first set of hosts of the plurality of hosts to form a control cluster, wherein: each control instance executes on a distinct host of the first set of hosts, the plurality of control instances maintain data indicating the neighbor relationships between the plurality of hosts, and the plurality of control instances comprise a control instance leader and one or more control instance followers; in response to a first host within the plurality of hosts indicating a failure of a neighbor host, initiating a pair-wise focused investigation to ch
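
The decision sequence that claims 2 through 8 describe for the pair-wise focused investigation can be read as a small decision tree. The following is a minimal sketch under stated assumptions, not the patented implementation: `ClusterView`, `connectivity_score`, `tie_break`, and `focused_investigation` are hypothetical names; "access to the same network" is assumed to mean the two hosts share at least one network; the connectivity score is assumed to count the control cluster members (leader and followers) that can communicate with a host; and the tie-breaking heuristic, which the claims do not specify, is replaced by an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set


@dataclass
class ClusterView:
    """Hypothetical snapshot of connectivity; every field is an assumption."""

    leader: str
    followers: List[str]
    links: Set[FrozenSet[str]]      # unordered host pairs that can communicate
    networks: Dict[str, Set[str]]   # host -> networks the host has access to

    def can_talk(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self.links


def connectivity_score(view: ClusterView, host: str) -> int:
    # Assumed reading of claim 7: count control cluster members reaching the host.
    members = [view.leader, *view.followers]
    return sum(1 for m in members if view.can_talk(m, host))


def tie_break(a: str, b: str) -> str:
    # Placeholder heuristic (not from the patent): evict the larger host name.
    return max(a, b)


def focused_investigation(
    view: ClusterView,
    first: str,
    neighbor: str,
    heuristic: Callable[[str, str], str] = tie_break,
) -> str:
    """Return the host to add to the eviction list."""
    # Claim 2: the reporting host cannot reach the leader, so it is evicted.
    if not view.can_talk(first, view.leader):
        return first
    # Claim 4: the leader cannot contact the neighbor, so the neighbor is evicted.
    if not view.can_talk(view.leader, neighbor):
        return neighbor
    # Claim 5: no shared network, so fall back to the tie-breaking heuristic.
    shared = view.networks.get(first, set()) & view.networks.get(neighbor, set())
    if not shared:
        return heuristic(first, neighbor)
    # Claims 6 and 8: compare connectivity scores; the lower score is evicted.
    s_first = connectivity_score(view, first)
    s_neighbor = connectivity_score(view, neighbor)
    if s_first != s_neighbor:
        return first if s_first < s_neighbor else neighbor
    # Claim 8: equal scores, so use the tie-breaking heuristic.
    return heuristic(first, neighbor)
```

A caller would pass the reporting host as `first` and the suspected host as `neighbor`, then add the returned host to the eviction list.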

Assignees

Oracle Int Corp

Classifications

  • Updates performed during online database operations; commit processing · CPC title

  • Clustering or classification · CPC title

  • G06F16/278 · Primary

    Data partitioning, e.g. horizontal or vertical partitioning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F16/2379. Mapped technology areas include Physics.
When was this patent published?
Publication date May 14, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).