What technology area does this patent fall under?

Primary CPC classification G06F16/2365. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Mar 07 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for transaction continuity across failures in a scale-out database

US11599421B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11599421-B2
Application number	US-202017137745-A
Country	US
Kind code	B2
Filing date	Dec 30, 2020
Priority date	Oct 14, 2020
Publication date	Mar 7, 2023
Grant date	Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A shared-nothing database system is provided in which parallelism and workload balancing are increased by assigning the rows of each table to “slices”, and storing multiple copies (“duplicas”) of each slice across the persistent storage of multiple nodes of the shared-nothing database system. When the data for a table is distributed among the nodes of a shared-nothing system in this manner, requests to read data from a particular row of the table may be handled by any node that stores a duplica of the slice to which the row is assigned. For each slice, a single duplica of the slice is designated as the “primary duplica”. All DML operations (e.g. inserts, deletes, updates, etc.) that target a particular row of the table are performed by the node that has the primary duplica of the slice to which the particular row is assigned. The changes made by the DML operations are then propagated from the primary duplica to the other duplicas (“secondary duplicas”) of the same slice.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: assigning rows of a table to a plurality of slices; wherein each row of the table is assigned to a single slice of the plurality of slices; for each slice of the plurality of slices, storing a plurality of duplicas; wherein each duplica of each slice contains rows of the table that belong to the slice; wherein, for each slice of the plurality of slices, storing the plurality of duplicas includes: storing a primary duplica in one persistent storage of a plurality of persistent storages; and storing one or more secondary duplicas in one or more persistent storages of the plurality of persistent storages; wherein each persistent storage of the plurality of persistent storages is local to and directly accessible only by a single engine instance of a plurality of engine instances; wherein each engine instance of the plurality of engine instances includes logic for accessing rows stored in the table; wherein the plurality of persistent storages includes a first persistent storage and a second persistent storage; wherein the first persistent storage stores the primary duplica for a particular slice that includes data for a particular row of the table; wherein the second persistent storage stores a particular secondary duplica for the particular slice; receiving a request to perform a transaction that affects data in the particular row of the table; initiating execution of the transaction at a first engine instance that is local to the first persistent storage; in response to the first engine instance ceasing to function prior to committing the transaction: establishing the particular secondary duplica on the second persistent storage as a new primary duplica for the particular slice; and causing the transaction to be resumed and committed by a second engine instance that is local to the second persistent storage. 2. The method of claim 1 wherein: the transaction is a multi-statement transaction; the request to perform the transaction includes a series of statement-level requests; and each statement-level request is a request to perform a corresponding statement of the multi-statement transaction. 3. The method of claim 1 wherein: the particular secondary duplica is one of a plurality of secondary duplicas of the particular slice; and establishing the particular secondary duplica as the new primary duplica is performed based on the particular secondary duplica having more transaction log records for the transaction than any other of the plurality of secondary duplicas of the particular slice. 4. The method of claim 3 further comprising, prior to the second engine instance resuming the transaction, transmitting to each secondary duplica, of the other of the plurality of secondary duplicas, any transaction log records that are missing at the secondary duplica. 5. The method of claim 1 further comprising: while the first engine instance is executing the transaction causing a particular change made to data in the primary duplica to be propagated to the particular secondary duplica; wherein causing the transaction to be resumed by the second engine instance is performed without the second engine instance repeating execution of the particular change. 6. The method of claim 5 wherein: the request to perform the transaction is received from a client; the method further comprises: after initiating propagation of the particular change to the particular secondary duplica, the first engine instance communicates to the client that the particular change was successfully executed; the client stores transaction status data that indicates that the particular change was successfully executed; causing the transaction to be resumed by the second engine instance includes the client sending to the second engine instance requests to only perform portions of the transaction that the first engine instance did not report as successfully executed. 7. The method of claim 5 wherein: the request to perform the transaction is received from a client; the method further comprises: after initiating propagation of the particular change to the secondary duplica, the first engine instance communicates to the client that the particular change was successfully executed; the client stores transaction status data that indicates that the particular change was successfully executed; after the first engine instance ceases to function, the client sends the transaction status data to the second engine instance; and based on the transaction status data, the second engine instance does not repeat the particular change. 8. The method of claim 7 wherein the transaction status data sent from the client to the second engine instance identifies a highest statement number that was confirmed-executed by the first engine instance. 9. The method of claim 8 wherein causing the particular change to be propagated to the particular secondary duplica includes sending the particular change to a Network Interface Card (NIC) that is local to the first engine instance and, in response to the NIC successfully putting a message containing the change on transmission media, communicating to the client that the particular change was propagated to the particular secondary duplica without waiting for acknowledgement that the particular change was successfully propagated to the particular secondary duplica. 10. The method of claim 9 further comprising, prior to the second engine instance resuming the transaction: determining whether the particular secondary duplica has all log records of the transaction up to and including log records for the highest statement number that was confirmed-executed by the first engine instance; and if the particular secondary duplica does not have all log records of the transaction up to and including log records for the highest statement number that was confirmed-executed by the first engine instance, then prior to the second engine instance resuming the transaction, storing in the particular secondary duplica any missing log records of the transaction up to and including log records for the highest statement number that was confirmed-executed by the first engine instance. 11. The method of claim 9 wherein the NIC is connected to a host that is executing the second engine instance by at least two distinct networks. 12. A method for resuming a transaction, comprising: using a first engine instance to coordinate execution of the transaction; wherein the transaction involves making updates at a plurality of participants; after all statements in the transaction have been executed, the first engine instance sending, directly or indirectly, prepare request messages to each participant of a plurality participants in the transaction; upon receiving prepare acknowledgement messages, directly or indirectly, from each participant of the plurality of participants, the first engine instance selecting a candidate commit time; initiating transmission of the candidate commit time to one or more backup coordinators; wherein the one or more backup coordinators include a second engine instance; in response to the first engine instance failing after selecting a candidate commit time and before the second engine instance has received the candidate commit time, establishing the second engine instance as a new transaction coordinator for the transaction; the second engine instance resuming the transaction by sending, directly or indirectly, prepare messages to each participant of the plurality of participants; upon receiving prepare acknowledgement messages, directly or indirectly from each participant of the plurality of partic

Assignees

Oracle Int Corp

Inventors

Classifications

G06F16/278
Data partitioning, e.g. horizontal or vertical partitioning · CPC title
G06F16/221
Column-oriented storage; Management thereof · CPC title
G06F16/2365Primary
Ensuring data consistency and integrity · CPC title
G06F2201/84
Using snapshots, i.e. a logical point-in-time copy of the data · CPC title
G06F11/1458Primary
Management of the backup or restore process · CPC title

Patent family

Related publications grouped by family.

View patent family 78516914

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11599421B2 cover?: A shared-nothing database system is provided in which parallelism and workload balancing are increased by assigning the rows of each table to “slices”, and storing multiple copies (“duplicas”) of each slice across the persistent storage of multiple nodes of the shared-nothing database system. When the data for a table is distributed among the nodes of a shared-nothing system in this manner, req…
Who is the assignee on this patent?: Oracle Int Corp
What technology area does this patent fall under?: Primary CPC classification G06F16/2365. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Mar 07 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

System and method for rapid fault detection and repair in a shared nothing distributed database

System and method for an ultra highly available, high performance, persistent memory optimized, scale-out database

System and method for high performance multi-statement interactive transactions with snapshot isolation in a scale-out database

Systems and methods for querying a database

Sharding of full and incremental snapshots

Frequently asked questions