Implicit checkpoint for generating a secondary index of a table

US10747739B1 · US · B1

Patent metadata
Field                Value
Publication number   US-10747739-B1
Application number   US-201514859055-A
Country              US
Kind code            B1
Filing date          Sep 18, 2015
Priority date        Sep 18, 2015
Publication date     Aug 18, 2020
Grant date           Aug 18, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data storage system may implement an implicit checkpoint for generating a secondary index of a table. Indexing updates may be replicated and maintained across a replica group storing a table for a data store. Upon detection of a restart event for generating a secondary index, a replica in the replica group may evaluate the indexing updates to determine an index creation restart point according to an order for indexing the table. The generation of the secondary index may be resumed at the index creation restart point. In this way, secondary index generation may continue whether or not a previously indexing replica in the replica group, such as a master replica, is available to continue generating the secondary index.
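
The restart-point idea in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `IndexingUpdate` record, its `sequence` and `primary_key` fields, and the assumption that the table is indexed in ascending key order are all hypothetical choices made for illustration. The sketch treats the replicated indexing updates themselves as the checkpoint, so any replica holding them can work out where to resume.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class IndexingUpdate:
    """Hypothetical replicated record: one table item copied to the secondary index."""
    sequence: int      # position of the record in the replicated log (assumed)
    primary_key: str   # key of the indexed item; indexing order is assumed to be key order


def restart_point(updates: Iterable[IndexingUpdate]) -> Optional[str]:
    """Return the key named by the most recent indexing update, or None if none exist."""
    latest = max(updates, key=lambda u: u.sequence, default=None)
    return latest.primary_key if latest is not None else None


def remaining_keys(table_keys: Iterable[str], point: Optional[str]) -> List[str]:
    """Keys still to be indexed: everything after the restart point in indexing order."""
    ordered = sorted(table_keys)
    if point is None:
        return ordered  # nothing was indexed before the restart event
    return [k for k in ordered if k > point]


# A peer replica that received three updates resumes after key "c".
log = [IndexingUpdate(1, "a"), IndexingUpdate(2, "b"), IndexingUpdate(3, "c")]
point = restart_point(log)
todo = remaining_keys(["a", "b", "c", "d", "e"], point)
```

Because no separate checkpoint record is written in this sketch, a peer replica that takes over after a master failure needs only its copy of the updates to continue.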

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a plurality of computing devices comprising respective processors and a memory to implement a plurality of storage nodes for a data store; one of the storage nodes is a master replica for a table stored in the data store and indexed according to a primary schema of a primary index, the master replica configured to: generate a plurality of indexing updates as part of creating a secondary index for the table that provides an alternative to the primary schema for accessing the table, wherein the indexing updates identify respective items from indexed portions of the table to be copied and stored together at the secondary index according to the alternative schema; send, as part of said creating the secondary index that provides the alternative schema for accessing the table, the indexing updates to one or more peer replicas at other ones of the storage nodes for the table; one of the storage nodes is one of the one or more peer replicas configured to: upon detection of a restart event for creating the secondary index that provides the alternative schema for accessing the table, wherein the restart event is triggered by an error or failure that interrupts said creating the secondary index: scan the respective items of the received indexing updates to determine an index creation restart point with respect to an order for indexing the table, wherein the index creation restart point is determined such that any remaining portions of the table to be indexed are identified with respect to the index creation restart point; resume creation of the secondary index that provides the alternative schema for accessing the table according to the index creation restart point; and index the identified remaining portions of the table to provide access to the table via the secondary index having the alternative schema.

2. The system of claim 1, wherein the one peer replica is further configured to store the indexing updates as log records in a log; and wherein, to scan the respective items of the received indexing updates, the one peer replica is configured to: identify a most recent indexing update in the log of indexing updates; and set the index creation restart point as a value for the respective item indicated in the most recent indexing update.

3. The system of claim 1, wherein during the creation of the secondary index the table is available for servicing one or more access requests.

4. The system of claim 1, wherein the data store is a non-relational storage service, wherein the table is maintained for a client of the non-relational storage service, and wherein creation of the secondary index is performed in response to a request from the client to create the secondary index for the table.

5. A method, comprising: performing, by one or more computing devices: during creation of a secondary index for a table stored in a data store that provides an alternative to a primary schema of a primary index for accessing the table, maintaining a plurality of indexing updates directed to the secondary index that identify respective items from indexed portions of the table to be copied and stored together at the secondary index according to the alternative schema; upon detecting a restart event for the creation of the secondary index that provides the alternative schema for accessing the table, wherein the restart event is triggered by an error or failure the interrupts the creation of the secondary table: evaluating the respective items of the plurality of indexing updates according to an order for indexing the table to determine an index creation restart point, wherein the index creation restart point is determined such that any remaining portions of the table to be indexed are identified with respect to the index creation restart point; resuming creation of the secondary index that provides the alternative schema for accessing the table according to the index creation restart point; and indexing the identified remaining portions of the table to provide access to the table via the secondary index having the alternative schema.

6. The method of claim 5, further comprising: generating, by a master replica of the table performing the creation of the secondary index, the plurality of indexing updates; and sending, by the master replica of the table, to one or more peer replicas of the table the plurality of indexing updates.

7. The method of claim 5, wherein the plurality of indexing updates are maintained as respective log records in an update log for the table, and wherein evaluating the respective items of the plurality of indexing updates comprises: scanning the log records of the update log; identifying a most recent indexing update in the update log; and setting the index creation restart point as a value for the respective item indicated in the most recent indexing update.

8. The method of claim 7, wherein evaluating the respective items of the plurality of indexing updates further comprises: prior to identifying the most recent indexing update in the update log, snipping one or more conflicting log records from the update log.

9. The method of claim 5, wherein the table is available for servicing accessing requests during creation of the secondary index, and wherein the method further comprises: storing the index creation restart point in metadata for the table; in response to receiving a request to update a portion of the table, accessing the metadata to determine that the portion of the table is a previously indexed portion of the table; in response to determining that the portion of the table is a previously indexed portion: queuing the update to be performed at the secondary index; and performing the update at the table.

10. The method of claim 5, wherein the maintaining, the evaluating and the resuming are performed by a peer replica of a replica group for the table, wherein the plurality of indexing updates are received from a master replica of the replica group, and wherein the restart event is a master replica failure event, and wherein the method further comprises: prior to evaluating the plurality of indexing updates, obtaining, by the peer replica, authority to act as a new master replica for the replica group from one or more other peer replicas of the replica group.

11. The method of claim 5, wherein the maintaining, the evaluating, and the resuming are performed by a master replica of a replica group for the table, and wherein the restart event for generating the secondary index is a reboot of the master replica.

12. The method of claim 5, wherein the table is stored in multiple partitions in the data store, wherein the maintaining, the evaluating, and the resuming are performed for one partition of the table.

13. The method of claim 5, wherein the data store is a network-based storage service, wherein the table is maintained for a client of the network-based storage service, and wherein creation of the secondary index is performed in response to a request from the client to create the secondary index for the table.

14. A non-transitory, computer-readable storage medium, storing program instructions that when executed by one or more computing devices cause the one or more computing devices to implement: during creation of a secondary index for a table stored in a distributed data store that provides an alternative to a primary schema of a primary index for accessing the table, storing a plurality of indexing updates directed to the secondary index that identify respective items from indexed portions of the table to be copied and stored together a
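
The update handling described in claim 9 can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the `TablePartition` class, the `index_restart_point` metadata key, and the rule that a key at or before the restart point counts as previously indexed (ascending key order) are hypothetical.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional, Tuple


class TablePartition:
    """Hypothetical table partition that stays writable while its index is built."""

    def __init__(self) -> None:
        self.items: Dict[str, Any] = {}
        # Table metadata holding the index creation restart point (assumed key name).
        self.metadata: Dict[str, Optional[str]] = {"index_restart_point": None}
        # Updates waiting to be applied at the secondary index.
        self.index_queue: Deque[Tuple[str, Any]] = deque()

    def is_previously_indexed(self, key: str) -> bool:
        """Assume keys at or before the restart point have already been indexed."""
        point = self.metadata["index_restart_point"]
        return point is not None and key <= point

    def update(self, key: str, value: Any) -> None:
        # An update to an already indexed portion must also reach the secondary index.
        if self.is_previously_indexed(key):
            self.index_queue.append((key, value))
        # The update is always performed at the table itself.
        self.items[key] = value


partition = TablePartition()
partition.metadata["index_restart_point"] = "m"
partition.update("c", 1)   # already indexed portion: queued for the index as well
partition.update("x", 2)   # not yet indexed: the ongoing index build will pick it up
```

In this sketch, items past the restart point need no queued index update because the resumed build has not reached them yet and will copy their current values when it does.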

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Management thereof · CPC title

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Hierarchical databases, e.g. IMS, LDAP data stores or Lotus Notes · CPC title

  • Indexing structures · CPC title

  • Updating · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10747739B1 cover?
A data storage system may implement an implicit checkpoint for generating a secondary index of a table. Indexing updates may be replicated and maintained across a replica group storing a table for a data store. Upon detection of a restart event for generating a secondary index, a replica in the replica group may evaluate the indexing updates to determine an index creation restart point according to an o…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2272. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 18, 2020 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).