Resumable merge sort

US10871945B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10871945-B2
Application numberUS-201815953000-A
CountryUS
Kind codeB2
Filing dateApr 13, 2018
Priority dateApr 13, 2018
Publication dateDec 22, 2020
Grant dateDec 22, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method may include receiving a database command to sort an unsorted dataset; dividing a sort operation, for sorting the unsorted dataset, into a plurality of portions; performing a first portion of the sort operation; persisting intermediate results from the first portion of the sort operation; and persisting a state of the sort operation identifying the portions of the sort operation have been performed.

First claim

Opening claim text (preview).

The invention claimed is: 1. A computer system comprising: a processor; and a computer-readable medium having stored thereon instructions that are executable by the processor to configure the computer system to perform the following: receive a database command to perform a sort operation on an unsorted dataset, wherein at least a portion of the unsorted dataset is stored in nonvolatile memory; determine a maximum chunk size for reading portions of the unsorted dataset from the nonvolatile memory; read a first chunk of the unsorted dataset from the nonvolatile memory to volatile memory, wherein the first chunk has a size less than or equal to the maximum chunk size; sort the first chunk to generate a first sorted chunk; persist the first sorted chunk to the nonvolatile memory; persist, in the nonvolatile memory, a state of the sort operation, the state of the sort operation comprising first information indicating progress of sorting chunks in the unsorted dataset and second information identifying at least one location in the nonvolatile memory where sorted chunks should be written; pause the sort operation after persisting the state of the sort operation; and resume the sort operation, after the sort operation was paused, based on the first information and the second information in the state of the sort operation. 2. The computer system of claim 1 , wherein persisting the state of the sort operation comprises persisting first metadata identifying the portion of the unsorted dataset that corresponds to the first sorted chunk. 3. The computer system of claim 2 , further comprising additional instructions that are executable by the processor to commit the first sorted chunk to a side table when persisting the first sorted chunk to the nonvolatile memory. 4. The computer system of claim 2 , wherein the maximum chunk size is based on a service level agreement. 5. The computer system of claim 4 , further comprising additional instructions that are executable by the processor to resume the sort operation, after the sort operation was paused, based on the persisted first metadata. 6. The computer system of claim 2 , wherein: sorting the first chunk to generate the first sorted chunk includes reading the first sorted chunk into one or more input buffers, reading the first sorted chunk into an output buffer, and persisting data in the output buffer to nonvolatile memory when the output buffer is full; and persisting the state of the sort operation includes persisting second metadata identifying a portion of the first sorted chunk that corresponds to the persisted data in the output buffer. 7. The computer system of claim 6 , further comprising additional instructions that are executable by the processor to: commit the first sorted chunk to a side table when persisting the first sorted chunk to the nonvolatile memory; and commit the data in the output buffer to the side table when persisting the data in the output buffer to the nonvolatile memory. 8. The computer system of claim 7 , further comprising additional instructions that are executable by the processor to resume the sort operation, after the sort operation was paused, based on the persisted second metadata. 9. The computer system of claim 8 , further comprising additional instructions that are executable by the processor to resume the sort operation, after the sort operation was paused, with a different number of computing threads performing the sort operation. 10. A method comprising: receiving a database command to sort an unsorted dataset; determining a maximum chunk size based at least in part on reducing information lost in event of failure and reducing intermediate merges; dividing a sort operation, for sorting the unsorted dataset, into a plurality of portions based on the maximum chunk size; performing a first portion of the sort operation; persisting intermediate results from the first portion of the sort operation; and persisting a state of the sort operation, the state of the sort operation comprising first information indicating progress of sorting chunks in the unsorted dataset and second information identifying at least one location in nonvolatile memory where sorted chunks should be written; pausing the sort operation after persisting the state of the sort operation; and resuming the sort operation, after the sort operation was paused, based on the first information and the second information in the state of the sort operation. 11. The method of claim 10 , wherein: performing the first portion of the sort operation includes reading a plurality of unsorted chunks of the unsorted dataset from a nonvolatile memory into a volatile memory, sorting each of the plurality of unsorted chunks to generate a corresponding plurality of sorted chunks, and persisting the corresponding plurality of sorted chunks to the nonvolatile memory; and persisting the state of the sort operation includes persisting first metadata identifying a portion of the unsorted dataset that corresponds to the persisted corresponding plurality of sorted chunks. 12. The method of claim 11 , wherein persisting the corresponding plurality of sorted chunks to the nonvolatile memory includes committing the corresponding plurality of sorted chunks to a side table. 13. The method of claim 11 , wherein reading the plurality of unsorted chunks includes reading the unsorted chunks to a maximum chunk size based on a service level agreement. 14. The method of claim 13 , further comprising resuming the sort operation, after the sort operation was paused, based on the persisted first metadata. 15. The method of claim 11 , wherein: performing the first portion of the sort operation includes reading the corresponding plurality of sorted chunks into a plurality of input buffers, merging the corresponding plurality of sorted chunks into an output buffer, and persisting data in the output buffer to the nonvolatile memory when the output buffer is full; and persisting the state of the sort operation includes persisting second metadata identifying a portion of a sorted chunk that corresponds to the persisted data in the output buffer. 16. The method of claim 15 , wherein: persisting the corresponding plurality of sorted chunks to the nonvolatile memory includes committing the corresponding plurality of sorted chunks to a side table; and persisting the data in the output buffer to the nonvolatile memory includes committing the data in the output buffer to the side table. 17. The method of claim 16 , further comprising resuming the sort operation, after the sort operation was paused, based on the persisted second metadata. 18. The method of claim 17 , further comprising resuming the sort operation, after the sort operation was paused, with a different number of computing threads performing the sort operation.

Assignees

Inventors

Classifications

  • Data buffering arrangements · CPC title

  • Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP] · CPC title

  • Intermediate data storage techniques for performance improvement · CPC title

  • Improving or facilitating administration, e.g. storage management · CPC title

  • G06F7/16Primary

    Combined merging and sorting · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10871945B2 cover?
A method may include receiving a database command to sort an unsorted dataset; dividing a sort operation, for sorting the unsorted dataset, into a plurality of portions; performing a first portion of the sort operation; persisting intermediate results from the first portion of the sort operation; and persisting a state of the sort operation identifying the portions of the sort operation have be…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F7/16. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Dec 22 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).