Size targeted database I/O compression

US9575982B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9575982-B1
Application number: US-201313872996-A
Country: US
Kind code: B1
Filing date: Apr 29, 2013
Priority date: Apr 29, 2013
Publication date: Feb 21, 2017
Grant date: Feb 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Solid-state storage devices may be employed to store data maintained by a database management system, but may have characteristics that reduce the efficiency of interactions between the database management system and the device. A storage subsystem may receive information indicative of internal boundaries within database data. A segment of the database data may be selected for compression, wherein the size of the segment is based at least on one or more of the internal boundaries, the memory page size of the solid-state drive, and a predicted rate of compression. The compressed segment may be stored if it has a size less than the memory page size of the device. If it does not, compression may be retried with a smaller segment of data or a portion of the data may be stored in uncompressed form. Additional segments of the data may be stored on the solid-state drive in a similar manner.
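
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`choose_segment_end`, `pack_page`, `store`), the halving retry strategy, the `max_attempts` limit, and the definition of `predicted_ratio` as compressed size divided by uncompressed size are all assumptions made for illustration. `zlib` (DEFLATE, a Lempel-Ziv-based scheme) stands in for whatever compressor a real system would use.

```python
import zlib


def choose_segment_end(start, boundaries, page_size, predicted_ratio, data_len):
    """Pick the end offset of the next segment to compress.

    predicted_ratio is assumed to mean compressed_size / uncompressed_size,
    with 0 < predicted_ratio <= 1.
    """
    # Uncompressed bytes expected to shrink to roughly one memory page.
    target = min(start + int(page_size / predicted_ratio), data_len)
    # Snap back to the last internal boundary (e.g. a row end) within range.
    candidates = [b for b in boundaries if start < b <= target]
    return max(candidates) if candidates else target


def pack_page(data, start, boundaries, page_size, predicted_ratio, max_attempts=4):
    """Return (page_bytes, is_compressed, next_start) for one memory page."""
    end = choose_segment_end(start, boundaries, page_size, predicted_ratio, len(data))
    for _ in range(max_attempts):
        compressed = zlib.compress(data[start:end])
        if len(compressed) < page_size:
            return compressed, True, end
        # Too large: retry with a smaller, boundary-aligned segment.
        shrunk = start + (end - start) // 2
        smaller = [b for b in boundaries if start < b <= shrunk]
        end = max(smaller) if smaller else shrunk
        if end <= start:
            break
    # Fallback: store a portion of the data in uncompressed form.
    end = min(start + page_size, len(data))
    return data[start:end], False, end


def store(data, boundaries, page_size=4096, predicted_ratio=0.5):
    """Split data into pages, compressing each segment when it fits."""
    pages, start = [], 0
    while start < len(data):
        page, is_compressed, start = pack_page(
            data, start, boundaries, page_size, predicted_ratio
        )
        pages.append((is_compressed, page))
    return pages
```

Calling `store(data, boundaries)` with the byte offsets of row ends as `boundaries` yields a list of `(is_compressed, page_bytes)` pairs, each no larger than `page_size`. A real storage subsystem would also need to record per-page metadata so that pages can be read back, which this sketch leaves out.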

First claim

Opening claim text (preview).

What is claimed is:
1. A system for storing data, the system comprising: one or more computing devices configured as a database management system; a storage device comprising a plurality of memory pages; and one or more memories having stored thereon computer-readable instructions that, upon execution, cause the system at least to: receive, from a component of the database management system, information indicative of one or more boundaries within the data; determine a first amount of the data to compress, wherein the first amount of the data is based at least in part on a memory page size of a memory page of the plurality of memory pages, a value indicative of a predicted rate of compression, and the information indicative of one or more boundaries within the data; compress a first segment of the data, wherein the first segment is of a size based at least in part on the determined first amount of the data, to form a compressed segment of a second size; and store the compressed segment in the memory page of the plurality of memory pages when the second size is less than the memory page size.
2. The system of claim 1, further comprising one or more memories having stored thereon computer-readable instructions that, upon execution, cause the system at least to: store a subset of the first segment when the second size is greater than the memory page size.
3. The system of claim 1, wherein at least one of the one or more boundaries is indicative of a database row or an item boundary.
4. The system of claim 1, wherein the storage device comprises a solid-state drive.
5. A method for storing data on a storage device, the method comprising: determining a first size of a first segment of data to compress based at least in part on a value indicative of a compression rate and a page size of each of one or more memory pages of the storage device; compressing the first segment of the data, the first segment being of the first size, to form a compressed segment of a second size; and storing the compressed segment in a memory page of the one or more memory pages when the second size is less than the page size.
6. The method of claim 5, further comprising: updating the value indicative of the compression rate based at least in part on the second size.
7. The method of claim 5, further comprising: storing an uncompressed subset of the first segment of the data in a memory page of the one or more memory pages when the second size is greater than the page size.
8. The method of claim 5, further comprising: determining to compress a second segment of the data in addition to the first segment of the data based at least in part on the second size of the compressed first segment being less than the page size page; and compressing the second segment of the data in addition to the first segment of data.
9. The method of claim 5, further comprising: determining to form a second compressed segment based at least in part on one or more of an elapsed time to compress the first segment of the data, a number of attempts to compress the first segment of the data, and the first size of the first segment of the data; and form the second compressed segment.
10. The method of claim 5, wherein the one or more memory pages corresponds to an erase block.
11. The method of claim 5, further comprising applying a Lempel-Ziv compression algorithm to the first segment of the data.
12. The method of claim 5, further comprising selecting one or more of a compression algorithm and compression parameters based at least in part on the page size.
13. The method of claim 5, further comprising: writing uncompressed header information indicative of a compression status to the one or more memory pages.
14. The method of claim 5, further comprising determining the first size of the first segment of the data based at least in part on a probability of achieving the compression rate.
15. The method of claim 5, further comprising: receiving, from a component of an application, information indicative of one or more internal boundaries in the data; and determining the first size of the first segment of the data based at least in part on the one or more internal boundaries.
16. A non-transitory computer-readable storage medium having stored thereon instructions that, upon execution by a computing device, cause the computing device to at least: determine a first size of a first segment of data to compress for storage on a storage device, wherein the first size of the first segment of the data is determined based at least in part on a page size of each of one or more memory pages of the storage device and a value indicative of a predicted compression rate; compress the first segment of the data to form a compressed segment of a second size; and store the compressed segment when the second size is less than the page size.
17. The computer-readable medium of claim 16, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: update the value indicative of the compression rate based at least in part on the second size.
18. The computer-readable medium of claim 16, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: determine to compress the first segment of the data and a second segment of the data based at least in part on the second size of the compressed segment being less than the page size.
19. The computer-readable medium of claim 16, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: receive information from an application component indicative of a boundary within the data.
20. The computer-readable medium of claim 19, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: determining the first size of the first segment of the data based at least in part on aligning the first segment with the boundary.
21. The computer-readable medium of claim 16, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: store a subset of the first segment of the data when the second size of the compressed segment is greater than the page size.
22. The computer-readable medium of claim 16, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: determine a probability of achieving the compression rate.
23. The computer-readable medium of claim 16, having stored thereon further instructions that, upon execution by the computing device, cause the computing device to at least: select parameters for a compression algorithm based at least in part on the page size.
24. A system for storing data comprising: a storage device; one or more memories having stored thereon computer-readable instructions that, upon execution, cause the system at least to: determine a first size of a first subset of the data, based at least in part on a size of a memory page of the storage device, and a value indicative of a compression rate, wherein a compressed subset of the data to be formed by compressing the first subset of the data is predicted to have a second size less than the size of the memory page; form the compressed sub
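
Two of the dependent claims lend themselves to a short illustration: updating the compression-rate value from the observed compressed size (claims 6 and 17) and writing an uncompressed header that records compression status (claim 13). The Python sketch below is one possible reading, not the claimed implementation. The exponential-moving-average update rule, the `alpha` weight, the 5-byte header layout, and the names `RatioEstimator`, `build_page`, and `read_page` are all hypothetical; the claim text shown here does not specify any of them.

```python
import struct
import zlib

# Hypothetical header layout: 1-byte compression flag, 4-byte payload length.
HEADER = struct.Struct(">BI")


class RatioEstimator:
    """Tracks a predicted compression ratio (compressed / uncompressed)."""

    def __init__(self, initial=0.5, alpha=0.25):
        self.ratio = initial
        self.alpha = alpha  # weight given to the newest observation (assumed)

    def update(self, uncompressed_size, compressed_size):
        # Blend the observed ratio into the running estimate.
        observed = compressed_size / uncompressed_size
        self.ratio = (1 - self.alpha) * self.ratio + self.alpha * observed
        return self.ratio


def build_page(segment, page_size, estimator):
    """Compress a segment and prefix an uncompressed status header.

    Returns the page bytes, or None when header plus compressed payload
    would not be smaller than the page size.
    """
    compressed = zlib.compress(segment)
    estimator.update(len(segment), len(compressed))
    if HEADER.size + len(compressed) < page_size:
        return HEADER.pack(1, len(compressed)) + compressed
    return None


def read_page(page):
    """Decode a page written by build_page using its header."""
    flag, length = HEADER.unpack_from(page)
    payload = page[HEADER.size:HEADER.size + length]
    return zlib.decompress(payload) if flag == 1 else payload
```

Because the estimator is updated on every attempt, including failed ones, a run of poorly compressing segments pushes the predicted ratio upward, so later segments would be sized smaller. That feedback loop is a design choice of this sketch rather than something stated in the claims.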

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • Tablespace storage structures; Management thereof · CPC title

  • User-Defined Types; Storage management thereof · CPC title

  • Hybrid storage device · CPC title

  • in relation to data integrity, e.g. data losses, bit errors · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9575982B1 cover?
Solid-state storage devices may be employed to store data maintained by a database management system, but may have characteristics that reduce the efficiency of interactions between the database management system and the device. A storage subsystem may receive information indicative of internal boundaries within database data. A segment of the database data may be selected for compression, wher…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30153. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 21, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).