Data compression based on key-value store

US2022019562A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2022019562-A1
Application number: US-202117361096-A
Country: US
Kind code: A1
Filing date: Jun 28, 2021
Priority date: Jul 17, 2020
Publication date: Jan 20, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus for data compression based on a key-value store. In one aspect, a method includes generating, at a server, a current dictionary based on a plurality of key-values stored in a storage system of the server; receiving a key-value pair transmitted by a client device; performing, at the server, data compression on a key-value in the key-value pair by using the current dictionary; and storing the key-value in the storage system of the server.
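The flow in the abstract (build a dictionary from stored values, compress an incoming value with it, store it, and decompress on read) can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract names no compressor, so the sketch assumes zlib's preset-dictionary support (`zdict`), and `build_dictionary`, `CompressedKVStore`, and the sample records are hypothetical names and data invented for the example.

```python
import zlib


def build_dictionary(stored_values: list[bytes], max_size: int = 32768) -> bytes:
    """Naive stand-in for dictionary generation (an assumption, not the
    patent's method): concatenate stored values and keep the tail."""
    return b"".join(stored_values)[-max_size:]


class CompressedKVStore:
    """Toy in-memory stand-in for the server-side storage system."""

    def __init__(self, dictionary: bytes) -> None:
        self.dictionary = dictionary          # the "current dictionary"
        self._data: dict[str, bytes] = {}     # key -> compressed value

    def put(self, key: str, value: bytes) -> None:
        # Compress the value with the current dictionary, then store it.
        comp = zlib.compressobj(zdict=self.dictionary)
        self._data[key] = comp.compress(value) + comp.flush()

    def get(self, key: str) -> bytes:
        # Decompress with the same dictionary that was used on write.
        decomp = zlib.decompressobj(zdict=self.dictionary)
        return decomp.decompress(self._data[key]) + decomp.flush()

    def stored_size(self, key: str) -> int:
        return len(self._data[key])


# Hypothetical values already present in the store.
existing = [
    b'{"user_id": 1, "status": "active", "region": "hangzhou"}',
    b'{"user_id": 2, "status": "active", "region": "hangzhou"}',
    b'{"user_id": 3, "status": "active", "region": "hangzhou"}',
]
store = CompressedKVStore(build_dictionary(existing))

# A client sends a new key-value pair; the server compresses and stores it.
value = b'{"user_id": 42, "status": "active", "region": "hangzhou"}'
store.put("user:42", value)
restored = store.get("user:42")
```

Short values with a shared shape are the case where a shared dictionary tends to help: each value is too small to contain much internal redundancy, but it overlaps heavily with its neighbours in the store.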

First claim

Opening claim text (preview).

What is claimed is:

1. A method implemented in a one or more computers, the method comprising: generating, at a server, a current dictionary based on a plurality of key-values stored in a storage system of the server; receiving a key-value pair transmitted by a client device; and performing, at the server, data compression on a key-value in the key-value pair by using the current dictionary; storing the key-value in the storage system of the server.

2. The method of claim 1, wherein the storage system comprises a cache storage system.

3. The method of claim 1, wherein the generating a current dictionary based on a plurality of key-values in a storage system of the server comprises: selecting N key-values from the plurality of key-values in the storage system as training data, wherein N is an integer greater than 1; setting a dictionary training parameter; performing dictionary training based on the dictionary training parameter and the training data to obtain a plurality of candidate dictionaries; and selecting a candidate dictionary with the highest compression efficiency from the plurality of candidate dictionaries as the current dictionary.

4. The method of claim 3, wherein the generating a current dictionary based on a plurality of key-values in the storage system of a server further comprises: selecting M key-values from the plurality of key-values in the storage system as verification data, wherein M is an integer greater than 1; and the selecting the candidate dictionary with the highest compression efficiency from the plurality of candidate dictionaries as the current dictionary comprises: selecting the candidate dictionary with the highest compression efficiency from the plurality of candidate dictionaries as a target dictionary; verifying compression efficiency of the target dictionary based on the verification data; and using the target dictionary as the current dictionary.

5. The method of claim 4, wherein M is equal to N.

6. The method of claim 1, further comprising: determining that compression efficiency of the storage system decreases, and updating the current dictionary.

7. The method of claim 6, wherein the determining that compression efficiency of the storage system decreases, and updating the current dictionary comprises: calculating overall compression efficiency of the current dictionary for the storage system at a current moment; determining that a decrease of the overall compression efficiency of the current dictionary for the storage system at the current moment relative to overall compression efficiency at a previous moment exceeds a target threshold; and updating the current dictionary in response to the determination of the decrease.

8. The method of claim 7, wherein the updating the current dictionary comprises: generating a candidate updated dictionary based on a plurality of key-values in the storage system of the server at the current moment; compressing the plurality of key-values in the storage system by using the candidate updated dictionary; determining that overall compression efficiency of the candidate updated dictionary for the storage system is higher than that of the current dictionary for the storage system at the current moment; and using the candidate updated dictionary as the current dictionary in response to the determination of the overall compression efficiency.

9. The method of claim 7, wherein the determining that compression efficiency of the storage system decreases, and updating the current dictionary further comprises: decompressing a compressed key-value stored in the storage system by using a current dictionary before updating; and compressing, by using a current dictionary after updating, the key-value decompressed by the current dictionary before updating to obtain an updated compressed key-value.

10. The method of claim 1, further comprising: reading a target key-value in the storage system.

11. The method of claim 10, wherein the reading a target key-value in the storage system comprises: receiving a reading request for the target key-value transmitted by a target client device, wherein the client device comprises the target client device, and the plurality of key-values comprise the target key-value; decompressing a compressed key-value corresponding to the target key-value by using the current dictionary to obtain the target key-value; and transmitting the target key-value to the target client device.

12. A computer-implemented system, comprising: one or more computers realizing a server; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising: generating, at the server, a current dictionary based on a plurality of key-values stored in a storage system of the server; receiving a key-value pair transmitted by a client device; performing, at the server, data compression on a key-value in the key-value pair by using the current dictionary; and storing the key-value in the storage system of the server.

13. The system of claim 12, wherein the storage system comprises a cache storage system.

14. The system of claim 12, wherein the generating a current dictionary based on a plurality of key-values in a storage system of the server comprises: selecting N key-values from the plurality of key-values in the storage system as training data, wherein N is an integer greater than 1; setting a dictionary training parameter; performing dictionary training based on the dictionary training parameter and the training data to obtain a plurality of candidate dictionaries; and selecting a candidate dictionary with the highest compression efficiency from the plurality of candidate dictionaries as the current dictionary.

15. The system of claim 14, wherein the generating a current dictionary based on a plurality of key-values in the storage system of a server further comprises: selecting M key-values from the plurality of key-values in the storage system as verification data, wherein M is an integer greater than 1; and the selecting the candidate dictionary with the highest compression efficiency from the plurality of candidate dictionaries as the current dictionary comprises: selecting the candidate dictionary with the highest compression efficiency from the plurality of candidate dictionaries as a target dictionary; verifying compression efficiency of the target dictionary based on the verification data; and using the target dictionary as the current dictionary.

16. The system of claim 15, wherein M is equal to N.

17. The system of claim 12, further comprising the operations of: determining that compression efficiency of the storage system decreases, and updating the current dictionary.

18. The system of claim 17, wherein the determining that compression efficiency of the storage system decreases, and updating the current dictionary comprises: calculating overall compression efficiency of the current dictionary for the storage system at a current moment; determining that a decrease of the overall compression efficiency of the current dictionary for the storage system at the current moment relative to overall compression efficiency at a previous moment exceeds a target threshold; and updating the current dictionary in response to the determination of the d
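The dictionary lifecycle described in claims 3 to 9 (train candidates, pick the most efficient one, verify it, detect an efficiency drop, swap in a better dictionary, and recompress stored values) might look roughly like the sketch below. Several things here are assumptions rather than anything the claims specify: the compressor is zlib with a preset dictionary, "dictionary training" is replaced by a crude stand-in that keeps the last `size` bytes of the concatenated training data (with `size` playing the role of the training parameter), and "compression efficiency" is taken to mean raw bytes divided by compressed bytes. All function names and sample records are hypothetical.

```python
import zlib


def compress_with(dictionary: bytes, value: bytes) -> bytes:
    comp = zlib.compressobj(zdict=dictionary)
    return comp.compress(value) + comp.flush()


def decompress_with(dictionary: bytes, blob: bytes) -> bytes:
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


def efficiency(dictionary: bytes, values: list[bytes]) -> float:
    """Assumed metric: total raw bytes / total compressed bytes."""
    raw = sum(len(v) for v in values)
    packed = sum(len(compress_with(dictionary, v)) for v in values)
    return raw / packed


def train_candidates(training: list[bytes], sizes: tuple[int, ...]) -> list[bytes]:
    """Stand-in for real dictionary training: one candidate per size."""
    corpus = b"".join(training)
    return [corpus[-size:] for size in sizes]


def generate_dictionary(training, verification, sizes=(64, 256, 1024)):
    """Claims 3-4: pick the best candidate, then check it on held-out data."""
    candidates = train_candidates(training, sizes)
    target = max(candidates, key=lambda d: efficiency(d, training))
    return target, efficiency(target, verification)


def maybe_update(current, values, previous_eff, threshold, sizes=(64, 256, 1024)):
    """Claims 7-8: update only if efficiency dropped by more than the
    threshold and the candidate beats the current dictionary."""
    current_eff = efficiency(current, values)
    if previous_eff - current_eff <= threshold:
        return current
    candidate, _ = generate_dictionary(values, values, sizes)
    return candidate if efficiency(candidate, values) > current_eff else current


def recompress_all(stored: dict[str, bytes], old: bytes, new: bytes) -> dict[str, bytes]:
    """Claim 9: decompress with the old dictionary, recompress with the new."""
    return {k: compress_with(new, decompress_with(old, b)) for k, b in stored.items()}


# Hypothetical data: the store first holds user records, then drifts to orders.
users = [b'{"user_id": %d, "status": "active", "region": "hangzhou"}' % i
         for i in range(20)]
orders = [b"order=%d;sku=ABC-%d;qty=2;warehouse=east" % (1000 + i, i)
          for i in range(20)]

current, verified_eff = generate_dictionary(users[:10], users[10:])
baseline_eff = efficiency(current, users)

stored = {f"order:{i}": compress_with(current, v) for i, v in enumerate(orders)}
updated = maybe_update(current, orders, baseline_eff, threshold=0.5)
stored = recompress_all(stored, current, updated)
```

A production system would more likely use a purpose-built trainer (for example, the one that ships with Zstandard) and sample the store rather than scanning every value; the sketch only mirrors the control flow of the claims.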

Assignees

Inventors

Classifications

  • Updating · CPC title

  • G06F16/21 (Primary)

    Design, administration or maintenance of databases · CPC title

  • Ensuring data consistency and integrity · CPC title

  • Change logging, detection, and notification (replication G06F16/27) · CPC title

  • G06F16/28 (Primary)

    Databases characterised by their database models, e.g. relational or object models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022019562A1 cover?
Methods, systems, and apparatus for data compression based on a key-value store. In one aspect, a method includes generating, at a server, a current dictionary based on a plurality of key-values stored in a storage system of the server; receiving a key-value pair transmitted by a client device; and performing, at the server, data compression on a key-value in the key-value pair by using th…
Who is the assignee on this patent?
Alipay Hangzhou Inf Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F16/21. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 20, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).