Embedding codebooks for resource optimization
US-2019121884-A1 · Apr 25, 2019 · US
US11514003B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11514003-B2 |
| Application number | US-202117361096-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2021 |
| Priority date | Jul 17, 2020 |
| Publication date | Nov 29, 2022 |
| Grant date | Nov 29, 2022 |
Official abstract text for this publication.
Methods, systems, and apparatus for data compression based on a key-value store. In one aspect, a method includes generating, at a server, a current dictionary based on a plurality of key-values stored in a storage system of the server; receiving a key-value pair transmitted by a client device; performing, at the server, data compression on a key-value in the key-value pair by using the current dictionary; and storing the key-value in the storage system of the server.
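The write and read path described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract names no compression algorithm, so zlib's preset-dictionary support stands in for it, and `build_dictionary`, `CompressedKVStore`, and the choice to compress only the value while leaving the key in plain form are all hypothetical.

```python
import zlib


def build_dictionary(stored_values, max_size=32768):
    """Naive stand-in for dictionary generation (an assumption, not the
    patent's trainer): concatenate stored values and keep the last
    max_size bytes, since zlib only looks back 32 KiB."""
    return b"".join(stored_values)[-max_size:]


class CompressedKVStore:
    """Hypothetical server-side store that compresses values with a shared dictionary."""

    def __init__(self, dictionary):
        self.dictionary = dictionary
        self._data = {}

    def put(self, key, value):
        # Compress the incoming value against the current dictionary, then store it.
        compressor = zlib.compressobj(zdict=self.dictionary)
        self._data[key] = compressor.compress(value) + compressor.flush()

    def get(self, key):
        # Decompression must use the same dictionary that was used on write.
        decompressor = zlib.decompressobj(zdict=self.dictionary)
        return decompressor.decompress(self._data[key]) + decompressor.flush()

    def stored_size(self, key):
        return len(self._data[key])


# Values already held by the server (made-up sample data).
samples = [b'{"user_id": %d, "status": "active", "region": "us-east"}' % i
           for i in range(50)]
store = CompressedKVStore(build_dictionary(samples))

incoming = b'{"user_id": 9999, "status": "active", "region": "us-east"}'
store.put(b"user:9999", incoming)
```

Short values that resemble each other compress poorly on their own, because each one is too small to contain much internal repetition. A shared dictionary supplies that repetition up front, which is why the stored size here is smaller than what plain `zlib.compress` produces for the same value.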
Opening claim text (preview).
What is claimed is:

1. A method implemented in a one or more computers, the method comprising:
   generating, at a server, a current dictionary based on a plurality of key-value pairs stored in a storage system of the server;
   receiving one or more key-value pairs transmitted by client devices; and
   for each received key-value pair:
      performing, at the server, data compression on the received key-value pair by using the current dictionary to obtain a compressed key-value pair; and
      storing the compressed key-value pair in the storage system of the server;
   determining whether a compression efficiency of the storage system decreases; and
   in response to determining that the compression efficiency of the storage system decreases, updating the current dictionary by performing dictionary training, wherein the updating the current dictionary comprises:
      selecting N key-value pairs and M key-value pairs from the plurality of key-value pairs in the storage system, wherein N is an integer greater than 1 and M is an integer greater than 1, the N key-value pairs comprises a first training data set and the M key-value pairs comprises a first verification data set, and the first training data set and the first verification data set are different;
      setting a plurality of groups of dictionary training parameters;
      performing dictionary training for a plurality of candidate dictionaries respectively based on the plurality of groups of dictionary training parameters and the first training data set;
      selecting a candidate dictionary with a highest compression efficiency based on the first training data set from the plurality of candidate dictionaries as a target dictionary;
      obtaining a compression efficiency of the target dictionary based on the first verification data set, wherein the compression efficiency of the target dictionary based on the first verification data set is determined by compressing the first verification data set using the target dictionary;
      determining whether a difference between the compression efficiency of the target dictionary based on the first training data set and the compression efficiency of the target dictionary based on the first verification data set exceeds a predetermined threshold; and
      in response to determining that the difference does not exceed the predetermined threshold, setting the target dictionary as the current dictionary.

2. The method of claim 1, wherein the storage system comprises a cache storage system.

3. The method of claim 1, wherein, when the difference between the compression efficiency of the target dictionary based on the first training data set and the compression efficiency of the target dictionary based on the first verification data set exceeds the predetermined threshold:
   extracting, at the server, a second training data set and a second verification data set; and
   updating the current dictionary by performing dictionary training based on the plurality of groups of dictionary training parameters, the second training data set and the second verification data set, wherein the second training data set comprises N1 key-value pairs, the second verification data set comprise M1 key-value pairs, N1 is an integer greater than 1, M1 is an integer greater than 1, wherein the second training data set is different to the first training data set, and the second verification data set is different to the first verification data set, and wherein the second training data set and the second verification data set are different.

4. The method of claim 1, wherein M is equal to N.

5. The method of claim 1, wherein determining that the compression efficiency of the storage system decreases, and updating the current dictionary, comprises:
   calculating overall compression efficiency of the current dictionary for the storage system at a current moment;
   determining that a decrease of the overall compression efficiency of the current dictionary for the storage system at the current moment relative to overall compression efficiency of the current dictionary at a previous moment exceeds a target threshold; and
   updating the current dictionary in response to the determination of the decrease.

6. The method of claim 5, wherein the determining that the compression efficiency of the storage system decreases, and updating the current dictionary, comprises:
   decompressing a compressed key-value pair stored in the storage system by using a current dictionary before updating; and
   compressing, by using a current dictionary after updating, a key-value pair decompressed by the current dictionary before updating to obtain an updated compressed key-value pair.

7. The method of claim 1, further comprising: reading a target key-value pair in the storage system.

8. The method of claim 7, wherein the reading a target key-value pair in the storage system comprises:
   receiving a reading request for the target key-value pair transmitted by a target client device, wherein the client devices comprise the target client device, and the plurality of key-value pairs comprise the target key-value pair;
   decompressing a compressed key-value pair corresponding to the target key-value pair by using the current dictionary to obtain the target key-value pair; and
   transmitting the target key-value pair to the target client device.

9. A computer-implemented system, comprising:
   one or more computers realizing a server; and
   one or more computer memory devices interoperably coupled with the one or more computers and having non-transitory computer-readable storage media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising:
      generating, at the server, a current dictionary based on a plurality of key-value pairs stored in a storage system of the server;
      receiving one or more key-value pairs transmitted by client devices; and
      for each received key-value pair:
         performing, at the server, data compression on the received key-value pair by using the current dictionary to obtain a compressed key-value pair; and
         storing the compressed key-value pair in the storage system of the server;
      determining whether a compression efficiency of the storage system decreases; and
      in response to determining that the compression efficiency of the storage system decreases, updating the current dictionary by performing dictionary training, wherein the updating the current dictionary comprises:
         selecting N key-value pairs and M key-value pairs from the plurality of key-value pairs in the storage system, wherein N is an integer greater than 1 and M is an integer greater than 1, the N key-value pairs comprises a first training data set and the M key-value pairs comprises a first verification data set, and the first training data set and the first verification data set are different;
         setting a plurality of groups of dictionary training parameters;
         performing dictionary training for a plurality of candidate dictionaries respectively based on the plurality of groups of dictionary training parameters and the first training data set;
         selecting a candidate dictionary with a highest compression efficiency based on the first training data set from the plurality of candidate dictionaries as a target dictionary;
         obtaining a compression efficiency of the target dictionary based on the first verification data set, wherein the compression efficiency of the target dictionary based on the first verification data set is determined by compressing the first verification data set using the target dictionary;
         determining whether a difference between the compression efficiency of the target dictionary based on the first training data set and the compression efficiency of the target dictionary b
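
The dictionary update recited in claim 1, together with the decrease check of claim 5, can be sketched roughly as below. Several things here are assumptions made for illustration only: the claims do not define "compression efficiency", so it is taken to be raw bytes divided by compressed bytes; the "groups of dictionary training parameters" are reduced to a single hypothetical `max_size` setting; `train_candidate` is a naive placeholder for a real trainer; and zlib preset dictionaries stand in for whatever codec the patent contemplates.

```python
import zlib


def compress(value, dictionary):
    compressor = zlib.compressobj(zdict=dictionary)
    return compressor.compress(value) + compressor.flush()


def efficiency(values, dictionary):
    """Assumed metric: total raw bytes / total compressed bytes (higher is better)."""
    raw = sum(len(v) for v in values)
    packed = sum(len(compress(v, dictionary)) for v in values)
    return raw / packed


def train_candidate(training_set, max_size):
    """Hypothetical trainer: keep the last max_size bytes of the joined samples."""
    return b"".join(training_set)[-max_size:]


def update_dictionary(current, training_set, verification_set,
                      param_groups, threshold):
    """Return (dictionary, accepted). Keeps `current` if validation fails."""
    # One candidate dictionary per group of training parameters.
    candidates = [train_candidate(training_set, p["max_size"])
                  for p in param_groups]
    # The target is the candidate that compresses the training set best.
    target = max(candidates, key=lambda d: efficiency(training_set, d))
    train_eff = efficiency(training_set, target)
    verify_eff = efficiency(verification_set, target)
    # Accept only if training and verification efficiency stay close.
    if abs(train_eff - verify_eff) <= threshold:
        return target, True
    return current, False


def efficiency_dropped(previous_eff, current_eff, target_threshold):
    """Claim 5 style trigger: retrain when the decrease exceeds a threshold."""
    return previous_eff - current_eff > target_threshold


# Made-up sample data: training and verification sets are disjoint.
record = b'{"user_id": %d, "status": "active", "region": "us-east", "plan": "free"}'
training_set = [record % i for i in range(40)]
verification_set = [record % i for i in range(1000, 1040)]
param_groups = [{"max_size": 64}, {"max_size": 1024}]

new_dict, accepted = update_dictionary(
    b"old-dictionary", training_set, verification_set, param_groups, threshold=100.0)
```

Comparing efficiency on a held-out verification set guards against a dictionary that fits the training samples but generalizes poorly to the rest of the store. When the gap is too large, this sketch simply keeps the current dictionary; claim 3 instead draws fresh training and verification sets and trains again. After an accepted update, claim 6 has stored entries decompressed with the old dictionary and recompressed with the new one, which this sketch does not show.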
Updating · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
Databases characterised by their database models, e.g. relational or object models · CPC title
Change logging, detection, and notification (replication G06F16/27) · CPC title
Design, administration or maintenance of databases · CPC title