Hardware acceleration for a compressed computation database

US10831713B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10831713-B2
Application number  US-201715791770-A
Country             US
Kind code           B2
Filing date         Oct 24, 2017
Priority date       Oct 3, 2014
Publication date    Nov 10, 2020
Grant date          Nov 10, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to embodiments of the present invention, machines, systems, methods and computer program products for hardware acceleration are presented. A plurality of computational nodes for processing data is provided, each node performing a corresponding operation for data received at that node. A metric module is used to determine a compression benefit metric pertaining to performance of the corresponding operations of one or more computational nodes with recompressed data. An accelerator module recompresses data for processing by the one or more computational nodes based on the compression benefit metric indicating a benefit gained by using the recompressed data. A distribution function may be used to distribute data among a plurality of nodes.
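The decision described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: run-length encoding stands in for the recompressed form, a plain list stands in for the incoming data, and the names `size_reduction_metric`, `maybe_recompress`, `sum_node`, and the `threshold` value are hypothetical choices made for illustration only.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def size_reduction_metric(values):
    """Hypothetical benefit metric: fraction of entries saved by RLE."""
    if not values:
        return 0.0
    runs = sum(1 for _ in groupby(values))
    return 1.0 - runs / len(values)


def maybe_recompress(values, threshold=0.5):
    """Recompress only when the metric indicates a benefit (assumed threshold)."""
    if size_reduction_metric(values) > threshold:
        return ("rle", rle_encode(values))
    return ("plain", list(values))


def sum_node(batch):
    """A computational node that sums a batch in either representation."""
    kind, payload = batch
    if kind == "rle":
        # Operate directly on the compressed form: one multiply per run.
        return sum(value * count for value, count in payload)
    return sum(payload)


repetitive = [7] * 8 + [3] * 2
varied = [1, 2, 3, 4]
print(maybe_recompress(repetitive)[0], sum_node(maybe_recompress(repetitive)))
print(maybe_recompress(varied)[0], sum_node(maybe_recompress(varied)))
```

In this sketch the repetitive batch is recompressed because the destination node can sum it with two multiplications instead of ten additions, while the varied batch is passed through unchanged because recompression would save nothing.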

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing system comprising: a first processor implementing a plurality of computational nodes, each node performing a corresponding operation in a data flow for data received at that node; and a second processor operating in parallel with the first processor, wherein the first processor and second processor are configured to: determine, via the first processor, a compression benefit metric pertaining to performance of the corresponding operations of one or more of said plurality of computational nodes in the data flow with recompressed data; recompress the data in a first compression scheme flowing between said plurality of computational nodes to a second compression scheme, via the second processor during running of said plurality of computational nodes on said first processor, for processing by the one or more computational nodes in the data flow based on the compression benefit metric indicating a benefit in processing performance gained by using the recompressed data for computational operations; and perform the corresponding operations of the one or more computational nodes using the recompressed data, via the first processor, to provide the benefit in processing performance.

2. The data processing system of claim 1, wherein the compression benefit metric is determined using a measure of data compression preserved by one or more destination computational nodes.

3. The data processing system of claim 1, wherein the compression benefit metric is determined using a measure of an estimated reduction in a size of the data resulting from recompression.

4. The data processing system of claim 1, wherein the compression benefit metric is determined using a measure of an estimated computational benefit from a destination node performing an operation on recompressed data.

5. The data processing system of claim 1, wherein the compression benefit metric is determined using a measure of a computational benefit of a destination node performing an operation on data in a particular compressed form.

6. The data processing system of claim 1, wherein the first processor is configured to determine a priority for recompressing data based on the compression benefit metric, and the second processor is configured to recompress data according to the priority.

7. The data processing system of claim 1, wherein one or more of said plurality of computational nodes process data without recompression in response to data awaiting recompression by the second processor and being unavailable for processing.

8. The data processing system of claim 1, wherein the second processor is configured to: decompress compressed data; and compress the decompressed data and produce recompressed data.

9. The data processing system of claim 1, wherein the second processor includes a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC) to recompress the data.

10. The data processing system of claim 1, further including a plurality of processors interconnected by a network to process data in parallel, wherein each processor is configured to: apply a distribution function to distribute data among the plurality of processors; compress data prior to transmission to the plurality of processors; and compress data received from the plurality of processors.

11. A method of processing data using a plurality of computational nodes, each node performing a corresponding operation in a data flow for data received at that node, wherein each node is implemented by a first processor and utilizes a second processor operating in parallel with the first processor, and said method comprising: determining, via the first processor, a compression benefit metric pertaining to performance of the corresponding operations of one or more of said plurality of computational nodes in the data flow with recompressed data; recompressing the data in a first compression scheme flowing between said plurality of computational nodes to a second compression scheme, via the second processor during running of said plurality of computational nodes on said first processor, for processing by the one or more computational nodes in the data flow based on the compression benefit metric indicating a benefit in processing performance gained by using the recompressed data for computational operations; and performing the corresponding operations of the one or more computational nodes using the recompressed data, via the first processor, to provide the benefit in processing performance.

12. The method of claim 11, wherein the compression benefit metric is determined using one or more of the following: (a) a measure of data compression preserved by one or more destination computational nodes; (b) a measure of an estimated reduction in a size of the data resulting from recompression; (c) a measure of an estimated computational benefit from a destination node performing an operation on recompressed data; and (d) a measure of a computational benefit of a destination node performing an operation on data in a particular compressed form.

13. The method of claim 11, wherein determining a compression benefit metric further determines a priority for recompressing data based on the compression benefit metric, and recompressing the data comprises recompressing the data according to the priority.

14. The method of claim 11, wherein one or more of said plurality of computational nodes process data without recompression in response to data awaiting recompression by the second processor and being unavailable for processing.

15. The method of claim 11, further comprising: decompressing compressed data via the second processor; and compressing the decompressed data and producing recompressed data via the second processor.

16. The method of claim 11, wherein a plurality of processors interconnected by a network are utilized to process data in parallel, and said method further comprising: applying a distribution function to distribute data among the plurality of processors; compressing data prior to transmission to the plurality of processors; and compressing data received from the plurality of processors.

17. A computer program product for processing data using a plurality of computational nodes, each node performing a corresponding operation in a data flow for data received at that node and implemented by a first processor and utilizing a second processor operating in parallel with the first processor, wherein the computer program product comprises one or more computer readable storage media collectively having computer readable program code embodied therewith, the computer readable program code, when executed by the first processor and second processor, causing the first processor and second processor to: determine, via the first processor, a compression benefit metric pertaining to performance of the corresponding operations of one or more of said plurality of computational nodes in the data flow with recompressed data; recompress the data in a first compression scheme flowing between said plurality of computational nodes to a second compression scheme, via the second processor during running of said plurality of computational nodes on said first processor, for processing by the one or more computational nodes in the data flow based on the compression benefit metric indicating a benefit in processing performance gained by using the recompressed data for computational operations; and perform the corresponding operations of the one or more computational nodes using the recompressed dat
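
The dependent claims on prioritized recompression and on processing data without recompression when it is still waiting on the second processor can be sketched as a simple scheduler. This is an illustrative assumption, not the claimed implementation: `schedule_recompression`, the `accelerator_slots` capacity, and the batch identifiers are hypothetical, and the benefit metrics are supplied as plain numbers rather than computed.

```python
import heapq


def schedule_recompression(batches, accelerator_slots):
    """Pick which batches to recompress and which to process as-is.

    batches: list of (batch_id, benefit_metric) pairs.
    accelerator_slots: assumed number of batches the second processor can
        recompress before the destination nodes need their input.
    Returns (recompressed_ids, passthrough_ids).
    """
    # heapq is a min-heap, so negate the metric to pop the largest benefit first.
    # Batches with no expected benefit are never queued for recompression.
    heap = [(-metric, batch_id) for batch_id, metric in batches if metric > 0]
    heapq.heapify(heap)

    recompressed = []
    while heap and len(recompressed) < accelerator_slots:
        _, batch_id = heapq.heappop(heap)
        recompressed.append(batch_id)

    # Everything else is handed to the nodes in its original form rather
    # than stalling the data flow while it waits for the accelerator.
    chosen = set(recompressed)
    passthrough = [batch_id for batch_id, _ in batches if batch_id not in chosen]
    return recompressed, passthrough


batches = [("a", 0.2), ("b", 0.9), ("c", 0.0), ("d", 0.5)]
print(schedule_recompression(batches, accelerator_slots=2))
```

With two assumed accelerator slots, the batches with the highest metrics ("b", then "d") are recompressed, and "a" and "c" are processed without recompression.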

Assignees

Inventors

Classifications

  • Ensuring data consistency and integrity · CPC title

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • using compression, e.g. sparse files · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10831713B2 cover?
According to embodiments of the present invention, machines, systems, methods and computer program products for hardware acceleration are presented. A plurality of computational nodes for processing data is provided, each node performing a corresponding operation for data received at that node. A metric module is used to determine a compression benefit metric pertaining to performance of the co…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/1744. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 10, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).