Data caching in hybrid data processing and integration environment

US10169429B2 · US · B2

Patent metadata
Publication number: US-10169429-B2
Application number: US-201514938474-A
Country: US
Kind code: B2
Filing date: Nov 11, 2015
Priority date: Nov 11, 2015
Publication date: Jan 1, 2019
Grant date: Jan 1, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An integrated data processing system with two-tier data caching system and techniques for use thereof in a hybrid RDBMS and BDS computing environment are provided. In one aspect, the system is RDBMS-centric and uses two caches, one on the RDBMS side (1st tier) and the other on the BDS side (2nd tier). In another aspect, a DRDA wrapper on the BDS side enables the RDBMS to communicate with the BDS as if the BDS is another RDBMS. This is advantageous because the RDBMS already supports the DRDA protocol standard. In yet another aspect, the DRDA wrapper performs the data transformation needed when transferring cached objects between the RDBMS cache and BDS cache because RDBMS and BDS save data objects in different formats. This is advantageous because it offloads the computation from RDBMS to BDS therefore reducing the performance impact on RDBMS for its normal query and transaction processing.
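The lookup order across the two tiers, as recited in claim 1 of this publication, can be sketched in a few lines of Python. This is only an illustration and not the patented implementation: the function name `fetch`, the use of plain dictionaries as caches, and the `compute` callback are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Tuple


def fetch(
    key: str,
    rdbms_cache: Dict[str, Any],
    bds_cache: Dict[str, Any],
    compute: Callable[[str], Any],
) -> Tuple[Any, str]:
    """Return (value, source) for a requested data object.

    Hypothetical helper: checks the 1st-tier (RDBMS-side) cache,
    then the 2nd-tier (BDS-side) cache, and only computes the
    object when both tiers miss.
    """
    if key in rdbms_cache:          # 1st tier hit
        return rdbms_cache[key], "rdbms_cache"
    if key in bds_cache:            # 2nd tier hit
        return bds_cache[key], "bds_cache"
    # Miss in both tiers: fall back to the owning engine.
    return compute(key), "computed"


if __name__ == "__main__":
    tier1 = {"orders_by_region": [("EU", 10)]}
    tier2 = {"clickstream_summary": {"visits": 42}}
    print(fetch("orders_by_region", tier1, tier2, lambda k: None))
    print(fetch("clickstream_summary", tier1, tier2, lambda k: None))
    print(fetch("new_object", tier1, tier2, lambda k: f"result of {k}"))
```

The same order applies to both RDBMS and BDS data objects in claim 1; what differs is which engine computes the object on a miss, which the sketch leaves to the caller through `compute`.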

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a query involving data objects both in a Relational Database Management System (RDBMS) and in a Big Data System (BDS), the method comprising the steps of: parsing the query into requests for RDBMS data objects and BDS data objects, wherein the RDBMS hosts structured data and the BDS hosts structured, semi-structured and unstructured data, and wherein the query involves joining the data objects in the RDBMS and the BDS; determining whether the RDBMS data objects are present in a RDBMS cache; retrieving the RDBMS data objects from the RDBMS cache if the RDBMS data objects are present in the RDBMS cache, otherwise determining whether the RDBMS data objects are present in a BDS cache; retrieving the RDBMS data objects from the BDS cache if the RDBMS data objects are present in the BDS cache, otherwise computing the RDBMS data objects; determining whether the BDS data objects are present in the RDBMS cache; retrieving the BDS data objects from the RDBMS cache if the BDS data objects are present in the RDBMS cache, otherwise retrieving the BDS data objects: from the BDS cache if the BDS data objects are present in the BDS cache, or as computed by the BDS if the BDS data objects are not present in the BDS cache.

2. The method of claim 1, wherein the RDBMS cache is a host cache, and the BDS cache is a distributed cache.

3. The method of claim 1, further comprising the steps of: storing the computed RDBMS data objects in the RDBMS cache.

4. The method of claim 1, further comprising the step of: storing the computed BDS data object in both the RDBMS cache and the BDS cache.

5. The method of claim 1, further comprising the step of: transforming the computed BDS data objects into a RDBMS-compatible form.

6. The method of claim 5, wherein the RDBMS data objects are computed by a RDBMS engine and the BDS data objects are computed by a BDS engine, and wherein the BDS engine performs the step of transforming the computed BDS data objects into a RDBMS-compatible form.

7. The method of claim 6, wherein the BDS engine is Distributed Relational Database Architecture (DRDA)-enabled.

8. The method of claim 1, further comprising the step of: joining the RDBMS data objects and the BDS data objects.

9. The method of claim 1, further comprising the steps of: computing a score for each of the RDBMS data objects in the RDBMS cache; selecting a RDBMS data object in the RDBMS cache with a highest score; determining whether a cost to compute the selected RDBMS data object is greater than a cost to transfer the selected RDBMS data object from the BDS cache; and transferring the selected RDBMS data object from the BDS cache if the cost to compute the selected RDBMS data object is greater than the cost to transfer the selected RDBMS data object from the BDS cache, otherwise evicting the selected RDBMS data object from the RDBMS cache.

10. The method of claim 9, further comprising the step of: determining whether there is enough room in the RDBMS cache after performing the transferring or evicting steps.

11. The method of claim 10, further comprising the step of: repeating the steps of claim 9 with a data object in the RDBMS cache having a next highest score if it is determined that there is not enough room in the RDBMS cache after performing the transferring or evicting steps.

12. The method of claim 1, further comprising the steps of: computing a score for each of the BDS data objects in the BDS cache; selecting a BDS data object in the BDS cache with a highest score; evicting the selected BDS data object from the BDS cache; and evicting the selected BDS data object from the RDBMS cache.

13. The method of claim 12, further comprising the step of: determining whether there is enough room in the BDS cache after performing the evicting steps.

14. The method of claim 13, further comprising the step of: repeating the steps of claim 12 with a data object in the BDS cache having a next highest score if it is determined that there is not enough room in the BDS cache after performing the evicting steps.

15. The method of claim 1, further comprising the steps of: determining if underlying data for computing a given one or more of the RDBMS data objects in the RDBMS cache has changed; and refreshing the given one or more RDBMS data objects in the RDBMS cache if the underlying data have changed.

16. The method of claim 15, wherein the given one or more RDBMS data objects in the RDBMS cache have been refreshed, the method further comprising the steps of: determining if the given one or more RDBMS data objects have been pushed out to the BDS cache; and refreshing the given one or more RDBMS data objects in the BDS cache if the given one or more RDBMS data objects have been pushed out to the BDS cache.

17. The method of claim 1, further comprising the steps of: determining if underlying data for computing a given one or more of the BDS data objects in the BDS cache has changed; refreshing the given one or more BDS data objects in the BDS cache if the underlying data have changed; and refreshing the given one or more BDS data objects in the RDBMS cache if the underlying data have changed.

18. A computer program product for processing a query involving data objects both in a RDBMS and in a BDS, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: parse the query into requests for RDBMS data objects and BDS data objects, wherein the RDBMS hosts structured data and the BDS hosts structured, semi-structured and unstructured data, and wherein the query involves joining the data objects in the RDBMS and the BDS; determine whether the RDBMS data objects are present in a RDBMS cache; retrieve the RDBMS data objects from the RDBMS cache if the RDBMS data objects are present in the RDBMS cache otherwise determine whether the RDBMS data objects are present in a BDS cache; retrieve the RDBMS data objects from the BDS cache if the RDBMS data objects are present in the BDS cache, otherwise compute the RDBMS data objects; determine whether the BDS data objects are present in the RDBMS cache; retrieve the BDS data objects from the RDBMS cache if the BDS data objects are present in the RDBMS cache, otherwise retrieve the BDS data objects: from the BDS cache if the BDS data objects are present in the BDS cache, or as computed by the BDS if the BDS data objects are not present in the BDS cache.

19. The computer program product of claim 18, wherein the RDBMS cache is a host cache, and the BDS cache is a distributed cache.

20. An integrated data processing (IDP) system, comprising: a RDBMS engine comprising a host RDBMS cache; and a BDS engine comprising a distributed BDS cache, wherein the RDBMS engine is configured via a processor, coupled to a memory, to parse a query into requests for RDBMS data objects and BDS data objects, wherein the RDBMS hosts structured data and the BDS hosts structured, semi-structured and unstructured data, and wherein the query involves joining data objects in the RDBMS and the BDS, determine whether the RDBMS data objects are present in a RDBMS cache, retrieve the RDBMS data objects from the RDBMS cache if the RDBMS data objects are present in the RDBMS cache, otherwise determine whether the RDBMS data objects are present in a BDS cache, retrieve the RDBMS data objects from the BDS cac
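
The RDBMS-cache replacement loop recited in claims 9 to 11 can be illustrated with a short Python sketch. It is a reading of the claim text under stated assumptions, not the patented implementation. The `CachedObject` fields, the `make_room` function, and the size-based capacity check are hypothetical. The sketch also assumes that the "transferring" step moves the selected object out of the RDBMS cache and into the BDS cache, in line with claim 16's reference to objects that "have been pushed out to the BDS cache". How the score is computed is not stated in the preview, so it is supplied as data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CachedObject:
    key: str
    size: int             # space occupied in the cache (hypothetical unit)
    score: float          # replacement score; highest is handled first
    compute_cost: float   # cost to recompute the object
    transfer_cost: float  # cost to move it between the two caches


def make_room(
    rdbms_cache: Dict[str, CachedObject],
    bds_cache: Dict[str, CachedObject],
    capacity: int,
    needed: int,
) -> List[Tuple[str, str]]:
    """Free space in the RDBMS cache until `needed` units fit.

    Returns the list of (key, action) decisions taken, in order.
    """
    actions: List[Tuple[str, str]] = []

    def used() -> int:
        return sum(obj.size for obj in rdbms_cache.values())

    # Claims 10 and 11: re-check the room and repeat with the next object.
    while rdbms_cache and capacity - used() < needed:
        # Claim 9: select the object with the highest score.
        victim = max(rdbms_cache.values(), key=lambda obj: obj.score)
        del rdbms_cache[victim.key]
        if victim.compute_cost > victim.transfer_cost:
            # Recomputing is costlier than moving: keep a copy in tier 2.
            bds_cache[victim.key] = victim
            actions.append((victim.key, "pushed_to_bds"))
        else:
            # Cheap to recompute: drop it outright.
            actions.append((victim.key, "evicted"))
    return actions


if __name__ == "__main__":
    tier1 = {
        "a": CachedObject("a", 4, 0.9, compute_cost=10.0, transfer_cost=2.0),
        "b": CachedObject("b", 4, 0.5, compute_cost=1.0, transfer_cost=2.0),
        "c": CachedObject("c", 2, 0.1, compute_cost=5.0, transfer_cost=2.0),
    }
    tier2: Dict[str, CachedObject] = {}
    print(make_room(tier1, tier2, capacity=10, needed=6))
```

The cost comparison is what distinguishes this loop from the BDS-side loop in claims 12 to 14, where the selected object is simply evicted from both caches.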

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10169429B2 cover?
An integrated data processing system with two-tier data caching system and techniques for use thereof in a hybrid RDBMS and BDS computing environment are provided. In one aspect, the system is RDBMS-centric and uses two caches, one on the RDBMS side (1st tier) and the other on the BDS side (2nd tier). In another aspect, a DRDA wrapper on the BDS side enables the RDBMS to communicate with the BD…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/2471. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 1, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).