Automated creation of optimized persistent database views

US12189615B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-12189615-B2
Application numberUS-202217970200-A
CountryUS
Kind codeB2
Filing dateOct 20, 2022
Priority dateOct 20, 2022
Publication dateJan 7, 2025
Grant dateJan 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments for automatically optimizing and persisting database views by receiving queries made to a database, wherein each query generates a respective database view, and generating a set of database maintained views generated by the queries. The system obtains, for each generated view, certain telemetry information about a respective view including latency, memory space utilization, and processor utilization, among other factors. It then scores each view of the generated views based on an base score modified by the obtained information to determine which one or more of the generated views to make persistent, and maintains the one or more persistent views to produce an optimized persistent set of database views. It further adapts later queries to use the optimized persistent views.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method of optimizing and persisting database views in a large-scale data processing system, comprising: receiving queries made to a database through a processor-based search engine, wherein each query generates a respective database view; tokenizing each query into respective constituent parts, each of which form nodes and edges in a heterogeneous graph such that common parts of queries are used as a materialized view; generating a set of database maintained views generated by the queries, wherein each view comprises a dynamically generated virtual table holding data generated by queries applied to one or more tables in the database; generating, for each query, a graph representation of the query that as a query graph that shows commonalities with other queries through the common parts indicated by the materialized view; obtaining, for each generated view, telemetry information about a respective view including latency, memory space utilization, and processor utilization; calculating a base score for each generated view using a scoring function based on a minimization objective combining as a sum, the processor utilization plus a query time for each respective database view plus the memory space utilization; scoring, by a processor-based component, each view of the generated views based on the base score modified by the obtained information to determine which one or more of the generated views to make persistent; maintaining, through a computer graphical user interface (GUI), the one or more persistent views to produce a set of optimized persistent database views; and using a reconstructive self-supervised (SSL) model of graph neural networks (GNN) to generate a database view of other queries using the common parts to produce the set of optimized persistent database views to thereby save resources when processing queries by leveraging the commonalities. 2. The method of claim 1 wherein the base score is modified by: increasing the base score for any savings of time or any savings of CPU usage for a corresponding view; decreasing the base score in relation to an amount of increased memory space the corresponding view requires; and decreasing the base score if a corresponding query does not conform to a service level agreement. 3. The method of claim 2 wherein the set of optimized persistent views is produced by at least one of adding or removing a view from the generated views, or adding or removing a column from a view of the generated views, and further wherein the method comprises masking part of the query graph and training a GNN to reconstruct it to generate the set of optimized persistent database views. 4. The method of claim 1 wherein each query of the queries comprises data elements including base objects including tables, data columns, data joins, filters, and overlay functions. 5. The method of claim 1 further comprising processing the queries through an artificial intelligence/machine learning (AI/ML) process that utilizes historical query data and metadata to identify data sources and connections the data sources to return recommendations used to produce the set of optimized persistent database views. 6. The method of claim 5 wherein the graph representation comprises the data elements represented as nodes and connected to one or more other nodes through edges classified by relationship between any two connected nodes. 7. The method of claim 6 further comprising drawing the graph representation by: taking all tables from a ‘from’ sub-statement; creating a node for each column in a table and adding an edge between nodes that are from a same table; adding an edge signifying a ‘join’ between relevant column nodes for each join in the ‘from’ sub-statement; and signifying, from a ‘select’ sub-statement all selected columns as selected in a corresponding node's feature vector. 8. The method of claim 4 wherein the database comprises a Structured Query Language (SQL) database, and wherein the data processing system is maintained by a large scale enterprise, and wherein the database stores Big Data-scale data sets and is stored in storage devices organized into arrays. 9. The method of claim 8 wherein the data processing system comprises a search engine processing the queries from the target user, the search engine returning one or more data asset recommendations responsive to the queries and the adapted later queries. 10. The method of claim 1 wherein the obtained information further comprises frequency of the query, query keys, and service level requirements, and wherein the latency comprises a time required to return results for the query. 11. The method of claim 4 wherein the maintaining step executes an optimization method comprising a genetic algorithm (GA) that iteratively uses genetic operators to create a set of optimized views using chromosome representation, selection, crossover, mutation, and fitness function computation. 12. A method of processing queries input to a data retrieval system storing a database for access by users in an enterprise, comprising: storing, by a processor-based component, query information for a plurality of queries in a query database, each query input through a processor-based search engine and generating a respective view of the database to form a plurality of generated views, wherein each query is tokenized into respective constituent parts, each of which form nodes and edges in a heterogeneous graph such that common parts of queries are used as a materialized view; calculating a base score for each generated view using a scoring function based on a minimization objective combining as a sum of processor utilization plus a query time for each generated view plus a memory space utilization of each generated view; generating, for each query, a graph representation of the query that as a query graph that shows commonalities with other queries through the common parts indicated by the materialized view; generating, in a view configuration optimizer, a set of optimized views for display through a computer graphical user interface (GUI) of the database for the queries, the optimizer performing at least one of: of adding or removing a view from the maintained views, or adding or removing a column from a view of the maintained views, based on an optimized score from the base score for each generated view; and using a reconstructive self-supervised (SSL) model of graph neural networks (GNN) to generate a database view of other queries using the common parts to produce the set of optimized views; and inputting, through a computer interface, the optimized views into a query adapter of the search engine that takes the plurality of queries and generates a set of adapted queries corresponding to the optimized views to thereby save resources when processing queries by leveraging the common parts of the queries. 13. The method of claim 12 wherein the optimizing step comprises modifying the base score by: increasing the base score for any savings of time or any savings of CPU usage for a corresponding view; decreasing the base score in relation to an amount of increased memory space the corresponding view requires; and decreasing the base score if a corresponding query does not conform to a service level agreement. 14. The method of claim 13 wherein the minimization objective is based on obtained information comprising frequency of the query, query keys, time required to return results for the query, CPU usage to process the query, and memory space consumed by each view generated by the query. 15. The method of

Assignees

Inventors

Classifications

  • for performance assessment · CPC title

  • Unary operations; Data partitioning operations · CPC title

  • Updating materialised views · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12189615B2 cover?
Embodiments for automatically optimizing and persisting database views by receiving queries made to a database, wherein each query generates a respective database view, and generating a set of database maintained views generated by the queries. The system obtains, for each generated view, certain telemetry information about a respective view including latency, memory space utilization, and proc…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F16/2393. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 07 2025 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).