Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06V10/776. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 7, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Adapting learned cardinality estimators to data and workload drifts

US12437522B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12437522-B2
Application number	US-202117566996-A
Country	US
Kind code	B2
Filing date	Dec 31, 2021
Priority date	Dec 31, 2021
Publication date	Oct 7, 2025
Grant date	Oct 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of updating a trained cardinality estimation model includes receiving a cardinality estimation model with cardinality labels and detecting a drift in underlying data or predicates of the cardinality estimation model. The type of the detected drift is determined and new test queries that mimic test queries for the detected drift are synthesized. A portion of the synthesized test queries is selected to reduce annotation cost and used to update the cardinality estimation model.
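The abstract does not say how the portion of queries to annotate is chosen. As a purely illustrative sketch, the following assumes each candidate query is already represented by a numeric vector and uses greedy farthest-point selection to pick a small, spread-out subset. The function name `select_for_annotation`, the vector representation, and the selection rule are all assumptions made for this example, not the patented method.

```python
import math


def select_for_annotation(embeddings, budget):
    """Greedy farthest-point selection (an assumed stand-in for the patent's
    unspecified selection step): return up to `budget` indices whose vectors
    are spread out, so fewer queries need ground-truth cardinality labels."""
    if budget <= 0 or not embeddings:
        return []
    chosen = [0]  # arbitrary but deterministic starting point
    # Distance from every candidate to its nearest already-chosen point.
    nearest = [math.dist(e, embeddings[0]) for e in embeddings]
    while len(chosen) < min(budget, len(embeddings)):
        nxt = max(range(len(embeddings)), key=nearest.__getitem__)
        if nearest[nxt] == 0.0:
            break  # only duplicates of chosen points remain
        chosen.append(nxt)
        nearest = [
            min(d, math.dist(e, embeddings[nxt]))
            for d, e in zip(nearest, embeddings)
        ]
    return chosen


# Five hypothetical query vectors forming three loose groups.
candidates = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
picked = select_for_annotation(candidates, budget=3)
print(picked)
```

With a budget of three, the sketch picks one query from each group, so near-duplicate queries are not annotated twice.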

First claim

Opening claim text (preview).

What is claimed is:
1. A method of updating a trained cardinality estimation model implement in a computing system, the method comprising: receiving a cardinality estimation model with training predicates and cardinality labels; detecting a drift in underlying data or predicates of the cardinality estimation model; determining a type of the detected drift; based on the type of the detected drift, synthesizing new test queries that mimic test queries for the detected drift; selecting a portion of the new or synthesized test queries to annotate with cardinality labels so as to reduce annotation cost; and updating the cardinality estimation model with newer predicates and cardinality labels.
2. The method of claim 1, wherein the detecting is performed periodically.
3. The method of claim 1, wherein the detecting is performed when an evaluation error of the cardinality estimation model on the test queries exceeds a threshold beyond the error observed during training.
4. The method of claim 1, wherein the determining the type of the detected drift comprises counting a fraction of rows that are new or have changed since the cardinality estimation model was last trained and measuring a change in ground truth cardinality for one or more canary predicates.
5. The method of claim 1, wherein the determining the type of the detected drift comprises determining that the number of new queries available is below the number of annotated queries necessary to train the cardinality estimation model or when an insufficient number of queries have ground truth labels.
6. The method of claim 1, further comprising: injecting newly arrived predicates into a query pool; computing and using embeddings for the query predicates; updating a generator and discriminator if synthetic queries are needed; and updating the embeddings.
7. The method of claim 1, further comprising determining a plurality of types of the drifts.
8. The method of claim 6, further comprising using learned embeddings of query predicates to decouple adaptation components from featurizations used by the cardinality estimation model.
9. The method of claim 6, further comprising synthesizing new query predicates using predicate embeddings in the query pool.
10. The method of claim 9, further comprising receiving a predicate embedding as input and predicting whether a given predicate resembles a training, test, or generated workload.
11. A computing system, comprising: one or more processors; and a computer-readable storage medium having computer-executable instructions stored thereupon which, when executed by the processor, cause the computing system to perform operations comprising: detecting a drift in underlying data or predicates of a cardinality estimation model; determining a type of the detected drift; based on the type of the detected drift, synthesizing new test queries that mimic test queries for the detected drift; selecting a portion of the new or synthesized test queries to annotate with cardinality labels so as to reduce annotation cost; and outputting newer predicates and cardinality labels for updating the cardinality estimation model.
12. The computing system of claim 11, wherein the determining the type of the detected drift comprises counting a fraction of rows that are new or have changed since the cardinality estimation model was last trained, and measuring a change in ground truth cardinality for one or more canary predicates.
13. The computing system of claim 11, wherein the determining the type of the drift comprises determining that the number of new queries available is below the number of annotated queries necessary to train the cardinality estimation model or when an insufficient number of queries have ground truth labels.
14. A computer-readable storage medium having computer-executable instructions stored thereupon which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: detecting a drift in underlying data or predicates of a cardinality estimation model; determining a type of the detected drift; based on the type of the detected drift, synthesizing new test queries that mimic test queries for the detected drift; selecting a portion of the new or synthesized test queries to annotate with cardinality labels so as to reduce annotation cost; and outputting newer predicates and cardinality labels for updating the cardinality estimation model.
15. The computer-readable storage medium of claim 14, further comprising computer-executable instructions stored thereupon which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: injecting newly arrived predicates into a query pool; computing embeddings for the newly arrived predicates; updating a generator and discriminator if synthetic queries are needed; and updating the embeddings.
16. The computer-readable storage medium of claim 15, further comprising computer-executable instructions stored thereupon which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: using learned embeddings of query predicates to decouple components from featurizations used by the cardinality estimation model.
17. The computer-readable storage medium of claim 15, further comprising computer-executable instructions stored thereupon which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: synthesizing new query predicates using predicate embeddings in the query pool.
18. The computer-readable storage medium of claim 15, further comprising computer-executable instructions stored thereupon which, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising: receiving a predicate embedding as input and predicting whether a given predicate resembles a training, test, or generated workload.
19. The computer-readable storage medium of claim 14, wherein the detecting is performed periodically.
20. The computer-readable storage medium of claim 14, wherein the detecting is performed when an evaluation error of the cardinality estimation model on the test queries exceeds a threshold beyond the error observed during training.
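The detection and drift-typing conditions in claims 3 to 5 can be read as a handful of simple checks. The sketch below is one possible reading under stated assumptions: the container `DriftSignals`, the function names, the drift labels, the default limits, the use of relative change for canary predicates, and the choice to combine the two claim 4 measurements with "or" are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class DriftSignals:
    """Hypothetical container for the measurements named in claims 3 to 5."""
    train_error: float        # error observed during training
    eval_error: float         # current error on the test queries
    rows_total: int           # rows in the table now
    rows_changed: int         # rows new or changed since last training
    canary_old: list[float]   # canary ground-truth cardinalities at training time
    canary_new: list[float]   # canary ground-truth cardinalities now
    labeled_queries: int      # new queries that already have ground-truth labels
    queries_needed: int       # annotated queries needed to retrain


def drift_detected(s: DriftSignals, margin: float) -> bool:
    # Claim 3 reading: evaluation error exceeds the training error by a margin.
    return s.eval_error > s.train_error + margin


def max_canary_change(s: DriftSignals) -> float:
    # Largest relative change in ground-truth cardinality over canary predicates.
    changes = (
        abs(new - old) / max(old, 1.0)
        for old, new in zip(s.canary_old, s.canary_new)
    )
    return max(changes, default=0.0)


def drift_types(s: DriftSignals, row_limit: float = 0.1,
                canary_limit: float = 0.2) -> set[str]:
    # Claim 7 allows several drift types at once, so return a set of labels.
    types = set()
    changed_fraction = s.rows_changed / max(s.rows_total, 1)
    # Claim 4 reading: changed-row fraction and canary cardinality change.
    if changed_fraction > row_limit or max_canary_change(s) > canary_limit:
        types.add("data")
    # Claim 5 reading: too few annotated queries are available for retraining.
    if s.labeled_queries < s.queries_needed:
        types.add("too_few_labeled_queries")
    return types


signals = DriftSignals(
    train_error=1.5, eval_error=4.0,
    rows_total=1000, rows_changed=250,
    canary_old=[100.0, 40.0], canary_new=[105.0, 80.0],
    labeled_queries=30, queries_needed=500,
)
if drift_detected(signals, margin=1.0):
    print(sorted(drift_types(signals)))
```

In this reading, the "too_few_labeled_queries" label is the case where claims 6 and 9 would bring in synthesized predicates; the limits 0.1 and 0.2 are placeholders, not values from the patent.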

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N3/045 · Combinations of networks
G06V10/7747 · Organisation of the process, e.g. bagging or boosting
G06N3/0455 · Auto-encoder networks; Encoder-decoder networks
G06N3/094 · Adversarial learning
G06N3/0475 · Generative networks

Patent family

Related publications grouped by family.

View patent family 85017915
