Maintaining staleness information for aggregate data

US9659039B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9659039-B2
Application number  US-201314033380-A
Country             US
Kind code           B2
Filing date         Sep 20, 2013
Priority date       Sep 20, 2013
Publication date    May 23, 2017
Grant date          May 23, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer systems, machine-implemented methods, and stored instructions are provided herein for maintaining information that describes aggregate characteristics of data within zones. Stored data may be separated into defined zone(s). Data structure(s), such as zone map(s), may store, for each of the zone(s), aggregate characteristic(s) of data in the zone, and a stored indication of whether or not the zone is stale. When a change is made to data in a particular zone that was not stale, a zone manager causes the particular zone to become stale if the change can result in the particular zone having data that is not included in the particular zone's stored aggregate characteristic(s). On the other hand, if the change cannot result in the particular zone having data that is not included in the particular zone's stored aggregate characteristic(s), then the zone manager does not cause the particular zone to become stale.
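As a rough illustration of the behavior the abstract describes, the Python sketch below models a zone map whose entries hold minimum and maximum bounds plus a staleness flag. The names `Op`, `ZoneEntry`, `ZoneManager`, and `on_change` are hypothetical and do not come from the patent. The rule that inserts and updates can invalidate the stored bounds while deletes cannot is an assumption modeled on claims 4 and 5, not a statement of how any product implements it.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    INSERT = "insert"
    UPDATE = "update"
    DELETE = "delete"


@dataclass
class ZoneEntry:
    """Hypothetical zone map entry: min/max bounds plus a staleness flag."""
    min_value: int
    max_value: int
    stale: bool = False


# Assumption: inserts and updates may introduce values outside [min, max].
# Deletes only remove values, so the stored bounds remain valid (if loose).
OPS_THAT_CAN_WIDEN_RANGE = {Op.INSERT, Op.UPDATE}


class ZoneManager:
    def __init__(self, zone_map):
        self.zone_map = zone_map  # zone id -> ZoneEntry

    def on_change(self, zone_id, op):
        """Mark a non-stale zone stale only if the operation type could
        leave data in the zone that the stored bounds do not cover."""
        entry = self.zone_map[zone_id]
        if entry.stale:
            return  # already stale, nothing to decide
        if op in OPS_THAT_CAN_WIDEN_RANGE:
            entry.stale = True


zone_map = {0: ZoneEntry(1, 100), 1: ZoneEntry(101, 200)}
manager = ZoneManager(zone_map)
manager.on_change(0, Op.DELETE)  # bounds still cover the remaining data
manager.on_change(1, Op.INSERT)  # new row might fall outside the bounds
```

In this sketch zone 0 stays non-stale after the delete, while zone 1 is flagged stale after the insert and would need its aggregates recalculated before it could be trusted again.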

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: storing one or more data structures comprising: for each zone of one or more zones, said each zone corresponding to a set of contiguous blocks of a physical storage location: a set of one or more aggregate characteristics comprising an respective aggregate value that defines a range of possible values in a corresponding set of data stored in said each zone, and a stored indication that the respective aggregate value for the corresponding set of data is guaranteed to be non-stale; in response to an operation that causes a change to a particular zone of the one or more zones, determining whether an operation type of the operation can affect accuracy of the set of one or more aggregate characteristics for the particular zone; when it is determined that the operation type can affect accuracy of the set of one or more aggregate characteristics for the particular zone, updating the stored indication of the particular zone to indicate that the particular zone is stale; wherein when it is determined that the operation type cannot affect accuracy of the set of one or more aggregate characteristics for the particular zone, the stored indication of the particular zone indicating that the particular zone is not stale is preserved; receiving a query on a set of data comprising data stored in the particular zone, wherein the query comprises a predicate; in response to determining that the stored indication for the particular zone indicates that the particular zone is guaranteed to be non-stale, pruning the particular zone from a scan of the corresponding set of contiguous blocks of the physical storage location based on the predicate and the respective aggregate value for the particular zone; and retrieving data comprising query results based on said pruning; wherein the method is performed by one or more computing devices.

2. The method of claim 1, wherein the set of one or more aggregate characteristics for the particular zone includes at least one of: a guaranteed maximum value that could exist in a subset of data in the zone without guaranteeing that the guaranteed maximum value actually exists in the subset of data in the zone; and a guaranteed minimum value that could exist in a subset of data in the zone without guaranteeing that the guaranteed minimum value actually exists in the subset of data in the zone.

3. The method of claim 1, wherein the set of one or more aggregate characteristics for the particular zone is stored in a main memory of a storage server.

4. The method of claim 1, wherein it is determined that the operation type can affect accuracy of the set of one or more aggregate characteristics for the particular zone when the operation is an insert operation involving a specific table or an update operation involving a specific column of the specific table.

5. The method of claim 1, wherein it is determined that the operation type cannot affect accuracy of the set of one or more aggregate characteristics for the particular zone when the operation is a delete operation.

6. The method of claim 1, wherein each zone of the one or more zones is identified based at least in part on a size of the zone.

7. The method of claim 1, wherein the one or more zones are of a size that is configurable on a command interface.

8. The method of claim 1, further including receiving a data definition language command via a command interface to configure one or more zones to a particular size, said command interface evaluating commands conforming to a structured query language.

9. The method of claim 1, further comprising re-calculating sets of aggregate characteristics of data in stale zones without re-calculating sets of aggregate characteristics of data in non-stale zones.

10. The method of claim 1, further comprising: determining which particular zones of a plurality of zones are stale; determining which particular storage locations of a plurality of storage locations correspond to the particular zones; and accessing the particular storage locations to re-calculate sets of aggregate characteristics of data for the particular zones without accessing other storage locations of the plurality of storage locations that correspond to non-stale zones of the plurality of zones.

11. The method of claim 1, further comprising periodically recalculating sets of aggregate characteristics of data.

12. The method of claim 1, further comprising: determining, based on the stored indication for the one or more zones, a set of excluded zones that are guaranteed to be non-stale and that are outside one or more parameters of the query, wherein contiguous storage locations corresponding to zones in the set of excluded zones are not accessed when executing the query.

13. The method of claim 1, wherein, for said each zone of the one or more zones, the respective set of one or more aggregate characteristics defines a range of possible values in the zone, wherein it is determined that the operation type can affect accuracy of the set of one or more aggregate characteristics for the particular zone when the operation type can increase the range of possible values.

14. The method of claim 1, wherein, for said each zone of the one or more zones, the set of one or more aggregate characteristics is at least one of a sum, a mean, a median, a mode, or a count, wherein it is determined that the operation type can affect accuracy of the set of one or more aggregate characteristics of the particular zone when the operation type can change the set of one or more aggregate characteristics of the particular zone.

15. The method of claim 1, wherein the one or more zones are defined to be of a different size than one or more partitions that store data for the one or more zones, and wherein the one or more zones are defined to be of a different size than one or more extents that store data for the one or more zones, further comprising dividing up a particular partition into a plurality of zones, wherein the one or more zones comprise the plurality of zones.

16. The method of claim 1, wherein the one or more zones comprise two or more sub-zones and one or more super-zones, wherein at least one of the one or more super-zones is defined to include at least two sub-zones.

17. The method of claim 1, wherein the one or more data structures further comprise, for each zone of the one or more zones, a number of rows in the zone.

18. The method of claim 1, wherein at least one zone of the one or more zones is defined on joined tables; wherein the one or more data structures further comprise a number of anti-joined rows in the at least one zone; further comprising using the number of anti-joined rows in the at least one zone to determine whether referential integrity holds for the at least one zone.

19. The method of claim 1, further comprising pruning the particular zone when the stored indication guarantees that no values in the zone can satisfy the predicate, regardless of whether the respective aggregate value is actually in the particular zone.

20. One or more non-transitory computer-readable media storing sequences of instructions, wherein the sequences of instructions, when executed by one or more hardware processors, cause: storing one or more data structures comprising: for each zone of one or more zones, said each zone corresponding to a set of contiguous blocks of a physical storage location: a set of one or more aggregate characteristics comprising an respective aggregate value
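
The query-time pruning of claims 1 and 12 and the stale-only recalculation of claims 9 and 10 might be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `Zone`, `scan_greater_than`, and `refresh_stale` are hypothetical names, a Python list stands in for a zone's contiguous storage blocks, and the only predicate supported is "value greater than a threshold".

```python
from dataclasses import dataclass


@dataclass
class Zone:
    rows: list          # stand-in for the zone's contiguous storage blocks
    min_value: int      # stored lower bound for values in the zone
    max_value: int      # stored upper bound for values in the zone
    stale: bool = False


def scan_greater_than(zones, threshold):
    """Return values > threshold and the indexes of zones actually scanned."""
    results, scanned = [], []
    for i, zone in enumerate(zones):
        # Prune only when the bounds are trusted (zone is non-stale) and
        # they show that no value in the zone can satisfy the predicate.
        if not zone.stale and zone.max_value <= threshold:
            continue
        scanned.append(i)
        results.extend(v for v in zone.rows if v > threshold)
    return results, scanned


def refresh_stale(zones):
    """Recompute bounds for stale zones only; return the indexes refreshed."""
    refreshed = []
    for i, zone in enumerate(zones):
        # Non-stale zones are never read here. Empty stale zones are left
        # stale in this sketch because min/max are undefined for them.
        if zone.stale and zone.rows:
            zone.min_value, zone.max_value = min(zone.rows), max(zone.rows)
            zone.stale = False
            refreshed.append(i)
    return refreshed


zones = [
    Zone([1, 5, 9], 1, 9),
    Zone([10, 50], 10, 20, stale=True),  # 50 was inserted after bounds were stored
    Zone([30, 40], 30, 40),
]
results, scanned = scan_greater_than(zones, 25)  # zone 0 pruned, zone 1 scanned because stale
refreshed = refresh_stale(zones)                 # only zone 1 is re-read
```

The stale flag is what keeps pruning safe here: zone 1's stored maximum of 20 would wrongly exclude the value 50 if the zone were trusted, so a stale zone is always scanned until its aggregates have been recalculated.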

Assignees

Oracle Int Corp

Inventors

Classifications

  • G06F16/21 · Primary

    Design, administration or maintenance of databases · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9659039B2 cover?
Computer systems, machine-implemented methods, and stored instructions are provided herein for maintaining information that describes aggregate characteristics of data within zones. Stored data may be separated into defined zone(s). Data structure(s), such as zone map(s), may store, for each of the zone(s), aggregate characteristic(s) of data in the zone, and a stored indication of whether or n…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F16/21. Mapped technology areas include Physics.
When was this patent published?
Publication date May 23, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).