Data record auditing systems and methods

US9710859B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9710859-B1
Application number  US-201313928234-A
Country             US
Kind code           B1
Filing date         Jun 26, 2013
Priority date       Jun 26, 2013
Publication date    Jul 18, 2017
Grant date          Jul 18, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are presented for processing and auditing data records using a stream based data processing system. Data output by data center computers may be collected and used to generate data records that include values for metrics related to computer resource consumption. These data records may be inserted into a stream which can include auditors and various other processors. The auditors may determine whether any of the data records include discrepancies. A gating processor can determine which processors, if any, to provide data records that include discrepancies. Further, an amendment processor can be used to resolve discrepancies detected by the auditors. In addition, a billing processor can be used to generate bills that identify the discrepancies and include information relating to the cause and actions taken in response to the discrepancies detected in the data records.
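
The flow described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `UsageRecord` type, the out-of-range check used by the auditor, the `max_value` bound, and the per-processor `policy` table. The sketch only shows the general shape of a stream in which an auditor flags records and a gating step decides which downstream processors receive the flagged ones.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List


@dataclass
class UsageRecord:
    """Hypothetical data record carrying one resource-consumption metric."""
    source: str
    metric: str
    value: float
    discrepancies: List[str] = field(default_factory=list)


def auditor(records: Iterable[UsageRecord], max_value: float) -> Iterator[UsageRecord]:
    """Flag records whose value falls outside an assumed sanity range."""
    for rec in records:
        if rec.value < 0 or rec.value > max_value:
            rec.discrepancies.append("value_out_of_range")
        yield rec


def gate(records: Iterable[UsageRecord], policy: Dict[str, bool],
         processor: str) -> Iterator[UsageRecord]:
    """Pass flagged records only to processors the (assumed) policy allows."""
    allow_flagged = policy.get(processor, False)
    for rec in records:
        if allow_flagged or not rec.discrepancies:
            yield rec


stream = [
    UsageRecord("host-a", "cpu_hours", 2.0),
    UsageRecord("host-b", "cpu_hours", -1.0),  # implausible value, gets flagged
    UsageRecord("host-c", "cpu_hours", 3.5),
]
audited = list(auditor(stream, max_value=24.0))

# Assumed policy: billing never sees flagged records, amendment does.
policy = {"billing": False, "amendment": True}
billing_input = list(gate(audited, policy, "billing"))
amendment_input = list(gate(audited, policy, "amendment"))
billable_hours = sum(r.value for r in billing_input)
```

In this sketch the billing step receives only the two clean records, while the amendment step receives all three so that it can attempt a fix on the flagged one.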

First claim

Opening claim text (preview).

What is claimed is:

1. A method of auditing data records, the method comprising: under control of an auditor comprising one or more processors configured to access data records from a stream of data records, the stream of data records including data corresponding to a measure of usage of computing resources within a program execution service: identifying a first set of data records from the stream of data records, the first set of data records obtained from a first data source via a communication channel between the first data source and the auditor within the program execution service, the program execution service distributed among a plurality of servers; accessing a second set of data records from the stream of data records, the second set of data records at least partially independent from the first set of data records; auditing the first set of data records based, at least in part, on the second set of data records by determining whether the first set of data records conforms to the second set of data records within a threshold degree of error; in response to determining that the first set of data records does not conform to the second set of data records within the threshold degree of error: determining a discrepancy between the first set of data records and the second set of data records; determining whether the discrepancy can be resolved without user involvement based, at least in part, on a type of the discrepancy; in response to determining that the discrepancy can be resolved without user involvement, amending the first set of data records to resolve the discrepancy, wherein amending the first set of data records includes generating a new data record that in combination with the first set of data records resolves the discrepancy; and in response to determining that the discrepancy cannot be resolved without user involvement, using a distributed gating processor to prevent access by a subsequent processor to a subset of the first set of data records corresponding to the discrepancy reducing the quantity of data supplied to the subsequent processor for processing; and generating a user interface that presents a usage report to a user, the usage report generated based at least in part on the first set of data records, wherein the usage report excludes data corresponding to the subset of the first set of data records when it is determined that the discrepancy cannot be resolved without user involvement.

2. The method of claim 1, wherein the second set of data records is obtained from a second data source.

3. The method of claim 2, wherein the method further comprises: determining a statistical relationship between the first data source and the second data source; and generating an expected set of data records based, at least in part, on the statistical relationship and the second set of data records, wherein determining whether the first set of data records conforms to the second set of data records within the threshold degree of error comprises determining whether the first set of data records matches the expected set of data records within the threshold degree of error.

4. The method of claim 3, wherein determining the statistical relationship between the first data source and the second data source comprises determining a statistical relationship between a metric included in data records from the first data source and a corresponding metric included in data records from the second data source.

5. The method of claim 1, wherein the first set of data records and the second set of data records are of different types.

6. The method of claim 1, wherein the first set of data records and the second set of data records are associated with the consumption of the same set of computing resources and are generated by different systems.

7. The method of claim 1, wherein the first set of data records and the second set of data records are collected over a first time period and are batched together for auditing at a second time period.

8. The method of claim 1, wherein the second set of data records comprises data records obtained from the first data source that were generated during an earlier time period than a time period during which the first set of data records were generated.

9. The method of claim 8, wherein the method further comprises: calculating a statistical trend for at least one metric included in each data record of the second set of data records; and generating an expected set of data records for the time period during which the first set of data records were generated based, at least in part, on the statistical trend, wherein determining whether the first set of data records conforms to the second set of data records within the threshold degree of error comprises determining whether the values for the at least one metric in the first set of data records matches the values for the at least one metric in the expected set of data records within the threshold degree of error.

10. The method of claim 1, further comprising alerting a user to the discrepancy in response to determining that the discrepancy cannot be resolved without user involvement.

11. The method of claim 1, wherein determining whether the discrepancy can be resolved comprises determining whether a probability that the discrepancy can be resolved satisfies a certainty threshold.

12. A system, the system comprising: one or more computer systems configured to effect a stream based data processing system, the stream based data processing system configured to: access a set of one or more data records from a stream of data records, the set of one or more data records obtained from a first data source via a communication channel between the first data source and an auditor and comprising a time series of data records, the stream of data records generated by a set of computing systems configured to provide multi-tenant network services to users, the set of one or more data records including data corresponding to a measure of usage of computing resources of the set of computing systems, the set of computing systems implementing a program execution service distributed among the set of computing systems; access a set of data from a second data source other than the stream of data records; and determine whether at least one data record from the set of one or more data records includes a discrepancy based, at least in part, on the set of data; the stream based data processing system including an amendment system comprising one or more processors, the amendment system configured to conditionally resolve a discrepancy in at least one data record from the set of one or more data records in response to determining that the at least one data record from the set of one or more data records includes the discrepancy and that the discrepancy can be resolved automatically, wherein conditionally resolving the discrepancy comprises generating a new data record that in combination with the one or more data records resolves the discrepancy; a distributed gating processor configured to prevent access by one or more additional processors to the at least one data record reducing the quantity of data supplied to the subsequent processor for processing in response to determining that the discrepancy cannot be resolved automatically; and a user interface system configured to generate a user interface that presents a usage report to a user, the usage report generated based at least in part on the set of one or more data records, wherein the usage report excludes data corresponding to the at least one data record when it is determined that the discrepancy cannot be resolved automatically.
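
To make the audit steps recited in claim 1 more concrete, here is a minimal sketch under stated assumptions. The claims do not define how the "threshold degree of error" is measured, which discrepancy types can be resolved without user involvement, or what an amending record looks like. The relative-error test, the `AUTO_RESOLVABLE` set, the `undercount`/`overcount` labels, and the `is_adjustment` field below are all hypothetical choices made for illustration, not the patented implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical usage record for one resource."""
    resource_id: str
    cpu_hours: float
    is_adjustment: bool = False


# Assumption: only undercounts are safe to fix without a human.
AUTO_RESOLVABLE = {"undercount"}


def conforms(value: float, reference: float, threshold: float) -> bool:
    """Relative-error check (one possible reading of 'threshold degree of error')."""
    if reference == 0:
        return value == 0
    return abs(value - reference) / abs(reference) <= threshold


def audit(first: List[Record], second: List[Record],
          threshold: float) -> Tuple[List[Record], List[Record]]:
    """Return (records usable for the usage report, records withheld by the gate)."""
    reference = {r.resource_id: r.cpu_hours for r in second}
    report: List[Record] = []
    withheld: List[Record] = []
    for rec in first:
        want = reference.get(rec.resource_id)
        # Assumption: records with no counterpart pass through unaudited.
        if want is None or conforms(rec.cpu_hours, want, threshold):
            report.append(rec)
            continue
        kind = "undercount" if rec.cpu_hours < want else "overcount"
        if kind in AUTO_RESOLVABLE:
            # Amend by adding a new record; the original stays untouched.
            delta = want - rec.cpu_hours
            report.extend([rec, Record(rec.resource_id, delta, is_adjustment=True)])
        else:
            # Gate: keep the discrepant record away from later processors.
            withheld.append(rec)
    return report, withheld


first_set = [Record("i-1", 10.0), Record("i-2", 4.0), Record("i-3", 9.0)]
second_set = [Record("i-1", 10.2), Record("i-2", 5.0), Record("i-3", 6.0)]
report, withheld = audit(first_set, second_set, threshold=0.05)
report_total = sum(r.cpu_hours for r in report)
```

With these inputs, `i-1` conforms, `i-2` is amended by an added 1.0-hour adjustment record, and `i-3` is withheld and therefore left out of the report total. Amending by appending a record rather than editing the original mirrors the claim language about a new data record that resolves the discrepancy "in combination with" the first set.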

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710859B1 cover?
Systems and methods are presented for processing and auditing data records using a stream based data processing system. Data output by data center computers may be collected and used to generate data records that include values for metrics related to computer resource consumption. These data records may be inserted into a stream which can include auditors and various other processors. The auditors may determine whether any of the data records include discrepancies.
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06Q40/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 18, 2017 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).