Data records selection

US9892026B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9892026-B2
Application number: US-201313827558-A
Country: US
Kind code: B2
Filing date: Mar 14, 2013
Priority date: Feb 1, 2013
Publication date: Feb 13, 2018
Grant date: Feb 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method includes accessing a plurality of data records, each data record having a plurality of data fields. The method further includes analyzing values for one or more of the data fields for at least some of the plurality of data records and generating a profile of the plurality of data records based on the analyzing. The method further includes formulating at least one subsetting rule based on the profile; and selecting a subset of data records from the plurality of data records based on the at least one subsetting rule.
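The flow in the abstract (profile the records, formulate a subsetting rule from the profile, select a subset) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the dict-based record shape, the function names, the `max_cardinality` threshold, and the heuristic of targeting the highest-cardinality field under that threshold are all assumptions made for the example.

```python
from collections import Counter

def profile_fields(records):
    """Build a simple profile: for each field, count how often each value occurs."""
    profile = {}
    for record in records:
        for field, value in record.items():
            profile.setdefault(field, Counter())[value] += 1
    return profile

def formulate_subsetting_rule(profile, max_cardinality=10):
    """Assumed heuristic: target the field with the most distinct values
    that still stays at or below max_cardinality."""
    candidates = {f: len(c) for f, c in profile.items()
                  if 1 < len(c) <= max_cardinality}
    if not candidates:
        return None
    # sorted() makes ties resolve alphabetically, so the result is deterministic
    target = max(sorted(candidates), key=lambda f: candidates[f])
    return {"target_field": target}

def select_subset(records, rule):
    """Keep the first record seen for each distinct value of the target field."""
    seen = set()
    subset = []
    for record in records:
        value = record.get(rule["target_field"])
        if value not in seen:
            seen.add(value)
            subset.append(record)
    return subset

# Hypothetical sample data
records = [
    {"id": 1, "state": "MA", "status": "open"},
    {"id": 2, "state": "MA", "status": "closed"},
    {"id": 3, "state": "NY", "status": "open"},
    {"id": 4, "state": "CA", "status": "open"},
    {"id": 5, "state": "NY", "status": "open"},
]

profile = profile_fields(records)
rule = formulate_subsetting_rule(profile, max_cardinality=3)
subset = select_subset(records, rule)
print(rule, [r["id"] for r in subset])
```

With this sample data the `id` field is excluded because every value is unique, so the sketch targets `state` and keeps one record per state. The reduced set still covers every distinct value of the target field.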

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for selecting data records to cause execution of a processing rule during testing of a data processing application, the method including: obtaining a first set of data records; processing the first set of data records using a data processing application that includes a processing rule, wherein a processing rule operates on at least one input value and generates at least one output value, and wherein whether the processing rule is executed by the data processing application during processing of a particular data record depends directly or indirectly on a value in each of one or more data fields of the particular data record; receiving execution information indicative of a number of times the processing rule was executed in connection with processing of the first set of data records; obtaining a second set of data records; analyzing values in one or more data fields of each of the data records in the second set, the analyzing including generating a profile of each of one or more of the data fields for the second set of data records, the profile of a data field characterizing the values in the data field; obtaining a subsetting rule based on (i) the generated profile and (ii) the execution information indicative of the number of times the processing rule was executed in connection with processing the first set of data records, the subsetting rule including an identification of a particular one of the data fields of the data records in the second set as a target data field; selecting a subset of data records from the second set of data records according to the subsetting rule, the selecting of the subset of data records being based on values in the target data field; and processing the selected subset of data records using the data processing application.

2. The method of claim 1, wherein obtaining the subsetting rule includes formulating the subsetting rule, including identifying the one of the data fields as the target data field based on a cardinality of the identified one of the data fields.

3. The method of claim 2, wherein the target data field has a set of distinct values in the plurality of data records, and wherein selecting a subset of data records includes selecting data records such that there is at least one data record in the selected subset that has each of the distinct values for the target data field.

4. The method of claim 1, wherein generating a profile includes classifying values for a first data field of the data records in the second set of data records; and wherein obtaining the subsetting rule includes formulating the subsetting rule, including identifying the first data field as the target data field based on the classifying.

5. The method of claim 4, wherein the target data field has a set of distinct values of the data records in the second set of data records, and wherein selecting a subset of data records includes selecting data records such that there is at least one data record in the selected subset that has each of the distinct values for the target data field.

6. The method of claim 1, wherein the subsetting rule identifies a first data field as a first target data field and a second data field as a second target data field.

7. The method of claim 6, wherein selecting a subset of data records includes selecting the subset of data records based on combinations of a first set of distinct values for the first target data field and a second set of distinct values for the second target data field.

8. The method of claim 1, wherein generating a profile includes identifying a relationship between data records of the second set of data records related via values of a first data field; and wherein the at least one subsetting rule includes an identification of the relationship.

9. The method of claim 8, wherein selecting a subset of data records includes: selecting a first data record; and selecting one or more second data records related to the first data record via the relationship identified in the subsetting rule.

10. The method of claim 8, wherein the relationship between data records includes a relationship between data records in the second set of data records and data records in a third set of data records.

11. The method of claim 1, wherein generating a profile includes: generating a pseudofield for at least some of the data records in the second set of data records; and populating the pseudofield for each corresponding data record with an accumulated value, wherein the accumulated value for a first data record is determined based on the first data record and at least one other data record related to the first data record, wherein the first data record and the at least one other data record are related via values of a first data field.

12. The method of claim 11, including determining the accumulated value based on a sum of a value for a second data field of the first data record and values for the second data field for each other related data record.

13. The method of claim 1, wherein obtaining the subsetting rule includes receiving the subsetting rule.

14. The method of claim 1, including providing the selected subset of data records to the data processing application.

15. The method of claim 1, including: formulating a second subsetting rule based on results of processing the selected subset of data records by the data processing application; and selecting a second subset of data records based on the second subsetting rule.

16. A non-transitory computer-readable medium storing instructions for causing a computing system to select data records to cause execution of a processing rule during testing of a data processing application, the instructions causing the computing system to: obtain a first set of data records; process the first set of data records using a data processing application that includes a processing rule, wherein a processing rule operates on at least one input value and generates at least one output value, and wherein whether the processing rule is executed by the data processing application during processing of a particular data record depends directly or indirectly on a value in each of one or more data fields of the particular data record; receive execution information indicative of a number of times the processing rule was executed in connection with processing of the first set of data records; obtain a second set of data records; analyze values in one or more data fields of each of the data records in the second set, the analyzing including generating a profile of each of one or more of the data fields for the second set of data records, the profile of a data field characterizing the values in the data field; obtain a subsetting rule based on (i) the generated profiles and (ii) the execution information indicative of the number of times the processing rule was executed in connection with processing the first set of data records, the subsetting rule including an identification of a particular one of the data fields of the data records in the second set as a target data field; select a subset of data records from the second set of data records according to the subsetting rule, the selecting of the subset of data records being based on values in the target data field; and process the selected subset of data records using the data processing application.

17. The non-transitory computer-readable medium of claim 16, wherein obtaining the subsetting rule includes formulating the subsetting rule, including identifying the one of the data fi
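Claims 9, 11 and 12 describe a pseudofield that holds a value accumulated across related records, and selection of records together with their related records. A hedged sketch of one way this could look is below. The field names (`customer_id`, `amount`, `customer_total`), the function names, and the threshold of 100 are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def add_accumulated_pseudofield(records, key_field, value_field, pseudofield):
    """Return copies of the records with a pseudofield holding the sum of
    value_field over all records that share the same key_field value."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_field]] += record[value_field]
    return [{**record, pseudofield: totals[record[key_field]]}
            for record in records]

def select_with_related(records, key_field, predicate):
    """Select records matching the predicate, plus every record related
    to a match via the same key_field value."""
    keys = {r[key_field] for r in records if predicate(r)}
    return [r for r in records if r[key_field] in keys]

# Hypothetical sample data: transactions related via customer_id
transactions = [
    {"txn": 1, "customer_id": "A", "amount": 40},
    {"txn": 2, "customer_id": "B", "amount": 500},
    {"txn": 3, "customer_id": "A", "amount": 70},
    {"txn": 4, "customer_id": "C", "amount": 10},
]

enriched = add_accumulated_pseudofield(
    transactions, "customer_id", "amount", "customer_total")

# Assumed scenario: a processing rule that fires only when a customer's
# total exceeds 100, so the subset keeps whole groups of related records.
selected = select_with_related(
    enriched, "customer_id", lambda r: r["customer_total"] > 100)
print([r["txn"] for r in selected])
```

Neither transaction for customer A exceeds 100 on its own, but their accumulated total does. A subset built from single-record values would therefore miss a rule that depends on the group, which is the kind of case the pseudofield appears intended to capture.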

Assignees

Ab Initio Technology LLC

Inventors

Classifications

  • Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409) · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • for test design, e.g. generating new test cases · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Ab Initio Technology LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/3684. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).