Multi-entity normalization

US9613070B2 · US · B2

Patent metadata
Publication number: US-9613070-B2
Application number: US-201313842072-A
Country: US
Kind code: B2
Filing date: Mar 15, 2013
Priority date: Mar 15, 2013
Publication date: Apr 4, 2017
Grant date: Apr 4, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In accordance with aspects of the disclosure, systems and methods are provided for normalizing data representing entities and relationships linking the entities including defining one or more graph rules describing searchable characteristics for the data representing the entities and relationships linking the entities, applying the one or more graph rules to the data representing the entities and the relationships linking the entities, identifying one or more matching instances between the one or more graph rules and the data representing the entities and the relationships linking the entities, and performing one or more actions to update the one or more matching instances between the one or more graph rules and the data representing the entities and the relationships linking the entities.
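The four steps named in the abstract (define rules, apply them, identify matching instances, perform update actions) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Graph` and `GraphRule` types, the `normalize` function, and the example rule that drops relationships pointing at unknown entities are all hypothetical names chosen for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

# Hypothetical data model: entities keyed by id, relationships as triples.
Relationship = tuple[str, str, str]  # (source id, relationship type, target id)


@dataclass
class Graph:
    entities: dict[str, dict] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)


@dataclass
class GraphRule:
    name: str
    # The "searchable characteristic": returns every matching instance.
    find_matches: Callable[[Graph], list]
    # The action performed to update one matching instance.
    action: Callable[[Graph, object], None]


def normalize(graph: Graph, rules: list[GraphRule]) -> list[tuple[str, object]]:
    """Apply each rule, update every match, and return a log of what changed."""
    log = []
    for rule in rules:
        # Matches are collected before any action runs, so actions may
        # safely mutate the graph.
        for match in rule.find_matches(graph):
            rule.action(graph, match)
            log.append((rule.name, match))
    return log


# Example rule (an assumption for illustration): a relationship whose source
# or target is not a known entity is a matching instance, and is removed.
def dangling(graph: Graph) -> list[Relationship]:
    return [
        r for r in graph.relationships
        if r[0] not in graph.entities or r[2] not in graph.entities
    ]


def drop(graph: Graph, rel: Relationship) -> None:
    graph.relationships.remove(rel)


graph = Graph(
    entities={"app1": {"kind": "software"}, "host1": {"kind": "host"}},
    relationships=[("app1", "runs_on", "host1"), ("app1", "runs_on", "ghost")],
)
log = normalize(graph, [GraphRule("no_dangling_relationships", dangling, drop)])
```

Keeping the match search and the update action as separate callables mirrors the abstract's split between identifying matching instances and acting on them; a real system could equally express rules declaratively.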

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: retrieving, from a plurality of data providers, data representing entities and relationships linking the entities; defining, a plurality of graph rules defined in a graph model, the graph rules including one or more graph patterns associated with the graph model, the graph model specifying how the data is expected to be patterned and interrelated; searching the data representing entities and relationships linking the entities using the one or more graph patterns associated with the graph model, the searching including comparing the one or more graph patterns to a structure associated with the data; and in response to determining that the structure associated with the data matches at least one of the one or more graph patterns associated with the graph model, applying, in a continuous mode, the one or more graph rules by selecting an anchor entity and performing a graph walk from the anchor entity to determine violations to one or more of the graph rules, and correcting the data and the structure of the data, the correcting including updating the data representing the entities and relationships linking the entities that violate one or more of the plurality of graph rules.

2. The method of claim 1, further comprising automatically correcting one or more attributes associated with the data based on correcting the structure of the data.

3. The method of claim 1, further comprising validating the structure of the data using graph-based pattern matching across multiple entities in the data.

4. The method of claim 3, wherein the graph-based pattern matching includes using a generic pattern that applies to data values of applications associated with the entities and relationships linking the entities to correct the data.

5. The method of claim 1, wherein correcting the data and the structure of the data includes modifying the structure to adhere to a valid pattern defined in the graph rules and correcting the data to be modeled to the valid pattern.

6. The method of claim 1, further comprising: detecting that the structure is incomplete; generating and presenting, to a user, a notification about the incomplete structure; and completing the structure based on at least one graph pattern in the plurality of graph rules.

7. A computer system including instructions recorded on a non-transitory computer-readable medium and executable by at least one processor, the system comprising, a normalization engine configured to cause the at least one processor to validate and clean data representing entities and relationships linking the entities, the normalization engine including: a rule definition module configured to define a plurality of graph rules describing searchable characteristics for the data representing the entities and the relationships linking the entities, the searchable characteristics defining one or more patterns that specify how entities and relationships linking the entities are expected to be modeled and interrelated; a rule application module configured to apply, in a continuous mode, the plurality of graph rules to ensure one or more sets of entities and the relationships linking the entities satisfy the plurality of graph rules including comparing the one or more patterns to a structure associated with the data representing entities and relationships linking the entities to determine whether the one or more patterns match the structure associated with the data representing the entities and relationships linking the entities; and a rule action module configured to perform one or more actions on the data and on the structure associated with the one or more sets of entities and the relationships linking the entities, the actions being performed to update the data representing the entities and relationships linking the entities that violate one or more of the plurality of graph rules, the update including cleaning the data by performing at least one of modifying one or more relationships linking the entities, wherein the update is performed in response to determining that the structure matches at least one of the one or more patterns.

8. The system of claim 7, wherein the one or more actions further comprise automatically completing the incomplete structure.

9. The system of claim 7, wherein determining whether the structure associated with the data matches at least one of the one or more patterns further includes matching a pattern associated with the graph rules to the at least one structure; and in response to determining that the at least one structure does not match the pattern, generating and presenting, to a user, a notification regarding a violation of one or more graph rules.

10. The system of claim 7, wherein the one or more graph rules describing searchable characteristics include describing at least one semantic property related to the data representing the entities and the relationships linking the entities.

11. The system of claim 7, wherein the one or more graph rules describing searchable characteristics include specifying at least one of an inclusion dependency and an exclusion dependency related to the data representing the entities and the relationships linking the entities.

12. The system of claim 7, wherein the actions include automatically correcting the structure and the one or more sets of entities and the relationships linking the entities.

13. The system of claim 7, wherein: the rule application module is further configured to identify one or more discrepancies between the one or more graph rules and the structure associated with the entities and the relationships linking the entities, detect that at least one of the entities and the relationships linking the entities is incomplete; and the rule action module is further configured to perform one or more actions to mitigate the one or more discrepancies, the one or more actions including generating and presenting, to a user, a notification about the incomplete entities and the relationships linking the entities.

14. The system of claim 13, wherein identifying one or more discrepancies includes at least one of flagging and logging the one or more discrepancies as exceptions for analysis.

15. The system of claim 7, wherein performing the one or more actions to update the one or more sets of entities and the relationships linking the entities includes modifying one or more entities, generating one or more additional entities, and generating one or more additional relationships linking the entities.

16. A computer program product, the computer program product being tangibly embodied on a non-transitory computer-readable medium and comprising instructions that, when executed by at least one processor, are configured to: retrieving, from a plurality of data providers, data representing entities and relationships linking the entities; defining, a plurality of graph rules defined in a graph model, the graph rules including one or more graph patterns associated with the graph model, the graph model specifying how the data is expected to be patterned and interrelated; searching the data representing entities and relationships linking the entities using the one or more graph patterns associated with the graph model, the searching including comparing the one or more graph patterns to a structure associated with the data; and in response to determining that the structure associated with the data matches at least one of the one or more graph patterns associated with the graph model, applying, in a continuous mode, the one or more graph rules by
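
Claim 1 describes selecting an anchor entity, walking the graph from it to find rule violations, and correcting both the data and its structure. The sketch below shows one way such a walk could look. It is an illustration under stated assumptions, not the claimed method: the breadth-first traversal, the `walk_and_correct` function, the `required` mapping (a simple inclusion dependency of the kind mentioned in claim 11), and the placeholder-entity correction are all hypothetical choices made for this example.

```python
from collections import deque


def walk_and_correct(entities, relationships, anchor, required):
    """Breadth-first walk from `anchor`; returns the violations it corrected.

    entities:      dict mapping entity id -> kind (e.g. "software", "host")
    relationships: set of (source id, relationship type, target id)
    required:      dict mapping kind -> (relationship type, target kind);
                   an assumed form of inclusion dependency
    """
    violations = []
    seen = {anchor}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        rule = required.get(entities[node])
        if rule is not None:
            rel_type, target_kind = rule
            satisfied = any(
                s == node and t == rel_type and entities.get(d) == target_kind
                for s, t, d in relationships
            )
            if not satisfied:
                violations.append((node, rel_type, target_kind))
                # Assumed correction: complete the structure by generating a
                # placeholder entity and the missing relationship.
                placeholder = f"{node}:{target_kind}"
                entities[placeholder] = target_kind
                relationships.add((node, rel_type, placeholder))
        # Follow relationships in either direction to reach neighbours.
        for s, _, d in list(relationships):
            if s == node:
                nxt = d
            elif d == node:
                nxt = s
            else:
                continue
            if nxt in entities and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return violations


entities = {"svc": "service", "app1": "software", "app2": "software", "host1": "host"}
relationships = {
    ("svc", "depends_on", "app1"),
    ("svc", "depends_on", "app2"),
    ("app1", "runs_on", "host1"),
}
# Assumed rule: every "software" entity must run on some "host" entity.
required = {"software": ("runs_on", "host")}

found = walk_and_correct(entities, relationships, "svc", required)
```

In this sketch only entities reachable from the anchor are checked, so a continuous mode in the sense of the claims would presumably rerun the walk as data arrives or from several anchors; that scheduling is not modeled here.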

Assignees

BMC Software Inc

Inventors

Classifications

  • Information retrieval; Database structures therefor; File system structures therefor · CPC title

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • G06F16/215 · Primary

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • Updating · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9613070B2 cover?
Systems and methods for normalizing data that represents entities and the relationships linking them. Graph rules describing searchable characteristics of that data are defined and applied to it, matching instances between the rules and the data are identified, and one or more actions are performed to update those matching instances.
Who is the assignee on this patent?
BMC Software Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/215. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 4, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).