Activity information schema discovery and schema change detection and notification
US-2016042015-A1 · Feb 11, 2016 · US
US10691655B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10691655-B2 |
| Application number | US-201615299312-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 20, 2016 |
| Priority date | Oct 20, 2016 |
| Publication date | Jun 23, 2020 |
| Grant date | Jun 23, 2020 |
Official abstract text for this publication.
Various technologies pertaining to extracting data encoded in a tree-structured document and generating a table based upon the extracted data are described herein. In a first embodiment, the table is generated without requiring input from a data cleaner. In a second embodiment, the table is generated based upon examples set forth by a data cleaner.
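As an illustration of the general idea in the abstract, the sketch below turns a small JSON document into a table by flattening nested objects into dotted column names. This is one simple conversion chosen for illustration, not the method the patent discloses; the names `flatten` and `to_table` and the sample document are hypothetical.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts into one dict keyed by dotted paths."""
    if isinstance(node, dict):
        row = {}
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            row.update(flatten(value, path))
        return row
    # Scalars (and lists) are kept unchanged as a single cell value.
    return {prefix: node}


def to_table(document):
    """Parse a JSON array of records and return (columns, rows)."""
    records = json.loads(document)
    flat_rows = [flatten(record) for record in records]
    # Collect column names in first-seen order across all records.
    columns = []
    for row in flat_rows:
        for column in row:
            if column not in columns:
                columns.append(column)
    # Records that lack a column get None in that cell.
    rows = [[row.get(column) for column in columns] for row in flat_rows]
    return columns, rows


# Hypothetical sample document, for illustration only.
doc = (
    '[{"id": 1, "user": {"name": "Ann", "city": "Oslo"}},'
    ' {"id": 2, "user": {"name": "Bo"}}]'
)
columns, rows = to_table(doc)
```

Here the second record has no `user.city` value, so its cell is left empty (`None`) rather than dropping the column.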
Opening claim text (preview).
What is claimed is:
1. A computing system comprising: at least one processor; and memory that stores a data cleaning tool, wherein the data cleaning tool, when executed by the at least one processor, is configured to: load a tree-structured document into the memory; receive a request to generate tabular data based upon the tree-structured document; responsive to receiving the request, select a conversion scheme from amongst a plurality of potential conversion schemes, the selected conversion scheme is configured to generate the tabular data when the tree-structured document is received as input to the conversion scheme, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon historic structure of tabular data in an enterprise division of a user who initiated the request; and generate the tabular data based upon the selected conversion scheme.
2. The computing system of claim 1, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon a computer-implemented model of user behavior with respect to generation of tabular data from tree-structured documents.
3. The computing system of claim 1, the data cleaning tool is further configured to: prior to selecting the conversion scheme from amongst the plurality of potential conversion schemes, construct a schema based upon a structure of the tree-structured document; and select the conversion scheme from amongst the plurality of potential conversion schemes based upon the constructed schema.
4. The computing system of claim 1, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list of the same length as the first list, and further wherein the selected conversion scheme is configured to merge items in the first list with items in the second list such that a row-based entry in the tabular data includes a first item from the first list and a second item from the second list.
5. The computing system of claim 4, wherein the selected conversion scheme, when applied to the tree-structured document, is configured to merge items from the first list with items the second list that are at the same level in a hierarchy of the tree-structured document.
6. The computing system of claim 1, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list, and further wherein the selected conversion scheme is configured to generate a cross product of the first list and the second list, such that a column in the tabular data includes the cross product of the first list and the second list.
7. The computing system of claim 6, wherein the selected conversion scheme is configured to generate the cross product of the first list and the second list only if the first list and the second list are at a same depth in the tree-structured document.
8. The computing system of claim 1, wherein the tree-structured document is one of a JSON document or an XML document.
9. The computing system of claim 1, the data cleaning tool is further configured to: prior to selecting the conversion scheme from the plurality of potential conversion schemes, receive, from a second user, a selection of a portion of the tree-structured document; and responsive to receiving the selection of the portion of the tree-structured document and based upon the portion of the tree-structured document, select the conversion scheme from the plurality of potential conversion schemes.
10. A computer-readable storage medium comprising instructions that, when executed by a processor, cause the processor to perform acts comprising: loading a JSON document into memory; receiving a request to generate tabular data based upon the JSON document; responsive to receiving the request, learning a schema for the JSON document based upon a structure of the JSON document; using the schema, selecting a conversion scheme from amongst a plurality of possible conversion schemes, wherein the conversion scheme, when receiving the JSON document as input, generates tabular data based upon at least a portion of the JSON document, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon historic structure of tabular data in an enterprise division of a user who initiated the request; and generating tabular data based upon the selected conversion scheme.
11. The computer-readable storage medium of claim 10, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon a computer-implemented model of user behavior with respect to generation of tabular data from tree-structured documents.
12. A method executed by a processor of a computing system, the method comprising: loading a tree-structured document into memory of the computing system; receiving a request to generate tabular data based upon the tree-structured document; responsive to receiving the request, selecting a conversion scheme from amongst a plurality of potential conversion schemes, the selected conversion scheme is configured to generate the tabular data when the tree-structured document is received as input to the conversion scheme, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon historic structure of tabular data in an enterprise division of a user who initiated the request; and generating the tabular data based upon the selected conversion scheme.
13. The method of claim 12, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon a computer-implemented model of user behavior with respect to generation of tabular data from tree-structured documents.
14. The method of claim 12, further comprising: prior to selecting the conversion scheme from amongst the plurality of potential conversion schemes, constructing a schema based upon a structure of the tree-structured document; and selecting the conversion scheme from amongst the plurality of potential conversion schemes based upon the constructed schema.
15. The method of claim 12, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list of the same length as the first list, and further wherein the selected conversion scheme is configured to merge items in the first list with items in the second list such that a row-based entry in the tabular data includes a first item from the first list and a second item from the second list.
16. The method of claim 15, wherein the selected conversion scheme, when applied to the tree-structured document, is configured to merge items from the first list with items the second list that are at the same level in a hierarchy of the tree-structured document.
17. The method of claim 12, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list, and further wherein th
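Claims 4 through 7 describe two ways a conversion scheme may combine a pair of lists: merging same-length lists item by item, or generating their cross product. The sketch below shows both operations on plain Python lists. It is a simplified illustration, not the claimed implementation: the function names and sample values are hypothetical, and the same-level and same-depth conditions of claims 5 and 7 are not modeled.

```python
from itertools import product


def merge_lists(first, second):
    """Pair items position by position, one row per pair.

    Assumes both lists have the same length, as in claim 4.
    """
    if len(first) != len(second):
        raise ValueError("lists must have the same length to be merged")
    return [list(pair) for pair in zip(first, second)]


def cross_product(first, second):
    """Return every combination of an item from each list, one row each."""
    return [list(pair) for pair in product(first, second)]


# Hypothetical sample lists, for illustration only.
sizes = ["S", "M"]
colors = ["red", "blue"]

merged = merge_lists(sizes, colors)
crossed = cross_product(sizes, colors)
```

With these inputs, `merged` holds two rows (`["S", "red"]` and `["M", "blue"]`), while `crossed` holds all four size and color combinations.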
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Data format conversion from or to a database · CPC title
Tablespace storage structures; Management thereof · CPC title
Querying · CPC title