Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06F16/215. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 23, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Generating tables based upon data extracted from tree-structured documents

US10691655B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10691655-B2
Application number	US-201615299312-A
Country	US
Kind code	B2
Filing date	Oct 20, 2016
Priority date	Oct 20, 2016
Publication date	Jun 23, 2020
Grant date	Jun 23, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various technologies pertaining to extracting data encoded in a tree-structured document and generating a table based upon the extracted data are described herein. In a first embodiment, the table is generated without requiring input from a data cleaner. In a second embodiment, the table is generated based upon examples set forth by a data cleaner.

First claim

Opening claim text (preview).

What is claimed is:
1. A computing system comprising: at least one processor; and memory that stores a data cleaning tool, wherein the data cleaning tool, when executed by the at least one processor, is configured to: load a tree-structured document into the memory; receive a request to generate tabular data based upon the tree-structured document; responsive to receiving the request, select a conversion scheme from amongst a plurality of potential conversion schemes, the selected conversion scheme is configured to generate the tabular data when the tree-structured document is received as input to the conversion scheme, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon historic structure of tabular data in an enterprise division of a user who initiated the request; and generate the tabular data based upon the selected conversion scheme.
2. The computing system of claim 1, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon a computer-implemented model of user behavior with respect to generation of tabular data from tree-structured documents.
3. The computing system of claim 1, the data cleaning tool is further configured to: prior to selecting the conversion scheme from amongst the plurality of potential conversion schemes, construct a schema based upon a structure of the tree-structured document; and select the conversion scheme from amongst the plurality of potential conversion schemes based upon the constructed schema.
4. The computing system of claim 1, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list of the same length as the first list, and further wherein the selected conversion scheme is configured to merge items in the first list with items in the second list such that a row-based entry in the tabular data includes a first item from the first list and a second item from the second list.
5. The computing system of claim 4, wherein the selected conversion scheme, when applied to the tree-structured document, is configured to merge items from the first list with items the second list that are at the same level in a hierarchy of the tree-structured document.
6. The computing system of claim 1, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list, and further wherein the selected conversion scheme is configured to generate a cross product of the first list and the second list, such that a column in the tabular data includes the cross product of the first list and the second list.
7. The computing system of claim 6, wherein the selected conversion scheme is configured to generate the cross product of the first list and the second list only if the first list and the second list are at a same depth in the tree-structured document.
8. The computing system of claim 1, wherein the tree-structured document is one of a JSON document or an XML document.
9. The computing system of claim 1, the data cleaning tool is further configured to: prior to selecting the conversion scheme from the plurality of potential conversion schemes, receive, from a second user, a selection of a portion of the tree-structured document; and responsive to receiving the selection of the portion of the tree-structured document and based upon the portion of the tree-structured document, select the conversion scheme from the plurality of potential conversion schemes.
10. A computer-readable storage medium comprising instructions that, when executed by a processor, cause the processor to perform acts comprising: loading a JSON document into memory; receiving a request to generate tabular data based upon the JSON document; responsive to receiving the request, learning a schema for the JSON document based upon a structure of the JSON document; using the schema, selecting a conversion scheme from amongst a plurality of possible conversion schemes, wherein the conversion scheme, when receiving the JSON document as input, generates tabular data based upon at least a portion of the JSON document, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon historic structure of tabular data in an enterprise division of a user who initiated the request; and generating tabular data based upon the selected conversion scheme.
11. The computer-readable storage medium of claim 10, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon a computer-implemented model of user behavior with respect to generation of tabular data from tree-structured documents.
12. A method executed by a processor of a computing system, the method comprising: loading a tree-structured document into memory of the computing system; receiving a request to generate tabular data based upon the tree-structured document; responsive to receiving the request, selecting a conversion scheme from amongst a plurality of potential conversion schemes, the selected conversion scheme is configured to generate the tabular data when the tree-structured document is received as input to the conversion scheme, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon historic structure of tabular data in an enterprise division of a user who initiated the request; and generating the tabular data based upon the selected conversion scheme.
13. The method of claim 12, wherein the conversion scheme is selected from amongst the plurality of potential conversion schemes based upon a computer-implemented model of user behavior with respect to generation of tabular data from tree-structured documents.
14. The method of claim 12, further comprising: prior to selecting the conversion scheme from amongst the plurality of potential conversion schemes, constructing a schema based upon a structure of the tree-structured document; and selecting the conversion scheme from amongst the plurality of potential conversion schemes based upon the constructed schema.
15. The method of claim 12, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list of the same length as the first list, and further wherein the selected conversion scheme is configured to merge items in the first list with items in the second list such that a row-based entry in the tabular data includes a first item from the first list and a second item from the second list.
16. The method of claim 15, wherein the selected conversion scheme, when applied to the tree-structured document, is configured to merge items from the first list with items the second list that are at the same level in a hierarchy of the tree-structured document.
17. The method of claim 12, wherein the tree-structured document comprises a first record and a second record, the first record includes a first field and the second record includes a second field, the first field includes a first list and the second field includes a second list, and further wherein th
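The difference between the "merge" conversion in claims 4 and 15 and the "cross product" conversion in claims 6 and 17 can be illustrated with a minimal sketch. This is not the patented implementation: the sample JSON document, the field names `names` and `scores`, and the helpers `merge_lists` and `cross_lists` are all hypothetical, chosen only to show how two lists found in separate records could become table rows under each scheme.

```python
import json
from itertools import product

# Hypothetical tree-structured document: two records, each holding one list field.
SAMPLE = '{"records": [{"names": ["a", "b"]}, {"scores": [1, 2]}]}'


def merge_lists(first, second):
    """Pair items by position, so each row holds one item from each list.

    Mirrors the idea of merging two lists of the same length; raising on a
    length mismatch is an assumption of this sketch.
    """
    if len(first) != len(second):
        raise ValueError("lists must have the same length to be merged")
    return list(zip(first, second))


def cross_lists(first, second):
    """Return every combination of one item from each list (cross product)."""
    return list(product(first, second))


doc = json.loads(SAMPLE)
names = doc["records"][0]["names"]
scores = doc["records"][1]["scores"]

merged_rows = merge_lists(names, scores)    # one row per list position
crossed_rows = cross_lists(names, scores)   # one row per combination

for row in merged_rows:
    print("merged:", row)
for row in crossed_rows:
    print("crossed:", row)
```

Merging yields as many rows as each list has items, while the cross product yields the product of the two list lengths, which is one reason a tool might need to choose between candidate conversion schemes.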

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06F16/215 (Primary)
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
G06F16/258
Data format conversion from or to a database
G06F16/2282
Tablespace storage structures; Management thereof
G06F16/83
Querying

Patent family

Related publications grouped by family.

View patent family 60183142


Related patents

Activity information schema discovery and schema change detection and notification

Framework for data extraction by examples

Extracting relational data from semi-structured spreadsheets
