Transformation in tabular data cleaning tool

US11645453B2 · US · B2

Patent metadata
Publication number: US-11645453-B2
Application number: US-202217808239-A
Country: US
Kind code: B2
Filing date: Jun 22, 2022
Priority date: Jun 1, 2018
Publication date: May 9, 2023
Grant date: May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system including first computer memory storing a full data set representable in rows and columns, a second computer memory storing executable instructions, and processors configured to execute the instructions to cause presentation of data of the full data set on a display including columns of data each having data fields, receive user input identifying a column of the data set, determine items to modify in information in the data fields of the identified column, generate and cause display of an indication of a proposed change action to modify the determined items, and in response to a user input indicating a selection of the indication of the proposed change action, update the presentation of the data based on the change action to modify information displayed in the data fields of the identified column of the data, and store a log of the change action.
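The flow in the abstract (identify a column, determine items to modify, propose a change action, update the view on selection, and log the action) can be illustrated with a minimal sketch. This is not the patented implementation: the `ChangeAction` record, the `propose_actions` and `apply_action` helpers, and the heuristic of rewriting minority spellings to the most frequent variant are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeAction:
    """Hypothetical record of one proposed edit: replace `old` with `new` in `column`."""
    column: str
    old: str
    new: str


def propose_actions(rows, column):
    """Group values that match after trimming and case-folding, and propose
    rewriting each minority variant to the most frequent spelling (assumed heuristic)."""
    groups = {}
    for row in rows:
        value = row[column]
        groups.setdefault(value.strip().casefold(), Counter())[value] += 1
    actions = []
    for variants in groups.values():
        canonical, _ = variants.most_common(1)[0]
        for value in variants:
            if value != canonical:
                actions.append(ChangeAction(column, value, canonical))
    return actions


def apply_action(rows, action):
    """Return a new list of rows with the action applied; the input list is untouched."""
    return [
        {**row, action.column: action.new} if row[action.column] == action.old else row
        for row in rows
    ]


rows = [
    {"city": "Boston"},
    {"city": "boston "},
    {"city": "Boston"},
    {"city": "Denver"},
]
log = []
view = rows
for action in propose_actions(rows, "city"):
    # In the described system the user selects the proposal; here each one is accepted.
    view = apply_action(view, action)
    log.append(action)
```

Keeping the presented `view` separate from `rows` mirrors the abstract's distinction between updating the presentation and storing a log of the change action.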

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: one or more computer processors configured to execute computer-executable instructions to cause the system to at least: generate and cause display, on a display device, data arranged in tabular format including a plurality of columns and rows representing a subset of data that is a portion of a first data set, the displayed columns showing only a portion of the first data set, the first data set including data field information; in response to a selection of a portion of the displayed columns, generate and cause display of data metrics, of the first data set, for data field information in the selected portion of the columns for all the rows including rows of the selected columns that are not displayed; generate and cause display of a change action to modify data field information of the first data set; and in response to a selection of the change action, cause update of the data metrics on the display device, the update of the data metrics reflecting changes that would be made in the first data set caused by the selected change action without modifying the first data set.

2. The system of claim 1, wherein the one or more computer hardware processors are further configured to execute the computer-executable instructions to store a log of the change action.

3. The system of claim 1, wherein the one or more computer hardware processors are further configured to execute the computer-executable instructions to: access the log; apply each change action in the log to the first data set; and save a second data set, the second data set including modifications made to the first data set based on the change action in the log.

4. The system of claim 2, wherein the one or more computer hardware processors are further configured to execute the computer-executable instructions to iteratively: receive user input indicating a selection of one of a plurality of displayed change actions; and store a log of selected change actions.

5. The system of claim 4, wherein the one or more computer hardware processors are further configured to execute the computer-executable instructions to: access the log; apply each change action in the log to the first data set; and save a second data set, the second data set including modifications made to the first data set based on the change action in the log.

6. The system of claim 4, wherein the one or more computer hardware processors are further configured to execute the computer-executable instructions to: access the log; apply each change action stored in the log to a second data set to create a modified second data set; and save the modified second data set as a third data set, the third data set including all of the modifications made to the second data set by the change actions stored in the log.

7. The system of claim 1, wherein the plurality of rows of the first data set includes more than tens thousand of rows of data.

8. The system of claim 1, wherein the change action includes a suggested correction to information in data fields of the identified column and an indicator of importance of the suggested correction.

9. The system of claim 1, wherein the change action includes modifying the data field information to at least one of: changing the spelling of a word; changing the case of letters; deleting a space; adding a space; deleting a period, comma, semi-colon, or colon; or adding a period, comma, semi-colon, or colon.

10. The system of claim 1, wherein the change action comprises replacing data fields that include first information with second information.

11. The system of claim 1, wherein the change action includes changing a data type of at least one data field in the first data set.

12. The system of claim 1, wherein cause update of the data metrics on the display comprises concatenating at least one alphanumeric character or punctuation to information in a plurality of data fields.

13. A method of preparing tabular representable data for further processing, the method comprising: generating and causing display on a display device data arranged in tabular format including a plurality of columns and rows representing a subset of data that is a portion of a first data set, the displayed columns showing only a portion of the first data set, the first data set including data field information; in response to a selection of a portion of the displayed columns, generating and causing display of data metrics, of the first data set, for data field information in the selected portion of the columns for all the rows including rows of the selected columns that are not displayed; generating and causing display of a change action to modify data field information of the first data set; and in response to a selection of the change action, causing update of the data metrics on the display device, the update of the data metrics reflecting changes that would be made in the first data set caused by the selected change action without modifying the first data set.

14. The method of claim 13, further comprising storing a log of the change action.

15. The method of claim 14, further comprising: accessing the log; applying each change action in the log to the first data set; and saving a second data set, the second data set including modifications made to the first data set based on the change action in the log.

16. The method of claim 14, further comprising: accessing the log; applying each change action stored in the log to a second data set to create a modified second data set; and saving the modified second data set as a third data set, the third data set including all of the modifications made to the second data set by the change actions stored in the log.

17. The method of claim 13, wherein the plurality of rows of the first data set includes more than tens thousand of rows of data.

18. The method of claim 13, wherein the change action includes a suggested correction to information in data fields of the identified column and an indicator of importance of the suggested correction.

19. The method of claim 13, further comprising generating and causing display of tool buttons that when selected allow filtering operations on selected data.

20. The method of claim 13, wherein the change action includes modifying the data field information to at least one of: changing the spelling of a word; changing the case of letters; deleting a space; adding a space; deleting a period, comma, semi-colon, or colon; or adding a period, comma, semi-colon, or colon.
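The claims describe three behaviors that a short sketch may make easier to follow: data metrics computed over all rows rather than only the displayed ones, a preview of how a change action would alter those metrics without modifying the first data set, and replay of a stored log to produce a second or third data set. The code below is an illustration under stated assumptions, not the claimed system: `column_metrics`, `replace_value`, `apply_log`, and the dict-based action format are hypothetical, and copying every row is a simplification that would not suit very large data sets.

```python
from collections import Counter


def column_metrics(rows, column):
    """Compute simple metrics over every row, not only the displayed ones."""
    counts = Counter(row[column] for row in rows)
    return {"rows": len(rows), "distinct": len(counts), "counts": dict(counts)}


def replace_value(column, old, new):
    """Build a change action as a plain dict so it can be stored in a log."""
    return {"op": "replace", "column": column, "old": old, "new": new}


def apply_log(rows, log):
    """Replay logged change actions onto a copy of `rows` and return the copy."""
    result = [dict(row) for row in rows]
    for action in log:
        for row in result:
            if row[action["column"]] == action["old"]:
                row[action["column"]] = action["new"]
    return result


first = [{"state": "CA"}, {"state": "Calif."}, {"state": "NY"}, {"state": "CA"}]
displayed = first[:2]  # only a portion of the data set is shown on screen

before = column_metrics(first, "state")
action = replace_value("state", "Calif.", "CA")
# Preview: metrics reflect the change, but `first` itself is not modified.
preview = column_metrics(apply_log(first, [action]), "state")

log = [action]
second = apply_log(first, log)  # as in claims 3 and 15: save a second data set
other = [{"state": "Calif."}, {"state": "TX"}]
third = apply_log(other, log)   # as in claims 6 and 16: reuse the log on other data
```

Storing actions as plain data rather than as applied edits is what lets the same log be replayed against a different data set later.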

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • Column-oriented storage; Management thereof · CPC title

  • of spreadsheets (form-filling G06F40/174) · CPC title

  • Change logging, detection, and notification (replication G06F16/27) · CPC title

  • G06F40/166 · Primary

    Editing, e.g. inserting or deleting · CPC title

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645453B2 cover?
A system including first computer memory storing a full data set representable in rows and columns, a second computer memory storing executable instructions, and processors configured to execute the instructions to cause presentation of data of the full data set on a display including columns of data each having data fields, receive user input identifying a column of the data set, determine ite…
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/166. Mapped technology areas include Physics.
When was this patent published?
Publication date May 9, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).