Systems and methods for preparing raw data for use in data visualizations

US10248720B1 · US · B1

Patent metadata
Publication number: US-10248720-B1
Application number: US-201514801748-A
Country: US
Kind code: B1
Filing date: Jul 16, 2015
Priority date: Jul 16, 2015
Publication date: Apr 2, 2019
Grant date: Apr 2, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for preparing raw data for use in data visualizations are described herein. A method includes selecting a sample of data values from an existing data column in a data source, identifying a first delimiting location within the sample, creating first and second new data values by splitting each data value in the existing data column at the first identified delimiting location, storing the first and second new data values in first and second new data columns, respectively, and assigning field names to the first and second new data columns. The method also includes displaying a schema information region, which includes the assigned field names, in a data visualization user interface. The method displays a data visualization that includes first and/or second new data values based on user selection of the assigned field names of the first or second new data columns.
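The sample, detect, and split sequence in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names `find_consistent_delimiter` and `split_column`, the 0.9 threshold standing in for "a substantial majority", and the example column are all assumptions made for illustration.

```python
import re
from collections import Counter


def find_consistent_delimiter(sample, threshold=0.9):
    """Return the non-alphanumeric run found in at least `threshold`
    of the sample values, or None if no run is that common.

    The threshold is an assumed stand-in for "a substantial majority".
    """
    counts = Counter()
    for value in sample:
        # Count each distinct delimiter run once per value.
        for run in set(re.findall(r"[^A-Za-z0-9]+", value)):
            counts[run] += 1
    if not counts:
        return None
    delimiter, hits = counts.most_common(1)[0]
    return delimiter if hits / len(sample) >= threshold else None


def split_column(values, delimiter):
    """Split every value at the first occurrence of `delimiter`."""
    first, second = [], []
    for value in values:
        left, sep, right = value.partition(delimiter)
        first.append(left)
        # Values that lack the delimiter get no second part.
        second.append(right if sep else None)
    return first, second


# Hypothetical existing data column.
column = ["Smith, John", "Doe, Jane", "Lee, Ann", "Garcia, Luis"]

# Take a sample of distinct values, then split the whole column.
sample = list(dict.fromkeys(column))[:3]
delimiter = find_consistent_delimiter(sample)
last_names, first_names = split_column(column, delimiter)
```

The delimiter is chosen from the sample only, while the split is applied to every value in the column, which mirrors the order of steps in the abstract. Assigning field names and rendering the visualization are left out of the sketch.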

First claim

Opening claim text (preview).

What is claimed is:

1. A method of preparing data for use in data visualizations, comprising: at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors: selecting a sample of data values from an existing data column in a data source, wherein the sample includes a plurality of distinct data values; identifying a first delimiting location within the sample of data values, wherein identifying the first delimiting location includes: identifying a plurality of delimiting locations that includes the first delimiting location; and generating statistics describing occurrence of consistent delimiters within the sample of data values, each consistent delimiter comprising one or more non-alphanumeric characters that occur in a same sequence in a substantial majority of the sample of data values; creating first new data values and second new data values by splitting each data value in the existing data column at the first identified delimiting location; storing the first new data values in a first new data column in the data source and storing the second new data values in a second new data column in the data source; assigning field names to the first and second new data columns; displaying a schema information region in a data visualization user interface, wherein the schema information region includes field names of data columns in the data source, including the assigned field names of the first and second new data columns; and displaying a data visualization in the data visualization user interface including one or more of the first new data values based on user selection of the assigned field name of the first new data column from the schema information region.

2. The method of claim 1, further comprising: assigning a first data type to the first new data column and a second data type to the second new data column, wherein the first data type is distinct from the second data type.

3. The method of claim 2, wherein the first data type is a date data type and the method further comprises for each first new data value in the first new data column, identifying component date-parts using range and cardinality statistics.

4. The method of claim 1, wherein assigning field names to the first and second new data columns includes receiving user input to assign field names to the first and second new data columns.

5. The method of claim 1, wherein assigning field names to the first and second new data columns includes automatically assigning default field names to the first and second new data columns based on metadata associated with the first and second new data values, respectively.

6. The method of claim 5, further comprising: modifying at least one of the assigned default field names based on user input.

7. A computing device, comprising: one or more processors; memory; and one or more programs stored in the memory for execution by the one or more processors, the one or more programs comprising instructions for: selecting a sample of data values from an existing data column in a data source, wherein the sample includes a plurality of distinct data values; identifying a first delimiting location within the sample of data values, wherein identifying the first delimiting location includes: identifying a plurality of delimiting locations that includes the first delimiting location; and generating statistics describing occurrence of consistent delimiters within the sample of data values, each consistent delimiter comprising one or more non-alphanumeric characters that occur in a same sequence in a substantial majority of the sample of data values; creating first new data values and second new data values by splitting each data value in the existing data column at the first identified delimiting location; storing the first new data values in a first new data column in the data source and storing the second new data values in a second new data column in the data source; assigning field names to the first and second new data columns; displaying a schema information region in a data visualization user interface, wherein the schema information region includes field names of data columns in the data source, including the assigned field names of the first and second new data columns; and displaying a data visualization in the data visualization user interface including one or more of the first new data values based on user selection of the assigned field name of the first new data column from the schema information region.

8. A method of preparing data for use in data visualizations, comprising: at a computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors: receiving a first indication from a user that a first data column within a data source contains integer-only values representing unformatted dates; determining a minimum value and a maximum value of the integer-only values in the first data column; matching the integer-only values in the first data column to a first date format based on a determination that a range of values associated with the first date format includes the minimum value and the maximum value of the integer-only values in the first data column; selecting a sample of the integer-only values in the first data column and applying the first date format to each integer-only value within the sample of the integer-only values to produce a respective formatted date; applying the first date format to each integer-only value of the integer-only values in the first data column to produce a respective formatted date; storing, in the data source, each formatted date in a first new data column in a respective row corresponding to the respective integer-only value; displaying a schema information region in a data visualization user interface, wherein the schema information region includes a field name of the first new data column; and displaying a data visualization in the data visualization user interface including at least one formatted date based on user selection of the field name of the first new data column from the schema information region.

9. The method of claim 8, wherein applying the first date format to each integer-only value of the integer-only values in the first data column includes: determining whether range and cardinality statistics associated with the first date format describe component date-parts of the formatted dates; and in accordance with a determination that the range and cardinality statistics associated with the first date format describe component date-parts of the formatted dates, applying the first date format to each integer-only value of the integer-only values in the first data column and storing each formatted date in the first new data column.

10. The method of claim 8, further comprising: receiving a second indication that a second data column within the data source contains integer-only values representing unformatted dates; applying the first date format to each integer-only value of the integer-only values in the second data column to produce a respective formatted date; and storing each formatted date in a second new data column in the data source.

11. The method of claim 10, wherein applying the first date format to each integer-only value of the integer-only values in the second data column includes: determining a minimum value and a maximum value of the integer-only values in the second data column; and in accordance with a determination that the range of values associated with the first date format includes the minimum value and the maximum va
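The range-matching step recited in claim 8 (find the minimum and maximum of the integer-only values, then pick a date format whose value range contains both) can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation. The `CANDIDATE_FORMATS` table, its numeric bounds, the function names, and the example column are hypothetical.

```python
from datetime import datetime

# Hypothetical candidate formats: (label, strptime pattern, lowest, highest).
# The bounds are assumptions chosen for this sketch.
CANDIDATE_FORMATS = [
    ("YYYY", "%Y", 1000, 9999),
    ("YYYYMM", "%Y%m", 100001, 999912),
    ("YYYYMMDD", "%Y%m%d", 10000101, 99991231),
]


def match_date_format(values):
    """Return (label, pattern) for the first format whose range
    contains both the minimum and the maximum value, else None."""
    lowest, highest = min(values), max(values)
    for label, pattern, low, high in CANDIDATE_FORMATS:
        if low <= lowest and highest <= high:
            return label, pattern
    return None


def apply_format(values, pattern):
    """Parse each integer as a date using the matched pattern."""
    return [datetime.strptime(str(v), pattern).date() for v in values]


# Hypothetical column of integer-only values representing dates.
column = [20150716, 20190402, 20161231]

label, pattern = match_date_format(column)
preview = apply_format(column[:2], pattern)   # apply to a sample first
formatted = apply_format(column, pattern)     # then to every value
```

Applying the format to a small sample before the full column follows the order of steps in claim 8. The range and cardinality check on component date-parts from claim 9 is not modeled here.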

Assignees

Tableau Software Inc

Inventors

Classifications

  • Physics · mapped topic


  • Browsing; Visualisation therefor · CPC title

  • G06F16/34 · Primary

    Browsing; Visualisation therefor (browsing or visualisation for clustering or classification G06F16/358) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10248720B1 cover?
Systems and methods for preparing raw data for use in data visualizations are described herein. A method includes selecting a sample of data values from an existing data column in a data source, identifying a first delimiting location within the sample, creating first and second new data values by splitting each data value in the existing data column at the first identified delimiting location,…
Who is the assignee on this patent?
Tableau Software Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30716. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 2, 2019 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).