Managing sharable cell-based analytical notebooks

US10002163B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10002163-B2
Application number  US-201715673231-A
Country             US
Kind code           B2
Filing date         Aug 9, 2017
Priority date       Aug 18, 2016
Publication date    Jun 19, 2018
Grant date          Jun 19, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an embodiment, a data processing method comprises creating and storing a plurality of analytical notebooks in digital computer storage, wherein each of the analytical notebooks comprises notebook metadata that specifies a kernel for execution, and one or more computational cells, wherein each of the cells comprises cell metadata, a source code reference and an output reference; receiving, in association with a first cell among the one or more cells, first input specifying computer program source code of a function, wherein the function defines an input dataset, a transformation, and one or more variables associated with output data; storing the first cell, excluding the output data, using a first digital data storage system and updating the source code reference to identify the first data storage system; using the kernel specified in the notebook metadata, executing an executable version of the source code to result in generating the output data; storing the output data using a second digital data storage system that is separate from the first digital data storage system and updating the output reference to identify the second data storage system.
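
The storage split described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `InMemoryStore`, `Cell`, and `Notebook` classes, the `store://key` reference format, and the convention that a cell's source defines a function named `transform` are all assumptions made for illustration. The point it shows is that the cell's source goes to one store, the output goes to a separate store, and the cell itself holds only references to both.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


class InMemoryStore:
    """Hypothetical stand-in for a digital data storage system."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.items: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> str:
        # Store the value and return a reference that names this store.
        self.items[key] = value
        return f"{self.name}://{key}"

    def get(self, ref: str) -> Any:
        prefix = f"{self.name}://"
        if not ref.startswith(prefix):
            raise KeyError(f"{ref} does not point at store {self.name}")
        return self.items[ref[len(prefix):]]


@dataclass
class Cell:
    cell_id: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    source_ref: Optional[str] = None  # identifies the first storage system
    output_ref: Optional[str] = None  # identifies the second storage system


@dataclass
class Notebook:
    metadata: Dict[str, Any]  # assumed shape, e.g. {"kernel": "python"}
    cells: List[Cell] = field(default_factory=list)


def python_kernel(source: str) -> Any:
    # Toy kernel: run the source, then call the function it defines.
    # Assumption: the function is named `transform` and takes no arguments.
    scope: Dict[str, Any] = {}
    exec(source, scope)
    return scope["transform"]()


KERNELS: Dict[str, Callable[[str], Any]] = {"python": python_kernel}


def store_cell_source(cell: Cell, source: str, code_store: InMemoryStore) -> None:
    # Only the source is stored here; output data is excluded.
    cell.source_ref = code_store.put(cell.cell_id, source)


def execute_cell(notebook: Notebook, cell: Cell,
                 code_store: InMemoryStore, output_store: InMemoryStore) -> Any:
    source = code_store.get(cell.source_ref)
    kernel = KERNELS[notebook.metadata["kernel"]]  # kernel named in notebook metadata
    output = kernel(source)
    cell.output_ref = output_store.put(cell.cell_id, output)
    return output


SOURCE = '''
def transform():
    dataset = [1, 2, 3]                     # input dataset
    doubled = [x * 2 for x in dataset]      # transformation
    return {"doubled": doubled}             # variable associated with output data
'''

code_store = InMemoryStore("code-store")
output_store = InMemoryStore("output-store")
notebook = Notebook(metadata={"kernel": "python"})
cell = Cell(cell_id="cell-1")
notebook.cells.append(cell)

store_cell_source(cell, SOURCE, code_store)
output = execute_cell(notebook, cell, code_store, output_store)
```

After the run, `cell.source_ref` and `cell.output_ref` point at different stores, so the source can be shared or versioned without carrying the output data along with it.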

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: creating and storing a plurality of analytical notebooks in digital computer storage, wherein each of the analytical notebooks comprises notebook metadata that specifies a kernel for execution, and one or more computational cells, wherein each of the cells comprises cell metadata, a source code reference and an output reference; receiving, in association with a first cell among the one or more cells, first input specifying computer program source code of a function, wherein the function defines an input dataset, a transformation, and one or more variables associated with output data; storing the first cell, excluding the output data, using a first digital data storage system and updating the source code reference to identify the first data storage system; using the kernel specified in the notebook metadata, executing an executable version of the source code to result in generating the output data; storing the output data using a second digital data storage system that is separate from the first digital data storage system and updating the output reference to identify the second data storage system; hosting a first analytical notebook from among the plurality of analytical notebooks in a first user container of a containerized program execution system in a virtual computing environment, and hosting a second analytical notebook from among the plurality of analytical notebooks in a second user container of the same containerized program execution system, wherein the second user container is different than the first user container, wherein the method is performed using one or more processors.

2. The method of claim 1, further comprising starting execution of a first plurality of different execution kernels in the first user container and starting execution of a second plurality of different execution kernels in the second user container.

3. The method of claim 1, further comprising: determining names of variables that are then currently in memory representing a local scope of the first cell and obtaining then-current values of the variables; generating and displaying a view of the names of the variables and the then-current values of the variables in a user interface that also shows the first cell.

4. The method of claim 1, further comprising: generating and displaying a text entry box associated with a search function; receiving a search term in the text entry box; searching one or more data repositories to identify one or more items of cell metadata that matches the search term; generating and displaying a list of functions of cells or notebooks which functions are associated with cell metadata that matches the search term.

5. The method of claim 1 wherein the notebook metadata specifies any of R, PYTHON or MATLAB as the kernel for execution.

6. A method comprising: creating and storing a plurality of analytical notebooks in digital computer storage, wherein each of the analytical notebooks comprises notebook metadata that specifies a kernel for execution, and one or more computational cells, wherein each of the cells comprises cell metadata, a source code reference and an output reference; receiving, in association with a first cell among the one or more cells, first input specifying computer program source code of a function, wherein the function defines an input dataset, a transformation, and one or more variables associated with output data; storing the first cell, excluding the output data, using a first digital data storage system and updating the source code reference to identify the first data storage system; using the kernel specified in the notebook metadata, executing an executable version of the source code to result in generating the output data; storing the output data using a second digital data storage system that is separate from the first digital data storage system and updating the output reference to identify the second data storage system; creating and storing, as part of the cell metadata, a library versionset value that represents all program code libraries and all version numbers of the program code libraries on which the source depends; and creating and storing, as part of the cell metadata, a dataset versionset value that represents version values for one or more datasets that the source code specifies as input sources, wherein the method is performed using one or more processors.

7. The method of claim 6, further comprising: receiving input requesting to execute the first cell; determining whether the first cell is connected to program code libraries having version numbers that correspond to the library versionset value in the cell metadata of the first cell; performing one or more generating a notification message blocking execution of the first cell when the first cell is connected to program code libraries having version numbers that do not correspond to the library versionset value in the cell metadata of the first cell.

8. The method of claim 7, further comprising: determining whether the first cell is connected to one or more datasets that the source code specifies as input sources and having dataset version values that match a dataset versionset value in the cell metadata of the first cell; performing one or more of generating a notification message or blocking execution of the first cell when the first cell is connected to one or more datasets having dataset version numbers that do not correspond to the dataset versionset value in the cell metadata of the first cell.

9. A method comprising: creating and storing a plurality of analytical notebooks in digital computer storage, wherein each of the analytical notebooks comprises notebook metadata that specifies a kernel for execution, and one or more computational cells, wherein each of the cells comprises cell metadata, a source code reference and an output reference; receiving, in association with a first cell among the one or more cells, first input specifying computer program source code of a function, wherein the function defines an input dataset, a transformation, and one or more variables associated with output data; storing the first cell, excluding the output data, using a first digital data storage system and updating the source code reference to identify the first data storage system; using the kernel specified in the notebook metadata, executing an executable version of the source code to result in generating the output data; storing the output data using a second digital data storage system that is separate from the first digital data storage system and updating the output reference to identify the second data storage system; receiving input that is associated with adding a data entry dashboard to the first cell; in response to the input, automatically creating and displaying a data entry dashboard in association with the first cell, wherein the data entry dashboard comprises a graphical user input panel having a plurality of user interface widgets, wherein each of the user interface widgets matches a data type of a variable that is defined in the source code; receiving a plurality of data values in the user interface widgets; causing re-execution of the source code of the first cell using the plurality of data values that were received via the user interface widgets to result in generating updated output data based on the plurality of data values, wherein the method is performed using one or more processors.

10. A computer system comprising: one or more processors; one or more non-transitory computer-readable storage media storing instructions which, when executed by the one or more processors, cause the one or more processors to pe
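
The version check recited in claims 6 through 8 can be sketched as follows. This is a rough illustration under stated assumptions, not the claimed method itself: the claim text shown here does not say how a versionset value is computed, so the sketch assumes a SHA-256 digest over sorted `name==version` pairs, and the function names, metadata keys, and library and dataset versions are all hypothetical.

```python
import hashlib
from typing import Any, Dict, List


def versionset(versions: Dict[str, str]) -> str:
    """Collapse a name -> version mapping into one comparable value.

    Assumption: a SHA-256 digest of the sorted pairs stands in for the
    "versionset value"; the source does not specify an encoding.
    """
    canonical = "\n".join(f"{name}=={ver}" for name, ver in sorted(versions.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_mismatches(cell_metadata: Dict[str, str],
                    current_libraries: Dict[str, str],
                    current_datasets: Dict[str, str]) -> List[str]:
    """Compare what the cell is connected to now against what was recorded."""
    problems: List[str] = []
    if versionset(current_libraries) != cell_metadata["library_versionset"]:
        problems.append("library versions do not match the recorded library versionset")
    if versionset(current_datasets) != cell_metadata["dataset_versionset"]:
        problems.append("dataset versions do not match the recorded dataset versionset")
    return problems


def request_execution(cell_metadata: Dict[str, str],
                      current_libraries: Dict[str, str],
                      current_datasets: Dict[str, str],
                      block_on_mismatch: bool = True) -> Dict[str, Any]:
    """Notify on a mismatch, and block execution if configured to do so."""
    problems = find_mismatches(cell_metadata, current_libraries, current_datasets)
    executed = not (problems and block_on_mismatch)
    return {"executed": executed, "notifications": problems}


# Metadata recorded when the cell was last saved (hypothetical versions).
recorded = {
    "library_versionset": versionset({"numpy": "1.13.1", "pandas": "0.20.3"}),
    "dataset_versionset": versionset({"sales": "v7"}),
}

# Same versions, listed in a different order: the check passes.
matching = request_execution(
    recorded, {"pandas": "0.20.3", "numpy": "1.13.1"}, {"sales": "v7"})

# One library has moved on: execution is blocked and a notification is produced.
drifted = request_execution(
    recorded, {"numpy": "1.14.0", "pandas": "0.20.3"}, {"sales": "v7"})
```

Hashing the sorted pairs makes the comparison independent of the order in which libraries or datasets are listed. A real system might store the full mapping instead, so that a notification could name the specific library or dataset that changed.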

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • for implementing user interfaces · CPC title

  • LOAD or STORE instructions; Clear instruction · CPC title

  • Physics · mapped topic

  • Logic programming, e.g. PROLOG programming language · CPC title

  • G06F8/20 · Primary

    Software design · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10002163B2 cover?
In an embodiment, a data processing method comprises creating and storing a plurality of analytical notebooks in digital computer storage, wherein each of the analytical notebooks comprises notebook metadata that specifies a kernel for execution, and one or more computational cells, wherein each of the cells comprises cell metadata, a source code reference and an output reference; receiving, in…
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30516. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 19, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).