Translating natural language descriptions to programs in a domain-specific language for spreadsheets

US9330090B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9330090-B2
Application number  US-201313753507-A
Country             US
Kind code           B2
Filing date         Jan 29, 2013
Priority date       Jan 29, 2013
Publication date    May 3, 2016
Grant date          May 3, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method to translate natural language descriptions to programs in a domain-specific language for spreadsheets. The method includes generating a model of a spreadsheet. The model includes a column description for each column, and one or more types associated with each column. The method also includes normalizing the description by removing stop words, and replacing parts that match column names or data values by parameterized place-holders. The method involves applying rule-based translation along with keyword or type-based program synthesis in an inter-leaved, bottom-up manner and dynamic programming style, where phrases are mapped to sub-programs in increasing order of their length. The rules describe how to map a specific partial natural language phrase into a partial sub-program. Also, the method includes generating a number of potential programs and ranking the programs to sequence them according to their intended likelihood.
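
The model-building and normalization steps described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the stop-word list, the `Column` class, the two-type inference (`number` or `text`), the `COLn`/`VALn` place-holder format, and the restriction to single-word matches.

```python
from dataclasses import dataclass, field

# Hypothetical stop-word list; the patent does not enumerate one.
STOP_WORDS = {"the", "of", "all", "for", "in", "a", "an", "is", "are", "where", "that"}


@dataclass
class Column:
    name: str                                   # lower-cased column header
    types: set                                  # one or more types for the column
    values: set = field(default_factory=set)    # distinct text values, lower-cased


def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def build_model(header, rows):
    """Describe each column by name, inferred type, and distinct text values."""
    model = []
    for i, name in enumerate(header):
        cells = [str(row[i]) for row in rows]
        if all(is_number(c) for c in cells):
            model.append(Column(name.lower(), {"number"}))
        else:
            model.append(Column(name.lower(), {"text"}, {c.lower() for c in cells}))
    return model


def normalize(description, model):
    """Drop stop words; replace column names and data values with place-holders."""
    column_names = {col.name for col in model}
    value_owner = {v: col.name for col in model for v in col.values}
    tokens, bindings = [], {}
    n_cols = n_vals = 0
    for word in description.lower().split():
        if word in STOP_WORDS:
            continue
        if word in column_names:
            n_cols += 1
            key = f"COL{n_cols}"
            bindings[key] = word
        elif word in value_owner:
            n_vals += 1
            key = f"VAL{n_vals}"
            bindings[key] = (value_owner[word], word)
        else:
            key = word                          # keyword left for the translation step
        tokens.append(key)
    return " ".join(tokens), bindings


model = build_model(["Name", "Dept", "Salary"],
                    [["Ann", "Sales", 100], ["Bob", "Ops", 200]])
normalized, bindings = normalize("sum the salary where dept is sales", model)
print(normalized)   # sum COL1 COL2 VAL1
```

The `bindings` dictionary records what each place-holder stood for, so a later translation step could substitute the real column or value back into a generated program.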

First claim

Opening claim text (preview).

What is claimed is:

1. A method for translating natural language descriptions to programs in a domain-specific language for spreadsheet documents, the method comprising: generating a model of a spreadsheet document generated by a spreadsheet program, comprising a column description for each column, and one or more types associated with each column; identifying phrases in the natural language description that match with a column name or a data value in a column based on the model; generating a normalized description for the original description based on the identified phrases; generating a plurality of programs in an underlying domain-specific language from the normalized description by applying a combination of rule-driven translation and type-based program synthesis; and ranking the generated programs in an order that reflects their likelihood.

2. The method recited in claim 1, comprising presenting the programs in association with the spreadsheet document.

3. The method recited in claim 2, comprising presenting an explanation of each of the programs in association with the potential expression.

4. The method recited in claim 3, wherein the explanation comprises a paraphrasing of each of the programs.

5. The method recited in claim 3, wherein the explanation comprises a highlighting of a column whose values are operated on by the potential program.

6. The method recited in claim 3, comprising executing the potential program on the spreadsheet document, wherein the explanation comprises a result of executing the potential program.

7. The method recited in claim 1, wherein the underlying domain-specific language includes filtering operations, reduce operations, arbitrary composition of these operations, and support for operations in specific domains.

8. The method recited in claim 1, wherein the normalization requires removing stop words and replacing phrases that match with column names and data-values by appropriate parameter place-holders.

9. The method recited in claim 1, comprising: determining that the natural language description does not specify a parameter a sub-program is associated with in one of the programs, wherein the parameter comprises a data type, and wherein the spreadsheet document comprises no more than one column of the data type; and associating the parameter with the one column.

10. The method of claim 1, comprising providing feedback on the natural language description, wherein the feedback comprises information relevant to a correction to the natural language description.

11. A system for translating natural language descriptions to programs in spreadsheet documents, the system comprising: a processing unit; and a system memory, wherein the system memory comprises code configured to direct the processing unit to: generate a model of a spreadsheet document generated by a spreadsheet program, comprising a column description for each column, and one or more types associated with each column; generate a plurality of potential programs in a domain-specific language after normalizing the natural language description based on the model, and using a combination of rule-based translation and type-based program synthesis; present a first of the potential programs based on a likelihood the one potential program is associated with the intended program; and present a second of the potential interpretations in response to a request from a spreadsheet user.

12. The system recited in claim 11, comprising code configured to direct the processing unit to identify, in a client interface, stop words that are not included in the model.

13. The system recited in claim 11, comprising code configured to direct the processing unit to present an explanation of each of the potential programs in association with the potential program.

14. The system recited in claim 11, wherein the explanation comprises one of: a natural language paraphrasing of the potential program; or a highlighting of a column whose values are operated on by the potential program.

15. The system recited in claim 11, comprising code configured to direct the processing unit to execute the potential program on the spreadsheet document, wherein the explanation comprises a result of execution of the potential program.

16. The system recited in claim 11, wherein the normalization process comprises one of: removing a stop word; or replacing the phrase that refers to a column name of the spreadsheet document or a data value in the spreadsheet document by parameterized place-holders.

17. The system recited in claim 11, comprising code configured to direct the processing unit to: determine that the natural language description does not specify a parameter a sub-program associated with one of the potential programs, wherein the parameter comprises a data type, and wherein the spreadsheet document comprises no more than one column of the data type; and associate the parameter with the one column.

18. One or more computer-readable storage media for translating natural language descriptions to expressions in spreadsheet documents, the computer-readable storage media comprising code configured to direct a processing unit to: generate a model of a spreadsheet document generated by a spreadsheet program, comprising a column description for each column of the spreadsheet document, and one or more types associated with each column; generate a plurality of potential programs in an underlying domain-specific language from the natural language description by first normalizing the description based on the model, and then using a translation system that uses a combination of rule-based translation and type-based program synthesis; and present a first of the potential interpretations based on a likelihood the one potential interpretation is associated with the intended program; and present a result of the program in association with a selection of one or more columns, wherein the columns are associated with the program, and wherein the result is based on the columns.

19. The one or more computer-readable storage media recited in claim 18, wherein the normalization process comprises one of: removing a stop word; or replacing the phrase that refers to a column name of the spreadsheet document or a data value in the spreadsheet document by parameterized place-holders.

20. The one or more computer-readable storage media recited in claim 18, comprising code configured to direct the processing unit to: determine that the domain-specific language description does not specify a parameter for a sub-program associated with one of the potential programs, wherein the parameter comprises a data type, and wherein the spreadsheet document comprises no more than one column of the data type; and associate the parameter with the one column.
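
Claims 9, 17, and 20 describe filling in a parameter that the description leaves unspecified when only one column has the required data type. A minimal sketch of that idea follows. The function name, the dictionary shape of the partial program, and the type labels are all hypothetical. The claims say "no more than one column"; this sketch treats the zero-column case as unresolved, which is one possible reading rather than the patent's stated behavior.

```python
def fill_missing_column(required_type, column_types):
    """Return the only column carrying `required_type`, or None if absent or ambiguous."""
    candidates = [name for name, types in column_types.items()
                  if required_type in types]
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical spreadsheet model: column name -> set of associated types.
column_types = {
    "name": {"text"},
    "dept": {"text"},
    "salary": {"number", "currency"},
}

# Hypothetical partial program for a description such as "sum": the reduce
# operation needs a numeric column, but the user named none.
program = {"op": "Sum", "column": None, "needs": "number"}
if program["column"] is None:
    program["column"] = fill_missing_column(program["needs"], column_types)

print(program["column"])                             # salary
print(fill_missing_column("text", column_types))     # None: two text columns
```

When two or more columns share the required type, the sketch returns `None` so that a caller could fall back to generating one candidate program per column and ranking them.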

Assignees

Microsoft Corp, Microsoft Technology Licensing LLC

Inventors

Classifications

  • G06F40/40 · Primary

    Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • G06F8/10 · Primary

    Requirements analysis; Specification techniques · CPC title

  • of spreadsheets (form-filling G06F40/174) · CPC title

  • Physics · mapped topic

  • G06F17/28 · Primary

    Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9330090B2 cover?
A system and method to translate natural language descriptions to programs in a domain-specific language for spreadsheets. The method includes generating a model of a spreadsheet. The model includes a column description for each column, and one or more types associated with each column. The method also includes normalizing the description by removing stop words, and replacing parts that match c…
Who is the assignee on this patent?
Microsoft Corp, Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/40. Mapped technology areas include Physics.
When was this patent published?
Publication date May 3, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).