US10380259B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10380259-B2 |
| Application number | US-201715601016-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 22, 2017 |
| Priority date | May 22, 2017 |
| Publication date | Aug 13, 2019 |
| Grant date | Aug 13, 2019 |
Official abstract text for this publication.
Mechanisms are provided to perform embedding of content of a natural language document. The mechanisms receive a document data object of an electronic document and analyze a structure of the electronic document to identify one or more structural document elements that have a relationship with the document data object. A dependency data structure is generated, representing the electronic document, where edges define relationships between document elements and at least one edge represents at least one relationship between the one or more structural document elements and the document data object. The mechanisms embed the document data object based on the at least one relationship to thereby represent the document data object as a vector data structure. The mechanisms perform natural language processing on the portion of natural language content based on the vector data structure. The one or more structural document elements are non-local non-contiguous with the document data object.
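The flow described in the abstract (find related structural elements, record them as edges, then embed the object using those edges) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the `Edge` class, the relation names, the sample text, and especially `bag_vector`, a hashed bag-of-words that stands in for the trained embedding the patent describes. Averaging is likewise just one simple way to let non-contiguous elements such as titles influence a sentence's vector.

```python
import zlib
from dataclasses import dataclass

DIM = 8  # toy vector size (assumption, not from the patent)


@dataclass(frozen=True)
class Edge:
    source: str    # id of the document data object (e.g. a sentence)
    relation: str  # hypothetical relation name, e.g. "has_section_title"
    target: str    # id of the related structural document element


def bag_vector(text: str) -> list[float]:
    """Toy stand-in for a learned encoder: a hashed bag of words."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash()
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    return vec


def embed_with_structure(obj_id: str, elements: dict[str, str], edges: list[Edge]) -> list[float]:
    """Average the object's own vector with those of its related structural elements."""
    related = [e.target for e in edges if e.source == obj_id]
    vectors = [bag_vector(elements[obj_id])] + [bag_vector(elements[t]) for t in related]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical document: the title and section title are not adjacent to the sentence.
elements = {
    "title": "Managing type 2 diabetes",
    "sec2": "Side effects",
    "s17": "Nausea is common in the first weeks.",
}
edges = [
    Edge("s17", "has_document_title", "title"),
    Edge("s17", "has_section_title", "sec2"),
]
vector = embed_with_structure("s17", elements, edges)
```

With no edges, `embed_with_structure` falls back to the plain sentence vector, so the difference between the two outputs shows how much the structural context contributes in this toy setup.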
Opening claim text (preview).
What is claimed is:
1. A method, in a data processing system comprising a processor and a memory, the memory comprising instructions that are executed by the processor to configure the processor to implement a natural language embedding engine, the method comprising: receiving, by the natural language embedding engine executing on the processor, a document data object of an electronic document; analyzing, by the natural language embedding engine, a structure of the electronic document to identify one or more structural document elements that have a relationship with the document data object; generating, by the natural language embedding engine, a dependency data structure representing the electronic document, wherein edges in the dependency data structure define relationships between document elements, and wherein at least one edge is generated in the dependency data structure to represent at least one relationship between the one or more structural document elements and the document data object; executing, by the natural language embedding engine, an embedding operation on the document data object based on the at least one relationship in the dependency data structure to thereby represent the document data object as a vector data structure; and performing, by a natural language processing engine executing in the data processing system, a natural language processing operation on the document data object based on the vector data structure, wherein the one or more structural document elements comprise one or more structural document elements that are non-local non-contiguous with the document data object, wherein the natural language processing system is a question and answer system, and wherein performing the natural language processing operation on the document data object based on the vector data structure comprises performing, by the question and answer system, a question answering operation based on a received input natural language question, and generating at least one answer to the received input natural language question based on the vector data structure associated with the document data object.
2. The method of claim 1, wherein the document data object is at least one of a natural language text data object comprising a portion of natural language textual content of the electronic document, or a non-natural language text data object representing an image, table, or other portion of non-textual content in the electronic document.
3. The method of claim 1, wherein the document data object comprises a natural language sentence of the electronic document, and wherein the one or more structural document elements comprise at least one of a title of the electronic document or a section title of a section within the electronic document.
4. The method of claim 1, wherein the document data object comprises an image or table in content of the electronic document, and wherein the at least one structural document element comprises a reference to the image or table.
5. The method of claim 1, wherein the one or more structural document elements comprise at least one of: a link to another electronic document, wherein the at least one edge representing at least one relationship between the one or more structural document elements and the document data object comprises an edge representing a relationship between content of the other electronic document and the document data object, or an association of the document data object with data in an external knowledge base, wherein the at least one edge representing at least one relationship between the one or more structural document elements and the document data object comprises an edge representing a relationship between content of the external knowledge base and the document data object.
6. The method of claim 1, wherein analyzing the structure of the electronic document to identify the one or more structural document elements that have a relationship with the document data object comprises applying one or more rules defining dependency relationships between various types of structural document elements and document data objects in content of electronic documents.
7. The method of claim 1, wherein generating the dependency data structure comprises: generating edges as a dependency tuple having a first tuple element identifying a dependent document element, a second tuple element representing a dependency relationship, and a third tuple element representing a document element which depends on the first tuple element; and aggregating, for each document element in the electronic document, dependency triples referencing the document element.
8. The method of claim 1, wherein executing an embedding operation on the document data object based on the at least one relationship in the dependency data structure to thereby represent the document data object as a vector data structure comprises: inputting the document data object into a trained neural network comprising a plurality of embedding encoders and at least one embedding decoder; processing, by the plurality of embedding encoders, the document data object to generate an embedded document data object comprising the vector data structure, wherein each embedding encoder performs an encoding operation on the document data object with respect to a different type of structural document element; and outputting, by the neural network, the embedded document data object to the natural language processing engine.
9. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed in a data processing system, causes the data processing system to: receive, by a natural language embedding engine executing in the data processing system, a document data object of an electronic document; analyze, by the natural language embedding engine, a structure of the electronic document to identify one or more structural document elements that have a relationship with the document data object; generate, by the natural language embedding engine, a dependency data structure representing the electronic document, wherein edges in the dependency data structure define relationships between document elements, and wherein at least one edge is generated in the dependency data structure to represent at least one relationship between the one or more structural document elements and the document data object; execute, by the natural language embedding engine, an embedding operation on the document data object based on the at least one relationship in the dependency data structure to thereby represent the document data object as a vector data structure; and perform, by a natural language processing engine executing in the data processing system, a natural language processing operation on the document data object based on the vector data structure, wherein the one or more structural document elements comprise one or more structural document elements that are non-local non-contiguous with the document data object, wherein the natural language processing system is a question and answer system, and wherein performing the natural language processing operation on the document data object based on the vector data structure comprises performing, by the question and answer system, a question answering operation based on a received input natural language question, and generating at least one answer to the received input natural language question based on the vector data structure associated with the document data object.
10. The computer program product of claim 9, wherein the document data object is at least one of a natural language text data object comprisin
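The dependency tuples and per-element aggregation recited in claim 7 can be pictured with a small sketch. The claim does not fix a concrete representation, so the `DependencyTriple` type, the field names, the relation labels, and the choice to index a triple under both of the elements it references are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class DependencyTriple(NamedTuple):
    first: str     # first tuple element: a document element id
    relation: str  # second tuple element: the dependency relationship
    third: str     # third tuple element: the other document element id


def aggregate_by_element(triples: list[DependencyTriple]) -> dict[str, list[DependencyTriple]]:
    """Map each element id to every triple that references it, in either position."""
    index: dict[str, list[DependencyTriple]] = defaultdict(list)
    for triple in triples:
        index[triple.first].append(triple)
        if triple.third != triple.first:  # avoid double-listing a self-reference
            index[triple.third].append(triple)
    return dict(index)


# Hypothetical element ids and relation names.
triples = [
    DependencyTriple("sentence_17", "has_section_title", "section_2_title"),
    DependencyTriple("sentence_17", "has_document_title", "document_title"),
    DependencyTriple("table_1", "referenced_by", "sentence_17"),
]
by_element = aggregate_by_element(triples)
```

Looking up `by_element["sentence_17"]` then yields all three triples, which is the per-element view an embedding step could consume when it needs every relationship touching one document data object.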