Context-aware knowledge base system

US11250204B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11250204-B2
Application number	US-201715831412-A
Country	US
Kind code	B2
Filing date	Dec 5, 2017
Priority date	Dec 5, 2017
Publication date	Feb 15, 2022
Grant date	Feb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating a context-aware knowledge base is provided. The method may include extracting document object model (DOM) tag elements associated with one or more webpages. The method may further include identifying and extracting webpage data associated with the extracted DOM tags. The method may further include determining a context associated with the identified and extracted webpage data by detecting and extracting resource description framework (RDF) triplets in candidate DOM tag elements. The method may further include ranking the extracted RDF triplets. The method may also include validating one or more RDF triplets associated with the ranked RDF triplets. The method may further include connecting the validated RDF triplets to a knowledge graph associated with a knowledge base of the one or more webpages.
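The first two steps of the abstract (extracting DOM tag elements, then extracting the webpage data tied to them) can be illustrated with a minimal sketch built on Python's standard `html.parser`. This is not the patented implementation; the names `Node`, `DomExtractor`, `extract_dom`, and `VOID_TAGS` are hypothetical and exist only for illustration, and the sketch assumes reasonably well-formed HTML.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional

# Assumption: a small subset of HTML void elements that never hold text.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


@dataclass
class Node:
    """One DOM tag element with its own text and its tree links."""
    tag: str
    parent: Optional[int]                      # index of parent, None for a root
    children: List[int] = field(default_factory=list)
    text: str = ""


class DomExtractor(HTMLParser):
    """Collects tag elements and the text found directly inside each one."""

    def __init__(self) -> None:
        super().__init__()
        self.nodes: List[Node] = []
        self._stack: List[int] = []            # indices of currently open elements

    def handle_starttag(self, tag, attrs):
        parent = self._stack[-1] if self._stack else None
        index = len(self.nodes)
        self.nodes.append(Node(tag=tag, parent=parent))
        if parent is not None:
            self.nodes[parent].children.append(index)
        if tag not in VOID_TAGS:               # void tags are never left open
            self._stack.append(index)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag name, along with
        # any unclosed elements nested inside it.
        for pos in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[pos]].tag == tag:
                del self._stack[pos:]
                break

    def handle_data(self, data):
        # Attach non-blank text to the innermost open element.
        if self._stack and data.strip():
            node = self.nodes[self._stack[-1]]
            node.text = (node.text + " " + data.strip()).strip()


def extract_dom(html: str) -> List[Node]:
    """Return all tag elements in document order."""
    parser = DomExtractor()
    parser.feed(html)
    parser.close()
    return parser.nodes


nodes = extract_dom("<div><h2>Pricing</h2><p>Basic plan<br>costs $10</p></div>")
for i, node in enumerate(nodes):
    print(i, node.tag, node.parent, repr(node.text))
```

Storing a parent index and a child list on every node keeps the parent and sibling relationships available for the later context-determination step without needing a full DOM library.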

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating a context-aware knowledge base, the method comprising: extracting document object model (DOM) tag elements associated with a webpage; identifying and extracting webpage data associated with a first DOM tag element from the extracted DOM tags; determining a context associated with the identified and extracted webpage data for the first DOM tag element, wherein determining the context comprises, detecting and extracting resource description framework (RDF) triplets in candidate DOM tag elements, wherein the candidate DOM tag elements are based on a determined relationship to the first DOM tag element and include parent and sibling DOM tag elements, and wherein detecting and extracting the RDF triplets comprises detecting and extracting the RDF triplets from the candidate DOM tag elements nearest the first DOM tag element and based on an order associated with the determined relationship until text is identified, and ranking the extracted RDF triplets based on a connection between the RDF triplets and the webpage data associated with the first DOM tag element; validating one or more RDF triplets associated with the ranked RDF triplets; and connecting the validated RDF triplets to a knowledge graph associated with a knowledge base of the webpage.

2. The method of claim 1, wherein extracting the DOM tag elements associated with the webpage further comprises: determining a relationship between the extracted DOM tag elements.

3. The method of claim 1, wherein identifying and extracting the webpage data associated with the first DOM tag element from the extracted DOM tags further comprises: extracting text associated with the first DOM tag element.

4. The method of claim 1, wherein ranking the extracted RDF triplets further comprises: determining a confidence score for the extracted RDF triplets, wherein the confidence score represents a level of connection between an extracted subject and an extracted object associated with the extracted RDF triplets.

5. The method of claim 1, wherein validating the one or more RDF triplets associated with the ranked RDF triplets further comprises: generating and setting one or more threshold confidence scores; and enabling a user to edit and validate the one or more RDF triplets associated with the ranked RDF triplets.

6. The method of claim 1, further comprising: tracking changes to the validated RDF triplets.

7. A computer system for generating a context-aware knowledge base, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: extracting document object model (DOM) tag elements associated with a webpage; identifying and extracting webpage data associated with a first DOM tag element from the extracted DOM tags; determining a context associated with the identified and extracted webpage data for the first DOM tag element, wherein determining the context comprises, detecting and extracting resource description framework (RDF) triplets in candidate DOM tag elements, wherein the candidate DOM tag elements are based on a determined relationship to the first DOM tag element and include parent and sibling DOM tag elements, and wherein detecting and extracting the RDF triplets comprises detecting and extracting the RDF triplets from the candidate DOM tag elements nearest the first DOM tag element and based on an order associated with the determined relationship until text is identified, and ranking the extracted RDF triplets based on a connection between the RDF triplets and the webpage data associated with the first DOM tag element; validating one or more RDF triplets associated with the ranked RDF triplets; and connecting the validated RDF triplets to a knowledge graph associated with a knowledge base of the webpage.

8. The computer system of claim 7, wherein extracting the DOM tag elements associated with the webpage further comprises: determining a relationship between the extracted DOM tag elements.

9. The computer system of claim 7, wherein identifying and extracting the webpage data associated with the first DOM tag element from the extracted DOM tags further comprises: extracting text associated with the first DOM tag element.

10. The computer system of claim 7, wherein ranking the extracted RDF triplets further comprises: determining a confidence score for the extracted RDF triplets, wherein the confidence score represents a level of connection between an extracted subject and an extracted object associated with the extracted RDF triplets.

11. The computer system of claim 7, wherein validating the one or more RDF triplets associated with the ranked RDF triplets further comprises: generating and setting one or more threshold confidence scores; and enabling a user to edit and validate the one or more RDF triplets associated with the ranked RDF triplets.

12. The computer system of claim 7, further comprising: tracking changes to the validated RDF triplets.

13. A computer program product for generating a context-aware knowledge base, comprising: one or more computer-readable storage devices and program instructions stored on at least one of the one or more tangible storage devices, the program instructions executable by a processor, the program instructions comprising: program instructions to extract document object model (DOM) tag elements associated with a webpage; program instructions to identify and extract webpage data associated with a first DOM tag element from the extracted DOM tags; program instructions to determine a context associated with the identified and extracted webpage data for the first DOM tag element, wherein determining the context comprises, program instructions to detect and extract resource description framework (RDF) triplets in candidate DOM tag elements, wherein the candidate DOM tag elements are based on a determined relationship to the first DOM tag element and include parent and sibling DOM tag elements, and wherein detecting and extracting the RDF triplets comprises detecting and extracting the RDF triplets from the candidate DOM tag elements nearest the first DOM tag element and based on an order associated with the determined relationship until text is identified, and program instructions to rank the extracted RDF triplets based on a connection between the RDF triplets and the webpage data associated with the first DOM tag element; program instructions to validate one or more RDF triplets associated with the ranked RDF triplets; and program instructions to connect the validated RDF triplets to a knowledge graph associated with a knowledge base of the webpage.

14. The computer program product of claim 13, wherein the program instructions to extract the DOM tag elements associated with the webpage further comprises: program instructions to determine a relationship between the extracted DOM tag elements.

15. The computer program product of claim 13, wherein the program instructions to rank the extracted RDF triplets further comprises: program instructions to determine a confidence score for the extracted RDF triplets, wherein the confidence score represents a level of connection between an extracted subject and an extracted object associated with the extracted RDF triplets.

16. The compute
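One possible reading of the context search, ranking, validation, and graph-connection steps in the claims is sketched below. The claims do not say how candidates are ordered beyond "nearest" first, how the confidence score is computed, or how the graph is stored, so every such choice here is an assumption: siblings are tried closest-first before the parent, the score is a Jaccard word overlap used purely as a stand-in, and the graph is a plain dictionary. All names (`Element`, `nearest_context`, `confidence`, `rank`, `validate`, `connect`, and the `describedBy` predicate) are hypothetical. The user edit-and-validate step of claims 5 and 11 is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Element:
    tag: str
    text: str
    parent: Optional[int]  # index into the element list, None for a root


def nearest_context(elements: List[Element], first: int) -> Optional[int]:
    """Try siblings (closest first), then the parent, until text is found."""
    current: Optional[int] = first
    while current is not None:
        parent = elements[current].parent
        siblings = [i for i, e in enumerate(elements)
                    if e.parent == parent and i != current]
        for i in sorted(siblings, key=lambda i: abs(i - current)):
            if elements[i].text:
                return i
        if parent is not None and elements[parent].text:
            return parent
        current = parent  # no text at this level, so move one level up
    return None


def confidence(subject: str, obj: str) -> float:
    """Stand-in score (assumption): Jaccard overlap of lowercase word sets."""
    a, b = set(subject.lower().split()), set(obj.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def rank(triples: List[Triple]) -> List[Tuple[float, Triple]]:
    """Score each triple and sort from highest to lowest confidence."""
    scored = [(confidence(s, o), (s, p, o)) for s, p, o in triples]
    return sorted(scored, key=lambda item: item[0], reverse=True)


def validate(ranked: List[Tuple[float, Triple]], threshold: float) -> List[Triple]:
    """Keep only triples whose score meets the threshold."""
    return [triple for score, triple in ranked if score >= threshold]


def connect(graph: Dict[str, Set[Tuple[str, str]]], triples: List[Triple]) -> None:
    """Add each triple as a labeled edge from its subject."""
    for s, p, o in triples:
        graph.setdefault(s, set()).add((p, o))


elements = [
    Element("div", "", None),
    Element("h2", "Basic plan", 0),
    Element("p", "The basic plan costs $10", 0),
    Element("span", "", 0),
]
context = nearest_context(elements, first=2)
triples = [
    (elements[context].text, "describedBy", elements[2].text),
    ("Contact", "describedBy", elements[2].text),
]
graph: Dict[str, Set[Tuple[str, str]]] = {}
connect(graph, validate(rank(triples), threshold=0.3))
print(graph)
```

In this example the heading is the nearest sibling with text, so it becomes the context for the paragraph; the unrelated "Contact" triple scores zero and is dropped by the threshold before anything reaches the graph.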

Assignees

IBM

Inventors

Classifications

G06N5/022
Knowledge engineering; Knowledge acquisition · CPC title
G06F16/951
Indexing; Web crawling techniques · CPC title
G06N3/006
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
G06F40/143 · Primary
Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD] · CPC title

Patent family

Related publications grouped by family.

View patent family 66659302

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11250204B2 cover?: A method for generating a context-aware knowledge base is provided. The method may include extracting document object model (DOM) tag elements associated with one or more webpages. The method may further include identifying and extracting webpage data associated with the extracted DOM tags. The method may further include determining a context associated with the identified and extracted webpage…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06F40/143. Mapped technology areas include Physics.
When was this patent published?: Publication date Feb 15, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

System and method for storing and searching data extracted from text documents

Knowledge engine for managing massive complex structured data

Processing associations in knowledge graphs

Automated ontology building
