US12499373B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12499373-B2 |
| Application number | US-202219114491-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 27, 2022 |
| Priority date | Sep 27, 2022 |
| Publication date | Dec 16, 2025 |
| Grant date | Dec 16, 2025 |
Official abstract text for this publication.
Various embodiments of the teachings herein include a method for creating a knowledge graph in the industrial field. An example includes: obtaining unstructured data from a first source in a sub-field of the industrial field, with knowledge annotations; performing machine learning on the unstructured data to generate a first model adapted to extract knowledge; extracting knowledge from second unstructured data provided by the first source based on the first model, without knowledge annotations; obtaining first structured data and first semi-structured data from a second source in a second sub-field; extracting second knowledge from the first structured data; extracting third knowledge from the first semi-structured data; and building a knowledge graph integrating the first and second sub-field based on the first, second, and third knowledge, represented in the form of triples.
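To make the "form of triples" concrete, here is a minimal sketch of how knowledge from structured and semi-structured data might be turned into (subject, predicate, object) triples and pooled into one graph. It is not the patented method. The `Triple` type, the function names, the `has_part` predicate, and the sample records are all hypothetical, and the model-based extraction from unstructured text is left out.

```python
from typing import Any, Iterable, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def triples_from_rows(rows: Iterable[dict[str, Any]], key: str) -> list[Triple]:
    """Structured data: emit one triple per non-key column of each row."""
    triples = []
    for row in rows:
        subject = str(row[key])
        for column, value in row.items():
            if column != key and value is not None:
                triples.append(Triple(subject, column, str(value)))
    return triples


def triples_from_nested(subject: str, node: dict[str, Any]) -> list[Triple]:
    """Semi-structured data: walk a nested dict, linking parents to children."""
    triples = []
    for field, value in node.items():
        if isinstance(value, dict):
            # Hypothetical convention: a nested object becomes a child entity.
            child = f"{subject}/{field}"
            triples.append(Triple(subject, "has_part", child))
            triples.extend(triples_from_nested(child, value))
        elif isinstance(value, list):
            for item in value:
                triples.append(Triple(subject, field, str(item)))
        else:
            triples.append(Triple(subject, field, str(value)))
    return triples


def build_graph(*sources: list[Triple]) -> set[Triple]:
    """Pool triples from every source into one de-duplicated set."""
    return {triple for source in sources for triple in source}


# Made-up sample records, for illustration only.
rows = [{"id": "pump-1", "type": "centrifugal pump", "max_pressure_bar": 16}]
doc = {"manufacturer": "ExampleCo", "motor": {"power_kw": 7.5}}
graph = build_graph(triples_from_rows(rows, "id"), triples_from_nested("pump-1", doc))
```

Storing the graph as a set of triples keeps the merge step trivial: knowledge from any source can be added without a schema change, and exact duplicates collapse automatically.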
Opening claim text (preview).
What is claimed is:

1. A method for creating a knowledge graph in the industrial field, the method comprising: obtaining first unstructured data from a first data source in a first sub-field of the industrial field, wherein the first unstructured data carries knowledge annotations; performing machine learning on the first unstructured data to generate a first model adapted to extract knowledge; extracting first knowledge from second unstructured data provided by the first data source based on the first model, wherein the second unstructured data does not carry knowledge annotations; obtaining first structured data and first semi-structured data from a second data source in a second sub-field of the industrial field; extracting second knowledge from the first structured data; extracting third knowledge from the first semi-structured data; and building a knowledge graph integrating the first sub-field and the second sub-field based on the first knowledge, the second knowledge and the third knowledge, wherein the first knowledge, the second knowledge, and the third knowledge are all represented in the form of triples.

2. The method according to claim 1, wherein the proportion of unstructured data in the first sub-field is greater than a predetermined threshold, and the proportion of unstructured data in the second sub-field is less than the threshold.

3. The method according to claim 2, wherein the first sub-field and the second sub-field belong to a single industrial category.

4. The method according to claim 3, wherein the first sub-field and the second sub-field belong to a single industrial sub-category.

5. The method according to claim 1, further comprising: obtaining second structured data and second semi-structured data from the first data source; extracting fourth knowledge from the second structured data; and extracting fifth knowledge from the second semi-structured data; wherein the building knowledge graph integrating the first sub-field and the second sub-field based on the first knowledge, the second knowledge and the third knowledge comprises: building the knowledge graph based on the first knowledge, the second knowledge, the third knowledge, the fourth knowledge and the fifth knowledge.

6. The method according to claim 5, wherein building the knowledge graph integrating the first sub-field and the second sub-field comprises: building a knowledge graph of the first sub-field based on the first knowledge, the fourth knowledge, and the fifth knowledge; building a knowledge graph of the second sub-field based on the second knowledge and the third knowledge; and combining the knowledge graph of the first sub-field and the knowledge graph of the second sub-field into a knowledge graph of the first sub-field and the second sub-field; wherein comparing attributes of an entity in the knowledge graph of the first sub-field with respective attributes of an entity in the knowledge graph of the second sub-field, determining similarity between the entity in the knowledge graph of the first sub-field and the entity in the knowledge graph of the second sub-field; combining the entity in the knowledge graph of the first sub-field and the entity in the knowledge graph of the second sub-field when the similarity is higher than a preset threshold.

7. An apparatus for creating a knowledge graph in the industrial field, the apparatus comprising: a first obtaining module to obtain first unstructured data from a first data source in a first sub-field of the industrial field, wherein the first unstructured data carries knowledge annotations; a performing module to perform machine learning on the first unstructured data to generate a first model adapted to extract knowledge; a first extracting module to extract first knowledge from second unstructured data provided by the first data source based on the first model, wherein the second unstructured data does not carry knowledge annotations; a second obtaining module to obtain first structured data and first semi-structured data from a second data source in a second sub-field of the industrial field; a second extracting module to extract second knowledge from the first structured data; a third extracting module to extract third knowledge from the first semi-structured data; and a building module to build a knowledge graph integrating the first sub-field and the second sub-field based on the first knowledge, the second knowledge and the third knowledge, wherein the first knowledge, the second knowledge and the third knowledge are all represented in the form of triples.

8. The apparatus according to claim 7, wherein the proportion of unstructured data in the first sub-field is greater than a predetermined threshold, and the proportion of unstructured data in the second sub-field is less than the threshold.

9. The apparatus according to claim 8, wherein the first sub-field and the second sub-field belong to a single industrial category.

10. The apparatus according to claim 9, wherein the first sub-field and the second sub-field belong to a single industrial sub-category.

11. The apparatus according to claim 7, wherein: the first obtaining module is configured to obtain second structured data and second semi-structured data from the first data source; the first extracting module is configured to extract fourth knowledge from the second structured data and the fifth knowledge from the second semi-structured data; and the building module is configured to build the knowledge graph based on the first knowledge, the second knowledge, the third knowledge, the fourth knowledge and the fifth knowledge.

12. The apparatus according to claim 11, wherein the building module is configured to build a knowledge graph of the first sub-field based on the first knowledge, the fourth knowledge and the fifth knowledge; build a knowledge graph of the second sub-field based on the second knowledge and the third knowledge; combine the knowledge graph of the first sub-field and the knowledge graph of the second sub-field into a knowledge graph of the first sub-field and the second sub-field; wherein comparing attributes of an entity in the knowledge graph of the first sub-field with respective attributes of an entity in the knowledge graph of the second sub-field, determining similarity between the entity in the knowledge graph of the first sub-field and the entity in the knowledge graph of the second sub-field; combining the entity in the knowledge graph of the first sub-field and the entity in the knowledge graph of the second sub-field when the similarity is higher than a preset threshold.
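The entity-combining step recited in claims 6 and 12 (compare attributes, determine similarity, combine above a preset threshold) could be sketched roughly as follows. The claims do not specify a similarity measure, so the fraction of matching attribute values used here is an assumption. The function names, the dictionary-based graph representation, and the sample entities are likewise hypothetical.

```python
from typing import Dict, Optional

Entity = Dict[str, str]        # attribute name -> attribute value
Graph = Dict[str, Entity]      # entity name -> its attributes


def attribute_similarity(a: Entity, b: Entity) -> float:
    """Assumed measure: share of attributes (over the union) with equal values."""
    names = set(a) | set(b)
    if not names:
        return 0.0
    matches = sum(1 for n in names if n in a and n in b and a[n] == b[n])
    return matches / len(names)


def combine_graphs(first: Graph, second: Graph, threshold: float) -> Graph:
    """Merge `second` into a copy of `first`, combining sufficiently similar entities.

    Assumes entity names are unique across the two graphs.
    """
    merged: Graph = {name: dict(attrs) for name, attrs in first.items()}
    for name_b, attrs_b in second.items():
        best_name: Optional[str] = None
        best_score = 0.0
        for name_a, attrs_a in first.items():
            score = attribute_similarity(attrs_a, attrs_b)
            if score > best_score:
                best_name, best_score = name_a, score
        if best_name is not None and best_score > threshold:
            # Combine: keep the first graph's values where both define an attribute.
            merged[best_name] = {**attrs_b, **merged[best_name]}
        else:
            merged[name_b] = dict(attrs_b)
    return merged


# Made-up entities, for illustration only.
first_graph = {"pump_A": {"type": "pump", "vendor": "X", "model": "P100"}}
second_graph = {
    "P100_unit": {"type": "pump", "vendor": "X", "model": "P100", "site": "plant-2"},
    "valve_7": {"type": "valve"},
}
combined = combine_graphs(first_graph, second_graph, threshold=0.7)
```

In this example `pump_A` and `P100_unit` agree on three of four attributes, so they are combined into one entity that also gains the `site` attribute, while `valve_7` is carried over unchanged.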