Method and apparatus for generating temporal knowledge graph, device, and medium

US12182724B2 · US · B2

Patent metadata
Publication number: US-12182724-B2
Application number: US-202017025952-A
Country: US
Kind code: B2
Filing date: Sep 18, 2020
Priority date: Jan 15, 2020
Publication date: Dec 31, 2024
Grant date: Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for generating a temporal knowledge graph, a device and a medium. An embodiment comprises: acquiring corpus including time information; performing multivariate data extraction on the corpus, multivariate data including an entity pair, an entity relationship and a target time interval of the entity relationship, the target time interval being used to indicate a valid period of the entity relationship; and generating a temporal knowledge graph based on the entity pair, the entity relationship and the target time interval of the entity relationship.
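
As a rough illustration of the data shape the abstract describes, the sketch below stores each extracted fact as an entity pair, a relationship, and a validity interval, then groups the facts into a graph keyed by subject. The names `TemporalFact`, `build_graph`, and `valid_at` are hypothetical and do not come from the patent. Treating a missing endpoint (`None`) as open-ended is likewise an assumption made only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalFact:
    """One extracted fact: entity pair, relationship, and validity interval."""
    subject: str
    relation: str
    obj: str
    start: Optional[date]  # None means the starting point is unknown
    end: Optional[date]    # None means the end point is unknown


def build_graph(facts):
    """Group facts by subject; each fact acts as a time-stamped edge."""
    graph = defaultdict(list)
    for fact in facts:
        graph[fact.subject].append(fact)
    return dict(graph)


def valid_at(fact, day):
    """Assumption: a missing endpoint is treated as open-ended."""
    after_start = fact.start is None or fact.start <= day
    before_end = fact.end is None or day <= fact.end
    return after_start and before_end


# Illustrative data only; these entities are invented for the example.
facts = [
    TemporalFact("Alice", "works_for", "ExampleCorp", date(2015, 3, 1), date(2018, 6, 30)),
    TemporalFact("Alice", "works_for", "OtherCorp", date(2018, 7, 1), None),
]
graph = build_graph(facts)
current = [f.obj for f in graph["Alice"] if valid_at(f, date(2020, 1, 15))]
print(current)  # ['OtherCorp']
```

Keeping the interval on the edge rather than on the entities lets the same entity pair carry several relationships that hold at different times.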

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for generating a temporal knowledge graph, comprising: acquiring a corpus including time information from a source including a web page; performing multivariate data extraction on the corpus to obtain multivariate data, the multivariate data including an entity pair, an entity relationship and a target time interval of the entity relationship, the target time interval being used to indicate a valid period of the entity relationship; and generating the temporal knowledge graph based on the entity pair, the entity relationship and the target time interval of the entity relationship; wherein extracting the target time interval of the entity relationship comprises: obtaining a plurality of time intervals of the entity relationship through the multivariate data extraction; and performing fusion on the plurality of time intervals to obtain the target time interval, and wherein after the target time interval is obtained by performing fusion on the plurality of time intervals, the method further comprises: determining whether a null value exists at a time starting point and a time end point of the target time interval; in response to determining that the null value exists at the time starting point and the time end point of the target time interval, determining a validity of the null value using a candidate corpus from a source different from a source of the corpus; and in response to determining that the null value is invalid, replacing the null value with a time recognized from the candidate corpus.

2. The method according to claim 1, wherein the performing fusion on the plurality of time intervals to obtain the target time interval comprises: screening the plurality of time intervals according to a confidence level of each time interval in the plurality of time intervals, to obtain screened time intervals; and integrating the screened time intervals in a chronological order, to obtain the target time interval.

3. The method according to claim 2, wherein the screening the plurality of time intervals according to the confidence level of each time interval in the plurality of time intervals comprises: counting, in the corpus, a number of data sources corresponding to the each time interval in the plurality of time intervals; determining the confidence level of the each time interval according to the number; and screening the plurality of time intervals according to the confidence level.

4. The method according to claim 1, wherein before performing multivariate data extraction on the corpus, the method further comprises: screening the corpus according to reliability of the source of the corpus, quality of text content and preset conditions of text topic type.

5. The method according to claim 1, wherein the performing multivariate data extraction on the corpus comprises: performing characteristic extraction on each statement in the corpus to obtain an extracted characteristic of the each statement, by using a pre-trained characteristic extraction model; and classifying and annotating, based on the extracted characteristic of the each statement, a phrase in the each statement to obtain the multivariate data.

6. The method according to claim 5, further comprising: training to obtain a multivariate data extraction model by using a training corpus set and an annotation result of multivariate data of each statement in the training corpus set, to perform the characteristic extraction and the classification and annotation using the multivariate data extraction model.

7. The method according to claim 1, wherein the performing multivariate data extraction on the corpus comprises: analyzing a topic or a text structure of a text in the corpus; and in response to the topic of the text belonging to a preset topic or the text structure belonging to a preset text structure, extracting the entity relationship by using a preset relationship extraction approach.

8. The method according to claim 7, wherein the extracting the entity relationship by using the preset relationship extraction approach comprises: extracting the entity relationship from a statement in the text according to the preset relationship extraction approach, the preset relationship extraction approach referring to a predefined approach for determining an entity relationship based on a knowledge extraction need; and obtaining the entity pair and the target time interval of the entity relationship by performing a characteristic extraction on the statement in the text and by classifying and annotating a word of the statement.

9. The method according to claim 1, wherein after the performing multivariate data extraction on the corpus to obtain the multivariate data extraction, the method further comprises: disambiguating, according to a knowledge extraction need, any argument in the entity pair and the entity relationship, to obtain disambiguated entity pair and disambiguated entity relationship; and fusing the disambiguated entity pair and the disambiguated entity relationship.

10. The method according to claim 1, wherein the acquiring a corpus including time information comprises: obtaining the corpus including the time information by recognizing the time information, wherein the time information includes time recorded in a body text of the corpus, push time of corpus data, update time of the corpus data, or time indirectly acquired based on a corpus source.

11. The method according to claim 1, wherein the multivariate data is in a form of five-tuple data, including respectively a subject, an entity relationship, an object, a relationship validity time starting point, and a relationship failure time end point.

12. An electronic device, comprising: at least one processor; and a storage, wherein the storage stores at least one instruction that, when executed by the at least one processor, causes the at least one processor to perform operations, the operations comprising: acquiring a corpus including time information from a source including a web page; performing multivariate data extraction on the corpus to obtain multivariate data, the multivariate data including an entity pair, an entity relationship and a target time interval of the entity relationship, the target time interval being used to indicate a valid period of the entity relationship; and generating a temporal knowledge graph based on the entity pair, the entity relationship and the target time interval of the entity relationship, wherein extracting the target time interval of the entity relationship comprises: obtaining a plurality of time intervals of the entity relationship through the multivariate data extraction; and performing fusion on the plurality of time intervals to obtain the target time interval, and wherein after the target time interval is obtained by performing fusion on the plurality of time intervals, the operations further comprise: determining whether a null value exists at a time starting point and a time end point of the target time interval; in response to determining that the null value exists at the time starting point and the time end point of the target time interval, determining a validity of the null value using a candidate corpus from a source different from a source of the corpus; and in response to determining that the null value is invalid, replacing the null value with a time recognized from the candidate corpus.

13. The electronic device according to claim 12, wherein the performing fusion on the plurality of time intervals to obtain the target time interval comprises: screening the plurality of time intervals accor
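
The interval handling in claims 1 to 3 can be pictured with a small sketch. It is not the patented implementation, only one reading of the claim language under stated assumptions. Confidence is taken to be the number of distinct sources reporting an interval. Integrating in chronological order is taken to mean keeping the earliest known start and the latest known end. A null endpoint is treated as invalid whenever a separate candidate corpus supplies a time for it. The names `fuse_intervals`, `fill_nulls`, and `min_sources` are hypothetical.

```python
from collections import defaultdict
from datetime import date


def fuse_intervals(observations, min_sources=2):
    """Fuse (source, start, end) observations into one (start, end) interval.

    Assumption: confidence is the count of distinct sources per interval,
    and intervals below `min_sources` are screened out.
    """
    sources = defaultdict(set)
    for source, start, end in observations:
        sources[(start, end)].add(source)
    kept = [iv for iv, seen in sources.items() if len(seen) >= min_sources]
    # Assumption: chronological integration keeps the earliest known start
    # and the latest known end among the retained intervals.
    starts = sorted(s for s, _ in kept if s is not None)
    ends = sorted(e for _, e in kept if e is not None)
    return (starts[0] if starts else None, ends[-1] if ends else None)


def fill_nulls(interval, candidate):
    """Replace null endpoints with times recognized from a candidate corpus.

    `candidate` is a dict with optional "start" and "end" dates that stands
    in for times recognized from a different source (hypothetical shape).
    A null is kept when the candidate corpus offers nothing for it.
    """
    start, end = interval
    if start is None and candidate.get("start") is not None:
        start = candidate["start"]
    if end is None and candidate.get("end") is not None:
        end = candidate["end"]
    return (start, end)


# Illustrative data only; source names and dates are invented.
observations = [
    ("site_a", date(2015, 3, 1), None),
    ("site_b", date(2015, 3, 1), None),
    ("site_c", date(2014, 1, 1), date(2016, 1, 1)),  # one source: screened out
]
fused = fuse_intervals(observations)
filled = fill_nulls(fused, {"end": date(2018, 6, 30)})
print(fused)   # (datetime.date(2015, 3, 1), None)
print(filled)  # (datetime.date(2015, 3, 1), datetime.date(2018, 6, 30))
```

A real system would need a richer validity test for null values than this one. For example, a relationship that is still ongoing has a legitimately empty end point.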

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12182724B2 cover?
A method and apparatus for generating a temporal knowledge graph, a device and a medium. An embodiment comprises: acquiring corpus including time information; performing multivariate data extraction on the corpus, multivariate data including an entity pair, an entity relationship and a target time interval of the entity relationship, the target time interval being used to indicate a valid perio…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F16/367. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 31, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).