Methods and systems for managing chatbots with respect to rare entities

US11397857B2 · US · B2

Patent metadata
Publication number: US-11397857-B2
Application number: US-202016743810-A
Country: US
Kind code: B2
Filing date: Jan 15, 2020
Priority date: Jan 15, 2020
Publication date: Jul 26, 2022
Grant date: Jul 26, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments for managing chatbots are provided. A set of documents is received. A plurality of entities are identified within the set of documents. At least one of the plurality of entities is selected based on a rareness criteria. Contextual data associated with each of the selected at least one of the plurality of entities is identified within the set of documents. At least one question-answer (QA) pair associated with each of the selected at least one of the plurality of entities is generated based on the identified contextual data.
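The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the fixed list of candidate entities stands in for a real named-entity recognizer, the `max_refs` cutoff is an assumed rareness criterion, and the question template and all function names are hypothetical.

```python
import re
from collections import Counter

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def count_references(documents, candidate_entities):
    """Count whole-phrase mentions of each candidate entity across all documents."""
    counts = Counter()
    for doc in documents:
        for entity in candidate_entities:
            pattern = r"\b" + re.escape(entity) + r"\b"
            counts[entity] += len(re.findall(pattern, doc))
    return counts

def select_rare(counts, max_refs):
    """Assumed rareness criterion: mentioned at least once, fewer than max_refs times."""
    return [entity for entity, n in counts.items() if 0 < n < max_refs]

def find_context(documents, entity):
    """Contextual data here is simply every sentence that mentions the entity."""
    return [s for doc in documents for s in split_sentences(doc) if entity in s]

def generate_qa_pairs(documents, candidate_entities, max_refs=3):
    """Run the whole sketch: count, select rare entities, collect context, emit QA pairs."""
    counts = count_references(documents, candidate_entities)
    pairs = []
    for entity in select_rare(counts, max_refs):
        for sentence in find_context(documents, entity):
            # Hypothetical question template; the answer is the context sentence.
            pairs.append((f"What do the documents say about {entity}?", sentence))
    return pairs

# Made-up sample data for illustration only.
documents = [
    "Acme Corp opened an office in Paris. Acme Corp hired staff. Acme Corp grew.",
    "The Zephyr valve regulates pressure. Acme Corp sells it.",
]
candidates = ["Acme Corp", "Zephyr valve"]
qa_pairs = generate_qa_pairs(documents, candidates)
print(qa_pairs)
```

In this toy run only the rarely mentioned entity yields a QA pair; the frequently mentioned one is skipped because it exceeds the assumed cutoff.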

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for managing a chatbot, by a processor, comprising: receiving a set of documents; identifying a plurality of entities within the set of documents, wherein the identifying of the plurality of entities includes determining a number of references to each of the plurality of entities within the set of documents; selecting at least one of the plurality of entities based on a rareness criteria, wherein the selecting of the at least one of the plurality of entities based on a rareness criteria includes: associating a first of the plurality of entities with a second of the plurality of entities based on a fuzzy matching algorithm, wherein the first of the plurality of entities has a first number of references in the set of documents, and the second of the plurality of entities has a second number of references in the set of documents; adding the second number of references to the first number of references to calculate a composite number of references for the first of the plurality of entities; and utilizing the composite number of references to determine if the first of the plurality of entities meets the rareness criteria; identifying contextual data associated with each of the selected at least one of the plurality of entities within the set of documents; and generating at least one question-answer (QA) pair associated with each of the selected at least one of the plurality of entities based on the identified contextual data.

2. The method of claim 1, wherein the selecting of the at least one of the plurality of entities based on the rareness criteria includes selecting those of the plurality of entities for which the number of references within the set of documents is less than a first predetermined threshold and greater than a second predetermined threshold.

3. The method of claim 1, further comprising providing a user interface that allows a user to modify the selecting of the at least one of the plurality of entities, and wherein the identifying of the contextual data associated with the selected at least one of the plurality of entities within the set of documents includes identifying contextual data associated with said modified at least one of the plurality of entities, and the generating of the at least one QA pair includes generating a QA pair associated with each of said modified at least one of the plurality of entities.

4. The method of claim 1, further comprising causing a chatbot system to be trained utilizing the generated at least one QA pair.

5. The method of claim 1, wherein each of the plurality of entities includes at least one of an individual, an object, and a location.

6. A system for managing a chatbot comprising: a processor executing instructions stored in a memory device, wherein the processor: receives a set of documents; identifies a plurality of entities within the set of documents, wherein the identifying of the plurality of entities includes determining a number of references to each of the plurality of entities within the set of documents; selects at least one of the plurality of entities based on a rareness criteria, wherein the selecting of the at least one of the plurality of entities based on a rareness criteria includes: associating a first of the plurality of entities with a second of the plurality of entities based on a fuzzy matching algorithm, wherein the first of the plurality of entities has a first number of references in the set of documents, and the second of the plurality of entities has a second number of references in the set of documents; adding the second number of references to the first number of references to calculate a composite number of references for the first of the plurality of entities; and utilizing the composite number of references to determine if the first of the plurality of entities meets the rareness criteria; identifies contextual data associated with each of the selected at least one of the plurality of entities within the set of documents; and generates at least one question-answer (QA) pair associated with each of the selected at least one of the plurality of entities based on the identified contextual data.

7. The system of claim 6, wherein the selecting of the at least one of the plurality of entities based on the rareness criteria includes selecting those of the plurality of entities for which the number of references within the set of documents is less than a first predetermined threshold and greater than a second predetermined threshold.

8. The system of claim 6, wherein the processor further provides a user interface that allows a user to modify the selecting of the at least one of the plurality of entities, and wherein the identifying of the contextual data associated with the selected at least one of the plurality of entities within the set of documents includes identifying contextual data associated with said modified at least one of the plurality of entities, and the generating of the at least one QA pair includes generating a QA pair associated with each of said modified at least one of the plurality of entities.

9. The system of claim 6, wherein the processor further causes a chatbot system to be trained utilizing the generated at least one QA pair.

10. The system of claim 6, wherein each of the plurality of entities includes at least one of an individual, an object, and a location.

11. A computer program product for managing a chatbot, by a processor, the computer program product embodied on a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that receives a set of documents; an executable portion that identifies a plurality of entities within the set of documents, wherein the identifying of the plurality of entities includes determining a number of references to each of the plurality of entities within the set of documents; an executable portion that selects at least one of the plurality of entities based on a rareness criteria, wherein the selecting of the at least one of the plurality of entities based on a rareness criteria includes: associating a first of the plurality of entities with a second of the plurality of entities based on a fuzzy matching algorithm, wherein the first of the plurality of entities has a first number of references in the set of documents, and the second of the plurality of entities has a second number of references in the set of documents; adding the second number of references to the first number of references to calculate a composite number of references for the first of the plurality of entities; and utilizing the composite number of references to determine if the first of the plurality of entities meets the rareness criteria; an executable portion that identifies contextual data associated with each of the selected at least one of the plurality of entities within the set of documents; and an executable portion that generates at least one question-answer (QA) pair associated with each of the selected at least one of the plurality of entities based on the identified contextual data.

12. The computer program product of claim 11, wherein the selecting of the at least one of the plurality of entities based on the rareness criteria includes selecting those of the plurality of entities for which the number of references within the set of documents is less than a first predetermined threshold and greater than a second predetermined threshold.

13. The computer program product of claim 11, wherein the computer-readable program code portions furthe
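The composite-count step of claim 1 and the two-threshold test of claim 2 can be illustrated with a short sketch. It is an approximation under stated assumptions: the claims do not name a particular fuzzy matching algorithm, so Python's standard-library `difflib.SequenceMatcher` is swapped in here, and the `min_ratio` cutoff, the threshold values, the sample counts, and all function names are hypothetical.

```python
from difflib import SequenceMatcher

def is_fuzzy_match(a, b, min_ratio=0.85):
    """Assumed fuzzy test: case-insensitive similarity ratio at or above min_ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= min_ratio

def composite_counts(counts, min_ratio=0.85):
    """For each entity, add the reference counts of every fuzzily matching entity."""
    composite = {}
    for first, first_n in counts.items():
        total = first_n
        for second, second_n in counts.items():
            if second != first and is_fuzzy_match(first, second, min_ratio):
                total += second_n  # second number of references added to the first
        composite[first] = total
    return composite

def meets_rareness(n, upper, lower):
    """Claim 2 style band: fewer than `upper` references and more than `lower`."""
    return lower < n < upper

# Made-up reference counts for illustration only.
counts = {"Zephyr valve": 2, "Zephyr valves": 1, "Acme Corp": 40, "Quorn": 1}
composite = composite_counts(counts)
selected = [e for e, n in composite.items() if meets_rareness(n, upper=10, lower=1)]
print(composite)
print(selected)
```

With these sample values, the two spelling variants pool their counts, so a variant with a single mention can still clear the lower threshold, while an entity mentioned only once with no near match does not.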

Assignees

Inventors

Classifications

  • using statistical methods · CPC title

  • Discourse or dialogue representation · CPC title

  • Lexical analysis, e.g. tokenisation or collocates · CPC title

  • Natural language generation · CPC title

  • G06F40/295 · Primary

    Named entity recognition · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F40/295. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 26, 2022 (B2). Legal status and post-grant events are not shown on this page.