Corpus augmentation system
US12299589B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12299589-B2 |
| Application number | US-202117363249-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 30, 2021 |
| Priority date | Jun 30, 2021 |
| Publication date | May 13, 2025 |
| Grant date | May 13, 2025 |
Official abstract text for this publication.
New question and answer (QA) pairs can be automatically discovered from a corpus of data such as online chats and conversations. Newly discovered QA pairs can augment a QA database, which can be used by a computer processor or device, e.g., by a chatbot, an automated machine, and/or another. Existing QA knowledge can be used to learn the structures of QA knowledge distribution in conversations, and new QA knowledge can be automatically learned through the structure of learned QA knowledge distribution in conversations. The structure of learned QA knowledge distribution can be refined by adding more semantics based on labeled data.
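As an illustration of the idea in the abstract, the following minimal Python sketch turns a conversation segment that contains a known QA pair into a sequence of dialog labels, then looks for other segments with the same label sequence. It is not the patented implementation. The names `toy_tagger`, `find_segment`, `to_pattern`, and `match_pattern`, the label set (`QUESTION`, `ACK`, `STATEMENT`), and the sample conversations are all assumptions made for this example; a real system would presumably use a trained dialog-act tagger.

```python
from typing import Callable, List, Optional, Tuple

Conversation = List[str]      # one statement per turn
Pattern = Tuple[str, ...]     # sequence of dialog labels


def toy_tagger(statement: str) -> str:
    """Hypothetical rule-based stand-in for a dialog-label tagger."""
    text = statement.strip().lower()
    if text.endswith("?"):
        return "QUESTION"
    if text in {"ok", "sure", "one moment", "thanks"}:
        return "ACK"
    return "STATEMENT"


def find_segment(conv: Conversation, question: str, answer: str) -> Optional[Conversation]:
    """Return the statements from the known question through its answer,
    including any intermediate statements, or None if they do not occur in order."""
    if question not in conv:
        return None
    start = conv.index(question)
    try:
        end = conv.index(answer, start + 1)
    except ValueError:
        return None
    return conv[start:end + 1]


def to_pattern(segment: Conversation, tagger: Callable[[str], str]) -> Pattern:
    """Transform a segment into its sequence of dialog labels."""
    return tuple(tagger(s) for s in segment)


def match_pattern(conv: Conversation, pattern: Pattern,
                  tagger: Callable[[str], str]) -> List[Conversation]:
    """Find every window of the conversation whose labels equal the pattern."""
    labels = [tagger(s) for s in conv]
    n = len(pattern)
    return [conv[i:i + n]
            for i in range(len(conv) - n + 1)
            if tuple(labels[i:i + n]) == pattern]


# Invented sample data: one conversation with a known QA pair, one without.
seed_conv = ["How do I reset my password?", "one moment",
             "Use the reset link on the login page."]
segment = find_segment(seed_conv, seed_conv[0], seed_conv[2])
pattern = to_pattern(segment, toy_tagger)

other_conv = ["Hello", "Where is my invoice?", "sure",
              "It is under Billing.", "thanks"]
matches = match_pattern(other_conv, pattern, toy_tagger)

# Treat the first and last statements of a matching window as a candidate QA pair.
candidates = [(m[0], m[-1]) for m in matches]
```

Here the seed segment yields the pattern `("QUESTION", "ACK", "STATEMENT")`, and the one matching window in `other_conv` gives the candidate pair `("Where is my invoice?", "It is under Billing.")`. Taking the first and last statements as the new pair is a simplification chosen for this sketch.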
Opening claim text (preview).
What is claimed is:

1. A system comprising: a processor; and a memory device coupled with the processor; the processor configured to, in performing operations of a chatbot having an on-line interactive chat conversation: receive a question and answer pair including a question and a corresponding answer; search a corpus of conversations to find first conversation segments containing the question and answer pair, the first conversation segments containing statements including the question and the corresponding answer, and intermediate statements in-between the question and the corresponding answer; tag the statements in the first conversation segments with dialog labels, the first conversation segments transformed into sequences of dialog labels, a sequence of dialog labels representing a question and answer structure pattern of a first conversation segment, wherein question and answer structure patterns are formed respectively corresponding to the first conversation segments; search the corpus of conversations to find second conversation segments having at least one of the question and answer structure patterns; for each of the question and answer structure patterns, receive labels associated with the second conversation segments; based on the received labels, compute effectiveness of each of the question and answer structure patterns; select a question and answer structure pattern meeting an effectiveness threshold; transform the selected question and answer structure pattern into a new question and answer; and conduct on-line chatbot conversation incorporating the new question and answer.

2. The system of claim 1, wherein the processor is further configured to sample the second conversation segments found in the search to reduce the number of the second conversation segments for processing.

3. The system of claim 1, wherein the received labels are manually labeled labels, indicating which of the second conversation segments include statements that follow the question and answer structure patterns.

4. The system of claim 1, wherein the processor is further configured to refine at least one of the question and answer structure patterns by further analyzing at least one of the first conversation segment and the second conversation segment having said at least one of the question and answer structure patterns, and adding an intent associated with a dialog label contained in said at least one of the question and answer structure patterns.

5. The system of claim 4, wherein the dialog label is randomly selected for adding the intent.

6. The system of claim 4, wherein the processor is configured to refine said at least one of the question and answer structure patterns, which is below the effectiveness threshold.

7. The system of claim 6, wherein the refined question and answer structure pattern eliminates at least one labeled conversation segment having said at least one of the question and answer structure patterns.

8. The system of claim 1, wherein the received question and answer pair includes a predefined question and answer pair retrieved from a database storing manually defined question and answer pairs.

9. A computer-implemented method comprising: receiving a question and answer pair including a question and a corresponding answer in performing operations of a chatbot having an on-line interactive chat conversation; searching a corpus of conversations to find first conversation segments containing the question and answer pair, the first conversation segments containing statements including the question and the corresponding answer and intermediate statements in-between the question and the corresponding answer; tagging the statements in the first conversation segments with dialog labels, the first conversation segments transformed into sequences of dialog labels, a sequence of dialog labels representing a question and answer structure pattern of a first conversation segment, wherein question and answer structure patterns are formed respectively corresponding to the first conversation segments; searching the corpus of conversations to find second conversation segments having at least one of the question and answer structure patterns; for each of the question and answer structure patterns, receiving labels associated with the second conversation segments; based on the received labels, computing effectiveness of each of the question and answer structure patterns; selecting a question and answer structure pattern meeting an effectiveness threshold; and transforming the selected question and answer structure pattern into a new question and answer; and conducting on-line chatbot conversation incorporating the new question and answer.

10. The method of claim 9, further including sampling the second conversation segments found in the search to reduce the number of the second conversation segments for processing.

11. The method of claim 9, wherein the received labels are manually labeled labels, indicating which of the second conversation segments include statements that follow the question and answer structure patterns.

12. The method of claim 9, further including refining at least one of the question and answer structure patterns by further analyzing at least one of the first conversation segment and the second conversation segment having said at least one of the question and answer structure patterns, and adding an intent associated with a dialog label contained in said at least one of the question and answer structure patterns.

13. The method of claim 12, wherein the dialog label is randomly selected for adding the intent.

14. The method of claim 12, wherein the refining includes refining said at least one of the question and answer structure patterns, which is below the effectiveness threshold.

15. The method of claim 14, wherein the refined question and answer structure pattern eliminates at least one labeled conversation segment having said at least one of the question and answer structure patterns.

16. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions readable by a device to cause the device to: receive a question and answer pair including a question and a corresponding answer in performing operations of a chatbot having an on-line interactive chat conversation; search a corpus of conversations to find first conversation segments containing the question and answer pair, the conversation segments containing statements including the question and the answer and intermediate statements in-between the question and the corresponding answer; tag the statements in the first conversation segments with dialog labels, the first conversation segments transformed into sequences of dialog labels, a sequence of dialog labels representing a question and answer structure pattern of a first conversation segment, wherein question and answer structure patterns are formed respectively corresponding to the first conversation segments; search the corpus of conversations to find second conversation segments having at least one of the question and answer structure patterns; for each of the question and answer structure patterns, receive labels associated with the second conversation segments; based on the received labels, compute effectiveness of each of the question and answer structure patterns; select a one question and answer structure pattern meeting an effectiveness threshold; transform the selected question and answer structure pattern into a new question and answer; and conduct
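The claims recite sampling the second conversation segments, computing an effectiveness value for each structure pattern from received labels, and selecting patterns that meet a threshold, but the preview does not state how effectiveness is calculated. The sketch below assumes one plausible reading: effectiveness is the fraction of labeled segments that a reviewer marked as genuinely following the pattern. The function names, the boolean label encoding, the fixed random seed, and the 0.7 threshold are hypothetical choices for this example only.

```python
import random
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]   # sequence of dialog labels


def sample_segments(segments: Sequence, k: int, seed: int = 0) -> List:
    """Reduce the number of segments sent for labeling (cf. claims 2 and 10).
    A seeded generator keeps the sample reproducible."""
    if len(segments) <= k:
        return list(segments)
    return random.Random(seed).sample(list(segments), k)


def effectiveness(labels: Sequence[bool]) -> float:
    """ASSUMPTION: effectiveness = share of labeled segments marked True.
    The source does not define the formula."""
    if not labels:
        return 0.0
    return sum(labels) / len(labels)


def select_patterns(labels_by_pattern: Dict[Pattern, Sequence[bool]],
                    threshold: float) -> List[Pattern]:
    """Keep the patterns whose effectiveness meets the threshold."""
    return [p for p, labels in labels_by_pattern.items()
            if effectiveness(labels) >= threshold]


# Invented labels: True means the segment really follows the QA pattern.
labels_by_pattern = {
    ("QUESTION", "ACK", "STATEMENT"): [True, True, True, False],
    ("QUESTION", "STATEMENT"): [True, False, False, False],
}
selected = select_patterns(labels_by_pattern, threshold=0.7)
subset = sample_segments(list(range(100)), k=10)
```

With these invented labels the first pattern scores 0.75 and is selected, while the second scores 0.25 and falls below the threshold. Under claims 6 and 14 a below-threshold pattern would be a candidate for refinement, which this sketch does not attempt.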