System, method and computer program product for assessing the capabilities of a conversation agent via black box testing

US10102846B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10102846-B2
Application number: US-201615338695-A
Country: US
Kind code: B2
Filing date: Oct 31, 2016
Priority date: Oct 31, 2016
Publication date: Oct 16, 2018
Grant date: Oct 16, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A conversational agent capability assessment method, system, and computer program product, includes obtaining data to create at least one scenario for testing a conversational agent, performing a set of tests using a scenario of the at least one scenario created to assess a capability of the conversational agent, and comparing a result of the capability from the set of tests with an expected result of the scenario.
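The three steps in the abstract (create a scenario, run tests with it, compare the result with an expected result) can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the patent: the `Scenario` fields, the keyword-match scoring rule, and the `toy_agent` stand-in are assumptions chosen only to make the flow concrete. The agent is treated as a black box, a function from user text to reply text.

```python
from dataclasses import dataclass
from typing import Callable, List

# Black-box view of an agent: user text in, reply text out (an assumption).
Agent = Callable[[str], str]


@dataclass
class Scenario:
    """Hypothetical scenario: user inputs plus a keyword expected in each reply."""
    name: str
    user_inputs: List[str]
    expected_keywords: List[str]


def run_scenario(agent: Agent, scenario: Scenario) -> float:
    """Send every input to the agent and return the fraction of replies that match."""
    if not scenario.user_inputs:
        raise ValueError("scenario has no user inputs")
    hits = 0
    for text, keyword in zip(scenario.user_inputs, scenario.expected_keywords):
        reply = agent(text)
        if keyword.lower() in reply.lower():
            hits += 1
    return hits / len(scenario.user_inputs)


def assess(agent: Agent, scenario: Scenario, expected_score: float) -> bool:
    """Compare the measured result with the expected result for the scenario."""
    return run_scenario(agent, scenario) >= expected_score


def toy_agent(text: str) -> str:
    """Stand-in agent that only understands questions about opening hours."""
    if "hours" in text.lower():
        return "We are open 9 to 5."
    return "Sorry, I did not understand."


scenario = Scenario(
    name="store-info",
    user_inputs=["What are your hours?", "Where are you located?"],
    expected_keywords=["open", "street"],
)
score = run_scenario(toy_agent, scenario)
print(f"{scenario.name}: score={score}, meets 0.75 target: {assess(toy_agent, scenario, 0.75)}")
```

A real harness would likely replace the keyword check with a richer scoring rule; the point of the sketch is only the obtain, test, compare sequence.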

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented conversational agent capability assessment method, the method comprising: obtaining data to create at least one scenario for testing a conversational agent in a conversational system having a non-deterministic output, the conversational agent being a software based on natural language classifiers; performing a set of tests using a scenario of the at least one scenario created to assess a capability of the conversational agent based on evaluation metrics generated from the set of tests while interacting with the conversational agent, the set of tests comprising testing a plurality of metrics including: a performance of the conversational agent; a linguistic variation; a personality type of the conversation agent; and a cognitive trait of the conversational agent; and comparing a result of the capability from the set of tests with an expected result of the scenario based on results from a different conversational agent, wherein the evaluation metrics are generated while interacting with the conversational system embodied in a cloud-computing environment.

2. The computer-implemented method of claim 1, wherein each of the at least one created scenario comprises a user input to cause an aspect of the conversational agent to be tested.

3. The computer-implemented method of claim 1, wherein the performing the set of tests comprises testing at least one of: a natural language ability of the conversational agent; pattern dialogue flow of the conversational agent; a response to an unexpected user input by the conversational agent; and a knowledge base of the conversational agent.

4. The computer-implemented method of claim 1, wherein the expected result includes at least one of: a capability of a different conversational agent; and a capability of a different version of a same conversational agent being tested.

5. The computer-implemented method of claim 1, wherein the at least one created scenario includes a speech-to-text and a text-to-speech conversion to assess the capability of the conversational agent for both of a textual input conversational agent and a speech input conversational agent.

6. The computer-implemented method of claim 1, further comprising storing the result from the set of tests to be used for a future comparison by the comparing.

7. The computer-implemented method of claim 1, further comprising generating a report of the result of the capability of the conversational agent for a human user.

8. The computer-implemented method of claim 7, further comprising suggesting an action for the human user to modify the conversational agent based on the report of the result of the capability of the conversational agent.

9. The computer-implemented method of claim 1, wherein the performing performs the set of tests by testing at least two of the plurality of testing metrics.

10. A computer program product for conversational agent capability assessment, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform: obtaining data to create at least one scenario for testing a conversational agent in a conversational system having a non-deterministic output, the conversational agent being a software based on natural language classifiers; performing a set of tests using a scenario of the at least one scenario created to assess a capability of the conversational agent based on evaluation metrics generated from the set of tests while interacting with the conversational agent, the set of tests comprising testing a plurality of metrics including: a performance of the conversational agent; a linguistic variation; a personality type of the conversation agent; and a cognitive trait of the conversational agent; and comparing a result of the capability from the set of tests with an expected result of the scenario based on results from a different conversational agent, wherein the evaluation metrics are generated while interacting with the conversational system embodied in a cloud-computing environment.

11. The computer program product of claim 10, wherein each of the at least one created scenario comprises a user input to cause an aspect of the conversational agent to be tested.

12. The computer program product of claim 10, wherein the performing the set of tests comprises testing at least one of: a natural language ability of the conversational agent; pattern dialogue flow of the conversational agent; a response to an unexpected user input by the conversational agent; and a knowledge base of the conversational agent.

13. The computer program product of claim 10, wherein the expected result includes at least one of: a capability of a different conversational agent; and a capability of a different version of a same conversational agent being tested.

14. The computer program product of claim 10, wherein the at least one created scenario includes a speech-to-text and a text-to-speech conversion to assess the capability of the conversational agent for both of a textual input conversational agent and a speech input conversational agent.

15. The computer program product of claim 10, further comprising storing the result from the set of tests to be used for a future comparison by the comparing.

16. The computer program product of claim 10, further comprising generating a report of the result of the capability of the conversational agent for a human user.

17. The computer program product of claim 16, further comprising suggesting an action for the human user to modify the conversational agent based on the report of the result of the capability of the conversational agent.

18. A conversational agent capability assessment system, said system comprising: a processor; and a memory, the memory storing instructions to cause the processor to perform: obtaining data to create at least one scenario for testing a conversational agent in a conversational system having a non-deterministic output, the conversational agent being a software based on natural language classifiers; performing a set of tests using a scenario of the at least one scenario created to assess a capability of the conversational agent based on evaluation metrics generated from the set of tests while interacting with the conversational agent, the set of tests comprising testing a plurality of metrics including: a performance of the conversational agent; a linguistic variation; a personality type of the conversation agent; and a cognitive trait of the conversational agent; and comparing a result of the capability from the set of tests with an expected result of the scenario based on results from a different conversational agent, wherein the evaluation metrics are generated while interacting with the conversational system embodied in a cloud-computing environment.

19. The system of claim 18, wherein each of the at least one created scenario comprises a user input to cause an aspect of the conversational agent to be tested.
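The independent claims tie the expected result to a different conversational agent and note that the system's output is non-deterministic. One way to picture that, as a sketch under stated assumptions rather than the patented implementation, is to run each linguistic variation several times, aggregate metrics, and compare the agent under test with a baseline agent. The function names, the keyword check, the two metrics (`pass_rate` for linguistic variation, `mean_latency_s` for performance), and the `tolerance` parameter are all hypothetical; personality type and cognitive trait are not modelled here.

```python
import statistics
import time
from typing import Callable, Dict, List

# Black-box view of an agent: user text in, reply text out (an assumption).
Agent = Callable[[str], str]


def evaluate(agent: Agent, variants: List[str], expected_keyword: str,
             trials: int = 5) -> Dict[str, float]:
    """Repeat every phrasing `trials` times, since replies may differ between runs."""
    passes: List[bool] = []
    latencies: List[float] = []
    for _ in range(trials):
        for text in variants:
            start = time.perf_counter()
            reply = agent(text)
            latencies.append(time.perf_counter() - start)
            passes.append(expected_keyword in reply.lower())
    return {
        "pass_rate": sum(passes) / len(passes),
        "mean_latency_s": statistics.mean(latencies),
    }


def meets_baseline(candidate: Dict[str, float], baseline: Dict[str, float],
                   tolerance: float = 0.05) -> bool:
    """Expected result comes from another agent: candidate must be within tolerance."""
    return candidate["pass_rate"] + tolerance >= baseline["pass_rate"]


def baseline_agent(text: str) -> str:
    """Stand-in for a different agent (or an earlier version) used as the reference."""
    return "I can help you with a refund."


def candidate_agent(text: str) -> str:
    """Stand-in agent under test; it misses the 'money back' phrasing."""
    if "refund" in text.lower():
        return "Starting your refund now."
    return "Could you rephrase that?"


# Linguistic variations of one user intent.
variants = ["I want a refund", "Can I get my money back?", "refund please"]
baseline_metrics = evaluate(baseline_agent, variants, "refund")
candidate_metrics = evaluate(candidate_agent, variants, "refund")
print(candidate_metrics["pass_rate"], meets_baseline(candidate_metrics, baseline_metrics))
```

The metric dictionaries could also be stored and reused as the expected result in a later run, in the spirit of dependent claims 6 and 15.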

Assignees

Inventors

Classifications

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • by checking the correct order of processing (G06F11/08 - G06F11/26 take precedence; monitoring patterns of pulse trains H03K5/19) · CPC title

  • using natural language modelling · CPC title

  • for comparison or discrimination · CPC title

  • Speech synthesis; Text to speech systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10102846B2 cover?
A conversational agent capability assessment method, system, and computer program product, includes obtaining data to create at least one scenario for testing a conversational agent, performing a set of tests using a scenario of the at least one scenario created to assess a capability of the conversational agent, and comparing a result of the capability from the set of tests with an expected re…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L15/01. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 16, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).