System and method for ground truth evaluation

US10599997B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10599997-B2
Application number  US-201615234767-A
Country             US
Kind code           B2
Filing date         Aug 11, 2016
Priority date       Aug 11, 2016
Publication date    Mar 24, 2020
Grant date          Mar 24, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for ground truth generation includes providing training questions to a machine learning system executing on a computer. The machine learning system generates candidate answers to each of the training questions. The method also includes providing the candidate answers to a plurality of subject matter experts for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers. The method further includes analyzing each of the candidate answers with respect to a plurality of scoring features, wherein each of the scoring features is indicative of quality of the candidate answer. The method yet further includes generating a ground truth metric value that indicates a measure of agreement between the subject matter experts relative to a measure of agreement between results of the analyzing.
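
The first claim defines the ground truth metric as a ratio of the average difference of subject matter expert vectors to the average difference of subject matter expert vectors and candidate answer vectors. A minimal Python sketch of that ratio follows, under several assumptions that the text on this page does not confirm: each SME vector holds one SME's relevance scores across the candidate answers, the candidate answer vector holds the system relevance scores for the same answers, and "difference" is taken to be Euclidean distance. The function name `ground_truth_metric` and its parameters are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean


def ground_truth_metric(sme_vectors, system_vector):
    """Mean SME-to-SME distance divided by mean SME-to-system distance.

    Assumption: Euclidean distance stands in for the "difference"
    named in the claim; the page does not specify the distance function.
    """
    if len(sme_vectors) < 2:
        raise ValueError("at least two SME vectors are needed to compare SMEs")

    # Average pairwise distance between every two SME vectors.
    sme_to_sme = mean(dist(a, b) for a, b in combinations(sme_vectors, 2))

    # Average distance between each SME vector and the system's vector.
    sme_to_system = mean(dist(v, system_vector) for v in sme_vectors)

    if sme_to_system == 0:
        raise ValueError("every SME vector equals the system vector; ratio undefined")
    return sme_to_sme / sme_to_system


# Two SMEs scoring two candidate answers, plus the system's scores.
example = ground_truth_metric([[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

The sketch follows the ratio wording literally and raises an error where the ratio would be undefined. How the patent's full description orients or scales the value is not shown on this page.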

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: providing training questions to a machine learning system executing on a computer; producing, by the machine learning system: candidate answers to each of the training questions; and a system relevance score for each of the candidate answers; providing the candidate answers to a plurality of subject matter experts (SMEs) for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers; analyzing each of the candidate answers with respect to a plurality of scoring features by: generating, for each of the scoring features, a vector indicative of value of each of the scoring features in determining the quality of the candidate answer; and determining a distance between each two of the vectors, wherein each of the scoring features is indicative of quality of the candidate answer; and generating a ground truth metric value that indicates a measure of agreement between the subject matter experts relative to a measure of agreement between results of the analyzing, the ground truth metric being a ratio of average difference of subject matter expert vectors to average difference of subject matter expert vectors and candidate answer vectors.

2. The method of claim 1, further comprising providing the ground truth metric to the SMEs, wherein the ground truth metric guides the SMEs as to quality of the evaluation of the candidate answers.

3. The method of claim 1, further comprising excluding from the generating of the ground truth metric each subject matter expert vector that differs from a candidate answer vector by less than a predetermined amount.

4. The method of claim 1, further comprising normalizing the system relevance scores to the SME relevance scores.

5. The method of claim 1, wherein value of the ground truth metric increases with agreement between the SMEs and decreases with agreement between the SMEs and the candidate answers.

6. A system comprising: a machine learning system executed by a computer; a processor; and a memory coupled to the processor, the memory encoded with instructions that when executed cause the processor to provide a training system for training the machine learning system, the training system configured to: provide training questions to the machine learning system; retrieve, from the machine learning system: candidate answers to each of the training questions; and a system relevance score for each of the candidate answers; provide the candidate answers to a plurality of subject matter experts (SMEs) for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers; analyze each of the candidate answers with respect to a plurality of scoring features by: generating, for each of the scoring features, a vector indicative of value of each of the scoring features in determining the quality of the candidate answer; and determining a distance between each two of the vectors, wherein each of the scoring features is indicative of quality of the candidate answer; and generate a ground truth metric value that indicates a measure of agreement between the SMEs relative to a measure of agreement between results of the analyzing, the ground truth metric being a ratio of average difference of subject matter expert vectors to average difference of subject matter expert vectors and candidate answer vectors.

7. The system of claim 6, wherein the training system is configured to provide the ground truth metric to the SMEs, wherein the ground truth metric guides the SMEs as to quality of the evaluation of the candidate answers.

8. The system of claim 6, wherein the training system is configured to exclude from generation of the ground truth metric each subject matter expert vector that differs from a candidate answer vector by less than a predetermined amount.

9. The system of claim 6, wherein the training system is configured to normalize the system relevance scores to the SME relevance scores.

10. The system of claim 6, wherein value of the ground truth metric increases with agreement between the SMEs and decreases with agreement between the SMEs and the candidate answers.

11. A computer program product for training a machine learning system, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: provide training questions to the machine learning system; retrieve, from the machine learning system: candidate answers to each of the training questions; and a system relevance score for each of the candidate answers; provide the candidate answers to a plurality of subject matter experts (SMEs) for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers; analyze each of the candidate answers with respect to a plurality of scoring features, wherein each of the scoring features is indicative of quality of the candidate answer; normalize the system relevance scores to the SME relevance scores; for each of the scoring features, generate a vector indicative of value of each of the scoring features in determining the quality of the candidate answer; determine a distance between each two of the vectors; and generate a ground truth metric value that indicates a measure of agreement between the SMEs relative to a measure of agreement between results of the analyzing, the ground truth metric being a ratio of average difference of subject matter expert vectors to average difference of subject matter expert vectors and candidate answer vectors.

12. The computer program product of claim 11, wherein the program instructions are executable by the computer to cause the computer to provide the ground truth metric to the SMEs, wherein the ground truth metric guides the SMEs with information indicative of quality of the evaluation.

13. The computer program product of claim 11, wherein the program instructions are executable by the computer to cause the computer to exclude from generation of the ground truth metric each subject matter expert vector that differs from a candidate answer vector by less than a predetermined amount.
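
The dependent claims add two preprocessing steps: normalizing the system relevance scores to the SME relevance scores, and excluding SME vectors that differ from a candidate answer vector by less than a predetermined amount. The sketch below illustrates one possible reading of each step. It assumes min-max rescaling onto the SME scoring range for normalization and Euclidean distance for the exclusion test; the claims name neither technique. The helper names `normalize_to_sme_scale` and `exclude_close_vectors`, and the 1-to-5 SME scale in the example, are hypothetical.

```python
from math import dist


def normalize_to_sme_scale(system_scores, sme_min, sme_max):
    """Rescale system relevance scores onto the SME scoring range.

    Assumption: min-max rescaling; the claims only say the scores are
    normalized "to the SME relevance scores".
    """
    lo, hi = min(system_scores), max(system_scores)
    if hi == lo:
        # All system scores are identical; map them to the middle of the range.
        midpoint = (sme_min + sme_max) / 2
        return [midpoint for _ in system_scores]
    span = sme_max - sme_min
    return [sme_min + (s - lo) / (hi - lo) * span for s in system_scores]


def exclude_close_vectors(sme_vectors, system_vector, min_distance):
    """Keep only SME vectors at least `min_distance` from the system vector.

    Assumption: Euclidean distance measures how much a vector "differs".
    """
    return [v for v in sme_vectors if dist(v, system_vector) >= min_distance]


# System confidences in [0, 1] mapped onto a hypothetical 1-to-5 SME scale.
system_vector = normalize_to_sme_scale([0.0, 0.5, 1.0], sme_min=1, sme_max=5)

# The second SME vector matches the system vector exactly, so it is dropped.
kept = exclude_close_vectors(
    [[1.0, 5.0, 3.0], [1.0, 3.0, 5.0]], system_vector, min_distance=0.5
)
```

In this reading, normalization runs first so that SME and system vectors share a scale before any distance is computed, which matches the order in which claim 11 lists the steps.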

Assignees

IBM

Inventors

Classifications

  • G06N20/00 · Primary

    Machine learning · CPC title

  • G06N5/04 · Primary

    Inference or reasoning models · CPC title

  • G06N5/02 · Primary

    Knowledge representation; Symbolic representation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10599997B2 cover?
A method for ground truth generation includes providing training questions to a machine learning system executing on a computer. The machine learning system generates candidate answers to each of the training questions. The method also includes providing the candidate answers to a plurality of subject matter experts for evaluation with respect to the training questions, wherein the evaluation c…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 24, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).