Ground Truth Collection via Browser for Passage-Question Pairings
US-2017154015-A1 · Jun 1, 2017 · US
US10599997B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10599997-B2 |
| Application number | US-201615234767-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 11, 2016 |
| Priority date | Aug 11, 2016 |
| Publication date | Mar 24, 2020 |
| Grant date | Mar 24, 2020 |
Official abstract text for this publication.
A method for ground truth generation includes providing training questions to a machine learning system executing on a computer. The machine learning system generates candidate answers to each of the training questions. The method also includes providing the candidate answers to a plurality of subject matter experts for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers. The method further includes analyzing each of the candidate answers with respect to a plurality of scoring features, wherein each of the scoring features is indicative of quality of the candidate answer. The method yet further includes generating a ground truth metric value that indicates a measure of agreement between the subject matter experts relative to a measure of agreement between results of the analyzing.
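The collection step in the abstract can be pictured as a small data model: each training question holds the candidate answers the system produced, and each candidate answer accumulates one relevance score per SME. The sketch below is only an illustration under that assumption; the class names, field names, and helper functions (`TrainingQuestion`, `CandidateAnswer`, `record_sme_score`, `sme_score_vector`) are hypothetical and do not come from the patent.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CandidateAnswer:
    """A candidate answer plus the relevance scores SMEs assigned to it."""
    text: str
    sme_scores: dict[str, float] = field(default_factory=dict)  # SME id -> score


@dataclass
class TrainingQuestion:
    """A training question and the candidate answers generated for it."""
    text: str
    candidates: list[CandidateAnswer] = field(default_factory=list)


def record_sme_score(question: TrainingQuestion, answer_index: int,
                     sme_id: str, score: float) -> None:
    """Store one SME's relevance score for one candidate answer."""
    question.candidates[answer_index].sme_scores[sme_id] = score


def sme_score_vector(question: TrainingQuestion, sme_id: str) -> list[float]:
    """Return one SME's scores across the candidates, in candidate order."""
    return [c.sme_scores[sme_id] for c in question.candidates]


# Hypothetical example data, not taken from the patent.
q = TrainingQuestion("What is ground truth?",
                     [CandidateAnswer("answer A"), CandidateAnswer("answer B")])
record_sme_score(q, 0, "sme-1", 0.9)
record_sme_score(q, 1, "sme-1", 0.2)
print(sme_score_vector(q, "sme-1"))
```

Keeping scores keyed by SME makes it straightforward to build one score vector per expert, which is the form the agreement comparison needs.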
Opening claim text (preview).
What is claimed is:
1. A method comprising: providing training questions to a machine learning system executing on a computer; producing, by the machine learning system: candidate answers to each of the training questions; and a system relevance score for each of the candidate answers; providing the candidate answers to a plurality of subject matter experts (SMEs) for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers; analyzing each of the candidate answers with respect to a plurality of scoring features by: generating, for each of the scoring features, a vector indicative of value of each of the scoring features in determining the quality of the candidate answer; and determining a distance between each two of the vectors, wherein each of the scoring features is indicative of quality of the candidate answer; and generating a ground truth metric value that indicates a measure of agreement between the subject matter experts relative to a measure of agreement between results of the analyzing, the ground truth metric being a ratio of average difference of subject matter expert vectors to average difference of subject matter expert vectors and candidate answer vectors.
2. The method of claim 1, further comprising providing the ground truth metric to the SMEs, wherein the ground truth metric guides the SMEs as to quality of the evaluation of the candidate answers.
3. The method of claim 1, further comprising excluding from the generating of the ground truth metric each subject matter expert vector that differs from a candidate answer vector by less than a predetermined amount.
4. The method of claim 1, further comprising normalizing the system relevance scores to the SME relevance scores.
5. The method of claim 1, wherein value of the ground truth metric increases with agreement between the SMEs and decreases with agreement between the SMEs and the candidate answers.
6. A system comprising: a machine learning system executed by a computer; a processor; and a memory coupled to the processor, the memory encoded with instructions that when executed cause the processor to provide a training system for training the machine learning system, the training system configured to: provide training questions to the machine learning system; retrieve, from the machine learning system: candidate answers to each of the training questions; and a system relevance score for each of the candidate answers; provide the candidate answers to a plurality of subject matter experts (SMEs) for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers; analyze each of the candidate answers with respect to a plurality of scoring features by: generating, for each of the scoring features, a vector indicative of value of each of the scoring features in determining the quality of the candidate answer; and determining a distance between each two of the vectors, wherein each of the scoring features is indicative of quality of the candidate answer; and generate a ground truth metric value that indicates a measure of agreement between the SMEs relative to a measure of agreement between results of the analyzing, the ground truth metric being a ratio of average difference of subject matter expert vectors to average difference of subject matter expert vectors and candidate answer vectors.
7. The system of claim 6, wherein the training system is configured to provide the ground truth metric to the SMEs, wherein the ground truth metric guides the SMEs as to quality of the evaluation of the candidate answers.
8. The system of claim 6, wherein the training system is configured to exclude from generation of the ground truth metric each subject matter expert vector that differs from a candidate answer vector by less than a predetermined amount.
9. The system of claim 6, wherein the training system is configured to normalize the system relevance scores to the SME relevance scores.
10. The system of claim 6, wherein value of the ground truth metric increases with agreement between the SMEs and decreases with agreement between the SMEs and the candidate answers.
11. A computer program product for training a machine learning system, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to: provide training questions to the machine learning system; retrieve, from the machine learning system: candidate answers to each of the training questions; and a system relevance score for each of the candidate answers; provide the candidate answers to a plurality of subject matter experts (SMEs) for evaluation with respect to the training questions, wherein the evaluation comprises assignment of an SME relevance score to each of the candidate answers; analyze each of the candidate answers with respect to a plurality of scoring features, wherein each of the scoring features is indicative of quality of the candidate answer; normalize the system relevance scores to the SME relevance scores; for each of the scoring features, generate a vector indicative of value of each of the scoring features in determining the quality of the candidate answer; determine a distance between each two of the vectors; and generate a ground truth metric value that indicates a measure of agreement between the SMEs relative to a measure of agreement between results of the analyzing, the ground truth metric being a ratio of average difference of subject matter expert vectors to average difference of subject matter expert vectors and candidate answer vectors.
12. The computer program product of claim 11, wherein the program instructions are executable by the computer to cause the computer to provide the ground truth metric to the SMEs, wherein the ground truth metric guides the SMEs with information indicative of quality of the evaluation.
13. The computer program product of claim 11, wherein the program instructions are executable by the computer to cause the computer to exclude from generation of the ground truth metric each subject matter expert vector that differs from a candidate answer vector by less than a predetermined amount.
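The ratio recited in claim 1 and the exclusion step in claim 3 can be sketched numerically. The code below is a minimal illustration, not the patented implementation, and it rests on several assumptions: "difference" is taken to be Euclidean distance, the averages run over all SME pairs and all SME-candidate pairs, and "a candidate answer vector" in claim 3 is read as "any candidate answer vector". The function names `ground_truth_metric` and `exclude_close_sme_vectors` and the sample vectors are hypothetical. The claims text shown here does not pin down these details, so the full description may define the differences or the orientation of the ratio differently.

```python
from itertools import combinations
from math import dist
from statistics import mean


def exclude_close_sme_vectors(sme_vectors, candidate_vectors, min_distance):
    """Drop SME vectors closer than min_distance to any candidate vector.

    Assumption: claim 3's "a candidate answer vector" means any one of them.
    """
    return [
        s for s in sme_vectors
        if all(dist(s, c) >= min_distance for c in candidate_vectors)
    ]


def ground_truth_metric(sme_vectors, candidate_vectors):
    """Mean SME-to-SME distance divided by mean SME-to-candidate distance.

    Assumption: a literal reading of the ratio in claim 1, with Euclidean
    distance standing in for "difference".
    """
    sme_diffs = [dist(a, b) for a, b in combinations(sme_vectors, 2)]
    cross_diffs = [dist(s, c) for s in sme_vectors for c in candidate_vectors]
    if not sme_diffs or not cross_diffs:
        raise ValueError("need at least two SME vectors and one candidate vector")
    denominator = mean(cross_diffs)
    if denominator == 0:
        raise ValueError("SME and candidate vectors coincide; ratio is undefined")
    return mean(sme_diffs) / denominator


# Hypothetical vectors: two SMEs, one candidate answer.
sme = [(0.0, 0.0), (3.0, 4.0)]
candidates = [(0.0, 0.0)]

# SME-to-SME distance is 5.0; SME-to-candidate distances are 0.0 and 5.0.
metric = ground_truth_metric(sme, candidates)
print(metric)

# With a threshold of 1.0, the SME vector at the candidate's position is dropped.
print(exclude_close_sme_vectors(sme, candidates, 1.0))
```

In practice the exclusion filter would run before the metric is computed, so that SME vectors nearly identical to a candidate vector do not enter either average.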