Complementary evidence identification in natural language inference

US12380343B2 · US · B2

Patent metadata
Field                Value
Publication number   US-12380343-B2
Application number   US-202016989866-A
Country              US
Kind code            B2
Filing date          Aug 10, 2020
Priority date        Aug 10, 2020
Publication date     Aug 5, 2025
Grant date           Aug 5, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus for complementary evidence identification in natural language inference. A given question is obtained and a set of N passages is obtained from a database. A probability is determined, for each passage of the set of N passages, of a corresponding passage being a supportive passage for the given question and the set of N passages is ranked based on the determined probabilities. M passages that are ranked 1 to M of the set of N passages are selected. A set of L passages is selected based on a plurality of scores, each score assigned to a set of candidate passages of the set of N passages, each score being based on the determined probabilities, the selected M passages, and a weighted regulation parameter. The set of L passages is provided to a computerized machine learning system to answer the question based on the set of L passages.
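
The selection flow in the abstract (rank N passages by probability, keep the top M, then grow candidate sets and keep the best-scoring ones until a set of L passages remains) can be sketched as a small beam search. This is a minimal illustration, not the patented implementation: the function name `select_passages`, the `set_score` callable, and the tie-breaking rule are assumptions introduced here, and the actual score function is left to the caller.

```python
from typing import Callable, List, Sequence, Tuple


def select_passages(
    probs: Sequence[float],
    set_score: Callable[[Tuple[int, ...]], float],
    beam_size: int,  # M in the abstract
    set_size: int,   # L in the abstract
) -> Tuple[int, ...]:
    """Return the indices of a size-L passage set found by beam search.

    Hypothetical helper: `set_score` stands in for the patent's score,
    which combines probabilities with a weighted regulation parameter.
    """
    n = len(probs)
    if not 1 <= set_size <= n:
        raise ValueError("set_size must be between 1 and the number of passages")

    # Rank all N passages by their estimated support probability.
    ranked = sorted(range(n), key=lambda i: probs[i], reverse=True)

    # The M top-ranked passages seed the beam as one-element sets.
    beam: List[Tuple[int, ...]] = [(i,) for i in ranked[:beam_size]]

    for _ in range(set_size - 1):
        # Extend every set in the beam by one passage it does not contain.
        candidates = set()
        for subset in beam:
            for i in range(n):
                if i not in subset:
                    candidates.add(tuple(sorted(subset + (i,))))
        # Keep the M best-scoring sets; ties are broken by index order
        # (an assumption made only to keep the sketch deterministic).
        beam = sorted(candidates, key=lambda s: (-set_score(s), s))[:beam_size]

    return max(beam, key=set_score)
```

A caller would pass the per-passage probabilities together with a scoring callable that rewards sets whose members complement one another, so that a lower-ranked passage can still enter the final set when it adds information the top-ranked passages lack.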

First claim

Opening claim text (preview).

What is claimed is:

1. A method for natural language inference, the method comprising: obtaining a given question for input to a hardware processor; obtaining, using the hardware processor, a set of N passages from an electronic database; determining, using the hardware processor, for each passage of the set of N passages, a probability of a corresponding passage being a supportive passage for the given question, including applying a BERT model to estimate the probability of the passage $p_i$ being supporting evidence to the given question $q$, where a concatenation of $q$ and $p_i$ is input into the BERT model, and one or more hidden states of a last layer are used to represent $q$ and $p_i$ in vector space, denoted as $\mathbf{q}$ and $\mathbf{p}_i$, respectively; wherein a fully connected layer $f(\cdot)$ followed by sigmoid activation is added to an end of the BERT model; and a scalar $\mathrm{Prob}(p_i \mid q)$ is output to estimate a relevancy of passage $p_i$ to the given question $q$, wherein each passage of the set of N passages is designated as $p_j$ and the given question is designated as $q$; beam search ranking, using the hardware processor, the set of N passages based on the determined probabilities and a score function; selecting, using the hardware processor, M passages that are ranked 1 to M of the set of N passages, where M is a hyperparameter that corresponds to a beam size; selecting, using the hardware processor, a set of L passages based on a plurality of scores, each score assigned to a set of candidate passages of the set of N passages, each score being based on the determined probabilities, the selected M passages, and a weighted regulation parameter; providing, using the hardware processor, a set of M highest ranked passages of the set of L passages to a computerized machine learning system to answer the question based on the set of L passages; and answering the question with the computerized machine learning system, wherein the score function for finding a best passage is defined by a summation of: a summation of probabilities of a given passage given a specified question; a first hyperparameter multiplied by a cosine of a summation of multiplications of an encoded question vector and an encoded passage vector; and a second hyperparameter multiplied by a summation of losses between a first encoded passage vector and a second encoded passage vector.

2. The method of claim 1, wherein the set of N passages comprises a mixture set of passages $P = P^{+} \cup P^{-}$ with one or more passages $p \in P^{+}$ being relevant to the given question and one or more passages $p \in P^{-}$ being irrelevant to the given question, wherein each passage of the set of N passages is designated as $p_i$.

3. The method of claim 1, wherein: the selected set of L passages is designated as $P_{sel}$ and is similar to the given question $q$ such that $P_{sel}$ has a probability of $\sum_{p_i \in P_{sel}} \Pr(p_i \mid q)$ that is higher than an average probability for an unselected set of passages, wherein each passage of the set of N passages is designated as $p_i$ and the given question is designated as $q$; $P_{sel}$ covers all facts asked by the given question $q$ such that a joint set of passages in $P_{sel}$ has a high similarity to the given question $q$ and maximizes $\cos\left(\sum_{i \in \{i \mid p_i \in P_{sel}\}} \mathbf{p}_i, \mathbf{q}\right)$; and $P_{sel}$ covers passages $p_i$ having diversity based on an average distance between any pair of passages $p_i$ in $P_{sel}$.

4. The method of claim 3, wherein the diversity is attained by maximizing $\sum_{i,j \in \{i,j \mid p_i, p_j \in P_{sel},\, i \neq j\}} l_1(\mathbf{p}_i, \mathbf{p}_j)$ where $l_1(\cdot,\cdot)$ denotes an $L_1$ distance; and wherein coverage is attained by maximizing $\cos\left(\sum_{i \in \{i \mid p_i \in P_{sel}\}} \mathbf{p}_i, \mathbf{q}\right)$.

5. The method of claim 1, further comprising training the BERT model using a supervised training objective function based on a set of labeled training examples where, for each training instance $(q, P)$, $\{\mathbf{p}_i\}^{+} = \{\mathbf{p}_i\}, \forall i \in \{i \mid p_i \in P^{+}\}$; $\{\mathbf{p}_i\}^{-} = \{\mathbf{p}_i\}, \forall i \in \{i \mid p_i \in P^{-}\}$; and $\{\mathbf{p}_i\} = \{\mathbf{p}_i\}^{+} \cup \{\mathbf{p}_i\}^{-}$ are defined.

6. The method of claim 5, wherein the supervised training objective function comprises a sum of a cross-entropy loss corresponding to a relevance condition that is at least one of a measure of a similarity and a measure of a dissimilarity between a question vector and each passage vector; a weighted cosine-embedding loss for a coverage condition that is a measure of a similarity between the question vector and a subspace spanned by one or more selected passage vectors; and a weighted regularization of a diversity condition that is a measure of an overall distance among supporting passage vectors.

7. The method of claim 6, wherein the supervised training objective function is defined as: $\mathcal{L}(\{\mathbf{p}_i\}; \mathbf{q}; y) = \mathcal{L}_{sup}(\{\mathbf{p}_i\}; \mathbf{q}; y) + \alpha \mathcal{L}_{c}(\{\mathbf{p}_i\}; \mathbf{q}; y) + \beta \mathcal{L}_{d}(\{\mathbf{p}_i\}^{+})$
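
One possible reading of the scoring described in claims 1, 3, and 4 is sketched below with NumPy: a sigmoid relevance head, plus a set score that adds the summed probabilities, a weighted cosine between the summed passage vectors and the question vector, and a weighted sum of pairwise L1 distances. Everything named here (`relevance_prob`, `set_score`, the weights `w` and `b`, the use of a single pooled vector per passage, and the default values of `alpha` and `beta`) is an assumption for illustration; the claim wording leaves the exact form open, and no BERT encoder is run in this sketch.

```python
import numpy as np


def relevance_prob(hidden: np.ndarray, w: np.ndarray, b: float) -> float:
    """Sigmoid over a fully connected layer, as in claim 1.

    `hidden` is assumed to be one pooled last-layer vector for the
    concatenated question and passage (hypothetical simplification).
    """
    return float(1.0 / (1.0 + np.exp(-(hidden @ w + b))))


def set_score(
    probs: np.ndarray,          # shape (N,), Prob(p_i | q) per passage
    passage_vecs: np.ndarray,   # shape (N, d), encoded passage vectors
    question_vec: np.ndarray,   # shape (d,), encoded question vector
    selected: tuple,            # indices of the candidate passage set
    alpha: float = 1.0,         # first hyperparameter (coverage weight)
    beta: float = 0.1,          # second hyperparameter (diversity weight)
) -> float:
    idx = list(selected)
    vecs = passage_vecs[idx]

    # Relevance: summed probabilities of the selected passages.
    relevance = float(np.sum(probs[idx]))

    # Coverage: cosine between the summed passage vectors and the question.
    summed = vecs.sum(axis=0)
    denom = np.linalg.norm(summed) * np.linalg.norm(question_vec)
    coverage = float(summed @ question_vec / denom) if denom > 0 else 0.0

    # Diversity: L1 distance summed over all ordered pairs i != j
    # (the diagonal contributes zero, so it can stay in the sum).
    pairwise_l1 = np.abs(vecs[:, None, :] - vecs[None, :, :]).sum(axis=-1)
    diversity = float(pairwise_l1.sum())

    return relevance + alpha * coverage + beta * diversity
```

Under this reading, two near-duplicate passages with high probabilities score lower as a pair than two passages that point in different directions but jointly align with the question, which is the complementary behavior the claims describe.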

Assignees

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Machine learning · CPC title

  • Natural language analysis (semantic analysis of natural language G06F40/30) · CPC title

  • for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12380343B2 cover?
Methods and apparatus for complementary evidence identification in natural language inference. A given question is obtained and a set of N passages is obtained from a database. A probability is determined, for each passage of the set of N passages, of a corresponding passage being a supportive passage for the given question and the set of N passages is ranked based on the determined probabiliti…
Who is the assignee on this patent?
IBM, Rensselaer Polytech Inst
What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 5, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).