Identification of changes between document versions

US11630869B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11630869-B2
Application number	US-202016806438-A
Country	US
Kind code	B2
Filing date	Mar 2, 2020
Priority date	Mar 2, 2020
Publication date	Apr 18, 2023
Grant date	Apr 18, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides a method, including: obtaining at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; identifying, within each of the at least two documents, portions corresponding to groups of text containing a conceptual unit; assigning at least a subset of the identified portions to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for the identified portions in the subset and (ii) tagging the identified portions in the subset with the semantic tag; and determining changes between the at least two documents, wherein the determining comprises (iii) aligning given portions across the at least two documents based upon a relationship between the given portions across the at least two documents, (iv) identifying semantic differences between the aligned portions, and (v) identifying any remaining unaligned portions.
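The alignment and comparison steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Portion` type, the `determine_changes` function, and the whitespace and case normalization used as a stand-in for semantic comparison are all assumptions made for illustration. Portions are assumed to be already identified and tagged, and each tag is assumed to occur at most once per document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portion:
    """A group of text with its semantic tag (hypothetical structure)."""
    tag: str
    text: str


def _normalize(text: str) -> str:
    # Crude stand-in for semantic comparison: ignore case and spacing.
    return " ".join(text.casefold().split())


def determine_changes(old: list[Portion], new: list[Portion]) -> dict[str, list[str]]:
    """Align portions that share a semantic tag, then report differences
    and any portions left unaligned."""
    old_by_tag = {p.tag: p for p in old}
    new_by_tag = {p.tag: p for p in new}
    report: dict[str, list[str]] = {
        "changed": [], "unchanged": [], "removed": [], "added": [],
    }
    for tag, old_portion in old_by_tag.items():
        new_portion = new_by_tag.get(tag)
        if new_portion is None:
            # Unaligned portion that exists only in the older version.
            report["removed"].append(tag)
        elif _normalize(old_portion.text) != _normalize(new_portion.text):
            report["changed"].append(tag)
        else:
            report["unchanged"].append(tag)
    # Unaligned portions that exist only in the newer version.
    report["added"] = [tag for tag in new_by_tag if tag not in old_by_tag]
    return report


v1 = [Portion("payment", "Pay within 30 days."),
      Portion("termination", "Either party may terminate.")]
v2 = [Portion("payment", "Pay within 45 days."),
      Portion("liability", "Liability is capped.")]
report = determine_changes(v1, v2)
```

Here the `payment` portions align and differ, while `termination` and `liability` remain unaligned and are reported as removed and added, which matches the added/removed treatment described in claim 8.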

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: obtaining at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; identifying, within each of the at least two documents, processing units corresponding to contextually-related and positionally-connected groups of textual conceptual units, wherein the identifying comprises assigning each of the processing units to a category type and labelling each of the processing units with the category type; assigning at least a subset of the identified processing units to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for each of the identified processing units in the subset and (ii) tagging each of the identified processing units in the subset with the semantic tag, wherein the semantic tag corresponds to a label of the category type for and identifies a topic of a given of the identified processing units; enriching the at least a subset of the identified processing units with custom attributes, wherein the custom attributes define areas of focus of change that correspond to processing units having changes that are to be identified as differences, wherein the custom attributes are defined in a dictionary form; and determining changes between the at least two documents, wherein the determining comprises (iii) aligning, utilizing rules learned using a classifier, given processing units across the at least two documents based upon a relationship between the given processing units across the at least two documents, (iv) for given processing units across the at least two documents having a custom attribute, identifying a change as a change and for given processing units across the at least two documents not having an associated custom attribute, identifying semantic differences between the aligned processing units, and (v) identifying any remaining unaligned processing units, wherein the aligning comprises identifying given processing units across the at least two documents having a same semantic tag, wherein changes between the at least two documents corresponding to changes not related to a target category are indicated as no change.

2. The method of claim 1, comprising receiving, from a user, a query requesting identification of a change between the at least two documents related to a particular category type of interest.

3. The method of claim 2, wherein the identifying is performed responsive to receiving the user query.

4. The method of claim 2, wherein the generating a semantic tag is based upon terms included in the received query.

5. The method of claim 2, comprising providing, responsive to the determining a change, a natural language identification of a change corresponding to the user query.

6. The method of claim 1, comprising learning alignment rules by generating a decision tree classifier that is trained utilizing supervised data comprising a training set of (i) portions and (ii) a change status of the processing units; and wherein the defined rules are used in aligning the processing units across the at least two documents.

7. The method of claim 1, comprising providing an explanation of the determined changes, the explanation identifying a rule used to determine a change.

8. The method of claim 1, wherein the unaligned processing units are identified as at least one of: added processing units and removed processing units; and wherein the aligned processing units having semantic differences are identified as differences.

9. An apparatus, comprising: at least one processor; and a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising: computer readable program code configured to obtain at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; computer readable program code configured to identify, within each of the at least two documents, processing units corresponding to contextually-related and positionally-connected groups of textual conceptual units, wherein the identifying comprises assigning each of the processing units to a category type and labelling each of the processing units with the category type; computer readable program code configured to assign at least a subset of the identified processing units to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for each of the identified processing units in the subset and (ii) tagging each of the identified processing units in the subset with the semantic tag, wherein the semantic tag corresponds to a label of the category type for and identifies a topic of a given of the identified processing units; computer readable program code configured to enrich the at least a subset of the identified processing units with custom attributes, wherein the custom attributes define areas of focus of change that correspond to processing units having changes that are to be identified as differences, wherein the custom attributes are defined in a dictionary form; and computer readable program code configured to determine changes between the at least two documents, wherein the determining comprises (iii) aligning, utilizing rules learned using a classifier, given processing units across the at least two documents based upon a relationship between the given processing units across the at least two documents, (iv) for given processing units across the at least two documents having a custom attribute, identifying a change as a change and for given processing units across the at least two documents not having an associated custom attribute, identifying semantic differences between the aligned processing units, and (v) identifying any remaining unaligned processing units, wherein the aligning comprises identifying given processing units across the at least two documents having a same semantic tag, wherein changes between the at least two documents corresponding to changes not related to a target category are indicated as no change.

10. A computer program product, comprising: a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code executable by a processor and comprising: computer readable program code configured to obtain at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; computer readable program code configured to identify, within each of the at least two documents, processing units corresponding to contextually-related and positionally-connected groups of textual conceptual units, wherein the identifying comprises assigning each of the processing units to a category type and labelling each of the processing units with the category type; computer readable program code configured to assign at least a subset of the identified processing units to a category type corresponding to a topic of a given portion, wherein the assigning comprises (i) generating a semantic tag for each of the identified processing units in the subset and (ii) tagging each of the identified processing units in the subset with the semantic tag, wherein the semantic tag corresponds to a label of the category type for and identifies a topic of a given of the identified processing units; computer readable program code configured to enrich the at least a subset of the identified processing units with custom attributes, wherein the custom attributes define areas of focu
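Step (iv) of claim 1 and the closing target-category clause can be read as a small decision procedure, sketched below. This is one possible reading rather than the patented implementation: the function name `classify_change`, the shape of the `custom_attributes` dictionary, the returned labels, and the case and whitespace normalization standing in for semantic comparison are all assumptions. The two texts are assumed to come from processing units that are already aligned by semantic tag.

```python
def _normalize(text: str) -> str:
    # Crude stand-in for semantic comparison: ignore case and spacing.
    return " ".join(text.casefold().split())


def classify_change(tag, old_text, new_text, custom_attributes, target_category=None):
    """Classify one aligned pair of processing units.

    custom_attributes is a dictionary keyed by semantic tag (hypothetical
    shape); its values describe the area of focus and are not inspected here.
    """
    # Changes outside the target category are indicated as no change.
    if target_category is not None and tag != target_category:
        return "no change"
    if old_text == new_text:
        return "no change"
    # Units carrying a custom attribute: any edit is reported as a change.
    if tag in custom_attributes:
        return "change"
    # Units without a custom attribute: report only semantic differences.
    if _normalize(old_text) == _normalize(new_text):
        return "no change"
    return "semantic difference"


# Custom attributes defined in dictionary form (illustrative values only).
custom_attributes = {"payment": {"focus": "deadlines"}}

results = [
    classify_change("payment", "Pay in 30 days", "Pay in  30 days", custom_attributes),
    classify_change("notice", "Send Notice", "send notice", custom_attributes),
    classify_change("notice", "Send notice", "Send email", custom_attributes),
    classify_change("notice", "Send notice", "Send email", custom_attributes,
                    target_category="payment"),
]
```

In this sketch a unit with a custom attribute is flagged on any textual edit, even a spacing change, while a unit without one is flagged only when the normalized text differs.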

Assignees

Inventors

Classifications

G06F40/279
Recognition of textual entities · CPC title
G06F16/93 · Primary
Document management systems · CPC title
G06F40/30 · Primary
Semantic analysis · CPC title
G06F16/2474
Sequence data queries, e.g. querying versioned data · CPC title
G06N20/00
Machine learning · CPC title

Patent family

Related publications grouped by family.

View patent family 77463940

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11630869B2 cover?
One embodiment provides a method, including: obtaining at least two documents, wherein one of the at least two documents comprises a revision different than another of the at least two documents; identifying, within each of the at least two documents, portions corresponding to groups of text containing a conceptual unit; assigning at least a subset of the identified portions to a category type …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/93. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 18, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Embedding Natural Language Context in Structured Documents Using Document Anatomy

Classifying and ranking changes between document versions

Electronic content change tracking

Changes to documents are automatically summarized in electronic messages