Classifying and ranking changes between document versions

US10713432B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10713432-B2
Application number: US-201715476640-A
Country: US
Kind code: B2
Filing date: Mar 31, 2017
Priority date: Mar 31, 2017
Publication date: Jul 14, 2020
Grant date: Jul 14, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure generally covers systems and methods that identify and differentiate types of changes made from one version of a document to another version of the document. In particular, the disclosed systems and methods identify changes between different document versions as factual changes or paraphrasing changes or (in some embodiments) as changes of a more specific revision category. Moreover, in some embodiments, the disclosed systems and methods also generate a comparison of the first and second versions that identifies changes as factual changes or paraphrasing changes or (in some embodiments) as changes of a more specific revision category. The disclosed systems and methods, in some embodiments, further rank sentences that include changes made between different document versions or group similar (or the same) type of changes within a comparison of document versions.
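
As a rough illustration of the grouping idea mentioned in the abstract, the sketch below collects identical word-level replacements across sentence pairs so that they could be displayed together. It is not the patented method: the function names (`word_replacements`, `group_repeated_changes`), the whitespace tokenization, the sample sentences, and the use of Python's `difflib` are all assumptions made for this example.

```python
import difflib
from collections import defaultdict


def word_replacements(old_sentence, new_sentence):
    """Yield (old_phrase, new_phrase) for each replaced run of words.

    Hypothetical helper: tokenizes on whitespace only.
    """
    old_words = old_sentence.split()
    new_words = new_sentence.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            yield " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])


def group_repeated_changes(sentence_pairs):
    """Map each (old_phrase, new_phrase) change to the sentence indices where it occurs."""
    groups = defaultdict(list)
    for index, (old, new) in enumerate(sentence_pairs):
        for change in word_replacements(old, new):
            groups[change].append(index)
    return dict(groups)


# Made-up sentence pairs (first version, second version).
sentence_pairs = [
    ("The vendor shall deliver the goods.", "The supplier shall deliver the goods."),
    ("Payment is due in 30 days.", "Payment is due in 45 days."),
    ("The vendor may terminate.", "The supplier may terminate."),
]
grouped = group_repeated_changes(sentence_pairs)
for change, indices in grouped.items():
    print(change, indices)
```

Keying the groups on the exact (old, new) phrase pair only catches identical edits; grouping merely similar changes, as the abstract allows, would need a looser similarity test.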

First claim

Opening claim text (preview).

What is claimed is:

1. In a digital medium environment for generating comparisons of digital document versions, a computer-implemented method of identifying substantive and non-substantive changes between digital document versions comprising: mapping a first plurality of sentences of a first version of a document to a second plurality of sentences of a second version of the document; identifying changes between the first plurality of sentences and the second plurality of sentences as factual changes or paraphrasing changes by: applying a deterministic classification algorithm to a plurality of mapped-sentence combinations; and applying a supervised classification algorithm to unidentified mapped-sentence combinations of the plurality of mapped-sentence combinations, the unidentified mapped-sentence combinations comprising at least one change from the changes that the deterministic classification algorithm did not identify; generating, for display within a graphical user interface, a document comparison of the first version and the second version that identifies the changes as factual changes or paraphrasing changes, the document comparison comprising a factual-change filter for identifying factual changes and a paraphrasing-change filter for identifying paraphrasing changes; and based on receiving an indication of a selection of the factual-change filter or the paraphrasing-change filter, modifying the document comparison to identify factual changes using a first type of markings for changed text or paraphrasing changes using a second type of markings for changed text.

2. The method of claim 1, further comprising determining a relative importance of revised sentences of the second plurality of sentences, wherein the revised sentences include a minimum of one change of the changes.

3. The method of claim 2, wherein determining the relative importance of the revised sentences comprises generating a composite-importance score for each of the revised sentences based on a change-importance score and a sentence-importance score for each of the revised sentences.

4. The method of claim 1, wherein mapping the first plurality of sentences of the first version to the second plurality of sentences of the second version comprises: mapping one or more null sentences of the first plurality of sentences to one or more sentences of the second plurality of sentences; or mapping one or more sentences of the first plurality of sentences to one or more null sentences of the second plurality of sentences.

5. The method of claim 1, further comprising: identifying a plurality of similar changes to a repeated phrase from among the changes between the first plurality of sentences and the second plurality of sentences; and wherein generating the document comparison of the first version and the second version comprises grouping together the plurality of similar changes to the repeated phrase for display within the document comparison.

6. A non-transitory computer readable medium storing instructions thereon that, when executed by at least one processor, cause a computing device to: map a first plurality of sentences of a first version of a document to a second plurality of sentences of a second version of the document; identify changes between the first plurality of sentences and the second plurality of sentences as factual changes or paraphrasing changes by: applying a deterministic classification algorithm to a plurality of mapped-sentence combinations; and applying a supervised classification algorithm to unidentified mapped-sentence combinations of the plurality of mapped-sentence combinations, the unidentified mapped-sentence combinations comprising at least one change from the changes that the deterministic classification algorithm did not identify; generate, for display within a graphical user interface, a document comparison of the first version and the second version that identifies the changes as factual changes or paraphrasing changes, the document comparison comprising a factual-change filter for identifying factual changes and a paraphrasing-change filter for identifying paraphrasing changes; and based on receiving an indication of a selection of the factual-change filter or the paraphrasing-change filter, modify the document comparison to identify factual changes using a first type of markings for changed text or paraphrasing changes using a second type of markings for changed text.

7. The non-transitory computer readable medium of claim 6, further comprising instructions that, when executed by the at least one processor, cause the computing device to apply the deterministic classification algorithm to the plurality of mapped-sentence combinations by: identifying the plurality of mapped-sentence combinations, wherein each of the plurality of mapped-sentence combinations includes one or more sentences of the first plurality of sentences that is mapped to one or more sentences of the second plurality of sentences; and applying the deterministic classification algorithm to the plurality of mapped-sentence combinations to identify at least one change from the changes as part of a revision category of a plurality of revision categories.

8. The non-transitory computer readable medium of claim 7, further comprising instructions that, when executed by the at least one processor, cause the computing device to identify the at least one change from the changes as part of the revision category of the plurality of revision categories by identifying the at least one change as part of: an information-insert category for changes that insert information; an information-delete category for changes that delete information; an information-modify category for changes that modify information; a lexical-paraphrase category for changes that replace a term or phrase with a synonym or that modify a style of terms; and a transformational-paraphrase category for changes that reorder terms or phrases.

9. The non-transitory computer readable medium of claim 6, further comprising instructions that, when executed by the at least one processor, cause the computing device to apply the deterministic classification algorithm to the plurality of mapped-sentence combinations by: assigning a part-of-speech tag of a plurality of part-of-speech tags to each term within the first plurality of sentences and to each term within the second plurality of sentences; and assigning a named-entity tag of a plurality of named-entity tags to terms within the first plurality of sentences and to terms within the second plurality of sentences.

10. The non-transitory computer readable medium of claim 6, further comprising instructions that, when executed by the at least one processor, cause the computing device to apply the deterministic classification algorithm to the plurality of mapped-sentence combinations by: identifying a first part-of-speech sequence representing a first sentence of the first plurality of sentences and a first additional part-of-speech sequence representing a first additional sentence of the second plurality of sentences, wherein the first part-of-speech sequence comprises each part-of-speech tag assigned to each term within the first sentence and the first additional part-of-speech sequence comprises each part-of-speech tag assigned to each term within the first additional sentence; identifying a second part-of-speech sequence representing a second sentence of the first plurality of sentences and a second additional part-of-speech sequence representing a second additional sentence of the second plurality of sentences, wherein the second part-of-speech sequence comprises each part-of-speech tag assigned to each term within the
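
The two-stage classification recited in claims 1 and 6, together with the null-sentence mapping of claim 4, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the claimed implementation: the alignment via `difflib`, the two hand-written rules in `deterministic_label`, the treatment of whole-sentence inserts and deletes as factual, and `stub_model` (a stand-in for a trained supervised classifier) are all hypothetical.

```python
import difflib
import re


def map_sentences(old, new):
    """Pair changed sentences; None stands in for a null sentence."""
    pairs = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged sentences need no classification
        olds, news = old[i1:i2], new[j1:j2]
        # Pad the shorter side with None so inserts and deletes still get a pair.
        for k in range(max(len(olds), len(news))):
            pairs.append((olds[k] if k < len(olds) else None,
                          news[k] if k < len(news) else None))
    return pairs


def numbers(text):
    """Return the numeric tokens in a sentence, in order."""
    return re.findall(r"\d+(?:\.\d+)?", text)


def deterministic_label(old, new):
    """Assumed toy rules; returns None when no rule applies."""
    if old is None or new is None:
        return "factual"        # sentence inserted or deleted (assumption)
    if sorted(old.lower().split()) == sorted(new.lower().split()):
        return "paraphrasing"   # same words, reordered
    if numbers(old) != numbers(new):
        return "factual"        # a number changed
    return None


def classify(pairs, supervised):
    """Apply the rules first; send only unidentified pairs to the fallback."""
    labels = []
    for old, new in pairs:
        label = deterministic_label(old, new)
        if label is None:
            label = supervised(old, new)
        labels.append(label)
    return labels


def stub_model(old, new):
    """Placeholder for a trained supervised classifier (hypothetical)."""
    return "paraphrasing"


old_version = ["Payment is due in 30 days.",
               "The vendor shall deliver the goods.",
               "Notices must be written."]
new_version = ["Payment is due in 45 days.",
               "The supplier shall deliver the goods.",
               "Notices must be written.",
               "This agreement is governed by Ohio law."]

pairs = map_sentences(old_version, new_version)
labels = classify(pairs, stub_model)
for (old, new), label in zip(pairs, labels):
    print(label, "|", old, "->", new)
```

In this sketch only the second pair reaches the fallback, which mirrors the claim language: the supervised stage handles the mapped-sentence combinations that the deterministic stage did not identify.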

Assignees

Inventors

Classifications

  • Entity relationship models · CPC title

  • G06F40/194 · Primary

    Calculation of difference between files · CPC title

  • Version control (for software G06F8/71) · CPC title

  • Semantic analysis · CPC title

  • Tagging; Marking up (details of markup languages G06F40/143); Designating a block; Setting of attributes (style sheets, e.g. eXtensible Stylesheet Language Transformation [XSLT], G06F40/154) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10713432B2 cover?
This disclosure generally covers systems and methods that identify and differentiate types of changes made from one version of a document to another version of the document. In particular, the disclosed systems and methods identify changes between different document versions as factual changes or paraphrasing changes or (in some embodiments) as changes of a more specific revision category.
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/194. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 14, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).