System and methods for analyzing documents
US-9805429-B2 · Oct 31, 2017 · US
| Field | Value |
|---|---|
| Publication number | US-9589051-B2 |
| Application number | US-201314371364-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 1, 2013 |
| Priority date | Feb 1, 2012 |
| Publication date | Mar 7, 2017 |
| Grant date | Mar 7, 2017 |
Official abstract text for this publication.
Described herein are methods and systems for hierarchically mapping, ranking, and labeling data sets automatically. Also provided are methods for browsing and navigating a hierarchically mapped data set, and identifying changes in network structure over time. An example method may involve receiving document data indicating a corpus of documents and references between documents within the corpus. Based on the document data, a network comprising two or more nodes and at least one directed edge may be determined. Also, a hierarchical partition of the documents may be determined based on the directed edges of the network. The hierarchical partition may define a plurality of nested modules, and each module in the plurality of nested modules may be associated with one or more respective documents within the corpus. The method may additionally include causing a graphical display to provide a visual indication of one or more of the plurality of nested modules.
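The first step in the abstract, turning document data into a network of nodes and directed edges, can be illustrated with a minimal Python sketch. Everything named here (`build_citation_network`, the toy identifiers, and the choice to drop references that point outside the corpus) is an assumption made for illustration, not something the publication specifies.

```python
from collections import defaultdict


def build_citation_network(doc_ids, references):
    """Build a directed network from document data.

    doc_ids: iterable of document identifiers (the corpus).
    references: iterable of (citing_id, cited_id) pairs.

    Assumption: references to documents outside the corpus, and
    self-references, are dropped rather than kept as edges.
    """
    nodes = set(doc_ids)
    edges = defaultdict(set)  # citing document -> set of cited documents
    for citing, cited in references:
        if citing in nodes and cited in nodes and citing != cited:
            edges[citing].add(cited)
    return nodes, dict(edges)


# Hypothetical toy corpus: "X" is not in the corpus, so ("D", "X") is dropped.
corpus = ["A", "B", "C", "D"]
refs = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C"), ("D", "X")]
nodes, edges = build_citation_network(corpus, refs)
```

Each key in `edges` is a citing document and each value is the set of documents it cites, so one directed edge corresponds to one reference. Determining the hierarchical partition itself is a separate step and is not attempted in this sketch.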
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method, comprising: receiving document data indicating (i) a corpus of documents and (ii) references between documents within the corpus of documents; determining a network comprising (i) two or more nodes, wherein each node corresponds to a respective document in the corpus of documents, and (ii) at least one directed edge, wherein each directed edge connects two respective nodes, and wherein each directed edge corresponds to a reference between two documents in the corpus of documents; based on the directed edges of the network, determining a hierarchical partition of the documents, wherein the hierarchical partition defines a plurality of nested modules, and wherein each module in the plurality of nested modules is associated with one or more respective documents within the corpus of documents; causing a graphical display to provide a visual indication of one or more of the plurality of nested modules; determining a rank value associated with each respective document of the corpus based on the references between documents of the corpus of documents; determining for each of one or more respective modules of the plurality of nested modules: (i) a sample group of one or more documents of the module based on one or more rank values associated with the one or more documents of the module and (ii) a label for the module based on mutual content within the one or more documents of the sample group; and causing the graphical display to provide a visual indication of the label for each of the one or more respective modules of the plurality of nested modules.
2. The computer-implemented method of claim 1, wherein each reference between documents within the corpus of documents is time-directed, and wherein the determined network is in a form of an acyclic directed graph.
3. The computer-implemented method of claim 1, wherein the rank value associated with each respective document is determined based at least in part on a respective number of references to the respective document and the rank value associated with each of one or more documents referring to the respective document.
4. The computer-implemented method of claim 3, further comprising: receiving document-selection data indicating a selection of a particular document; and causing the graphical display to provide a visual indication of a respective rank value associated with the particular document.
5. The computer-implemented method of claim 3, further comprising: receiving document-selection data indicating a selection of a group of documents; determining a total rank value associated with the group of documents, wherein determining the total rank value comprises summing the rank value associated with each respective document of the group of documents; and causing the graphical display to provide a visual indication of the total rank value associated with the group of documents.
6. The computer-implemented method of claim 3, further comprising: receiving document-selection data indicating an identification of a particular category; identifying one or more documents associated with the particular category; determining an ordered list of the one or more identified documents based on the rank value associated with each of the one or more identified documents; and causing the graphical display to provide a visual indication of the determined ordered list of the one or more identified documents.
7. The computer-implemented method of claim 3, further comprising: receiving document-selection data indicating a selection of a particular module of the plurality of nested modules; determining an ordered list of one or more documents within the module based on the rank value associated with each of the one or more documents within the module; and causing a graphical display to provide a visual indication of the determined ordered list of one or more documents within the module.
8. The computer-implemented method of claim 3, further comprising determining a monetary value associated with each of one or more respective documents of the corpus of documents based on the rank value associated with the document.
9. The computer-implemented method of claim 1, further comprising: receiving document-selection data indicating a selection of a particular document; and causing the graphical display to provide a visual indication of a module comprising the particular document.
10. The computer-implemented method of claim 1, further comprising: receiving document-selection data indicating a selection of a particular module of the plurality of nested modules; and causing the graphical display to provide a visual indication of one or more submodules associated with the particular module.
11. The computer-implemented method of claim 10, wherein a respective size of each of the one or more submodules is proportional to a total respective rank value of documents within the submodule.
12. The computer-implemented method of claim 1, wherein a respective size of each of the one or more modules in the visual indication of the one or more of the plurality of nested modules is proportional to a total respective rank value of documents within the module.
13. The computer-implemented method of claim 1, wherein determining a hierarchical partition of the documents comprises determining a hierarchical partition that minimizes a hierarchical map equation, wherein the hierarchical map equation quantifies an average description length associated with modeling a process of flow on the network.
14. The computer-implemented method of claim 1, wherein the received document data comprises document data associated with a first time period, and wherein the method further comprises: receiving partition data indicating a hierarchical partition of another corpus of documents associated with a second time period, wherein the hierarchical partition defines another plurality of nested modules, and wherein each module in the another plurality of nested modules is associated with one or more respective documents within the another corpus of documents; comparing (i) a difference between (a) a number of references to documents within a particular module associated with the first time period and (b) a number of references to documents within a corresponding module associated with the second time period to (ii) a threshold; and based on the comparison, causing the graphical display to provide a visual indication of the difference and the particular module.
15. The computer-implemented method of claim 1, wherein documents of the corpus of documents comprise one or more of patent documents, scholarly documents, litigation documents, government documents, social media documents, online documents, magazine articles, and books.
16. The computer-implemented method of claim 1, wherein documents of the corpus of documents comprise patent documents, and the method further comprising: associating metadata with one or more corresponding patent documents, wherein the metadata comprises an objective measurement of monetary value; and determining an objective measurement of monetary value corresponding to a given patent document based on the metadata associated with the one or more corresponding patent documents and the rank value associated with the one or more patent documents.
17. The computer-implemented method of claim 1, wherein documents of the corpus of documents comprise patent documents, and the method further comprising: associating metadata with one or more corresponding patent document
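The ranking and labeling steps recited in claims 1, 3, and 5 can be sketched in Python under stated assumptions. The claims do not name a ranking algorithm; the PageRank-style power iteration below is one scheme in which a document's rank depends on how many documents refer to it and on the ranks of those documents, and it is swapped in here purely for illustration. The function names, the damping factor, the sample size, and the whitespace tokenization used to find shared terms are likewise hypothetical.

```python
from collections import Counter


def rank_documents(nodes, edges, damping=0.85, iterations=50):
    """PageRank-style power iteration (an assumed ranking scheme).

    edges maps each citing document to the documents it cites.
    """
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {d: 1.0 / n for d in nodes}
    for _ in range(iterations):
        new = {d: (1.0 - damping) / n for d in nodes}
        for d in nodes:
            cited = edges.get(d, ())
            # A document that cites nothing spreads its rank evenly.
            targets = cited if cited else nodes
            share = damping * rank[d] / len(targets)
            for c in targets:
                new[c] += share
        rank = new
    return rank


def label_module(module_docs, rank, texts, sample_size=2, label_terms=2):
    """Pick the top-ranked documents of a module and label it by shared terms."""
    sample = sorted(module_docs, key=lambda d: rank[d], reverse=True)[:sample_size]
    if not sample:
        return [], []
    # "Mutual content" is approximated as words present in every sampled text.
    shared = set.intersection(*(set(texts[d].lower().split()) for d in sample))
    counts = Counter(
        w for d in sample for w in texts[d].lower().split() if w in shared
    )
    return sample, [w for w, _ in counts.most_common(label_terms)]


def total_rank(docs, rank):
    """Total rank of a group of documents: the sum of their rank values."""
    return sum(rank[d] for d in docs)


# Hypothetical toy data.
edges = {"B": ["A"], "C": ["A", "B"], "D": ["C"]}
texts = {
    "A": "citation network ranking",
    "B": "citation network clustering",
    "C": "protein folding",
    "D": "network",
}
rank = rank_documents({"A", "B", "C", "D"}, edges)
module = ["A", "B", "D"]
sample, label = label_module(module, rank, texts)
total = total_rank(module, rank)
```

A display layer could then size each module in proportion to `total`, in the spirit of claims 11 and 12. A production labeler would likely need stop-word removal and a better notion of shared content than exact word overlap.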
Physics · mapped topic
Office automation; Time management · CPC title
Query execution (filtering based on additional data G06F16/335) · CPC title
using citations (hypermedia G06F16/94) · CPC title
Document management systems · CPC title