Indicators for navigating digital works
US-9158741-B1 · Oct 13, 2015 · US
US9613003B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9613003-B1 |
| Application number | US-201213433028-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 28, 2012 |
| Priority date | Sep 23, 2011 |
| Publication date | Apr 4, 2017 |
| Grant date | Apr 4, 2017 |
Official abstract text for this publication.
In some implementations, text is extracted from a digital work and a plurality of noun phrases are identified. The noun phrases are checked against a network accessible resource, such as an online encyclopedia, that includes a plurality of interlinked article entries. The noun phrases that have corresponding entries in the network accessible resource are included in a set of candidate topics. The candidate topics are ranked based, at least in part, on the links to and from each of the entries corresponding to the candidate topics. Candidate topics below a ranking threshold are removed from the set of candidate topics. Further, term frequency information for each candidate topic in relation to the digital work is compared against term frequency information for the candidate topic in a large corpus of textual works to remove candidate topics within a frequency difference threshold.
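The link-based ranking step in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation. It assumes the candidate topics and the links between their entries have already been collected into a dictionary, it scores each topic by counting incoming plus outgoing links among candidate entries only, and it treats the ranking threshold as a minimum score. The names `link_scores`, `keep_top_ranked`, and `min_score` are hypothetical.

```python
def link_scores(links):
    """Count incoming plus outgoing links between candidate entries.

    `links` maps each candidate topic to the set of topics its entry
    links to. Links to entries outside the candidate set are ignored
    (an assumption; the abstract does not say how they are handled).
    """
    scores = {topic: 0 for topic in links}
    for source, targets in links.items():
        for target in targets:
            if target in scores and target != source:
                scores[source] += 1  # outgoing link from source
                scores[target] += 1  # incoming link to target
    return scores


def keep_top_ranked(links, min_score):
    """Rank candidates by link score and drop those below min_score."""
    scores = link_scores(links)
    ranked = sorted(scores, key=lambda topic: (-scores[topic], topic))
    return [topic for topic in ranked if scores[topic] >= min_score]


# Illustrative data only, not taken from the patent.
example_links = {
    "Moby Dick": {"Captain Ahab", "Whaling", "Herman Melville"},
    "Captain Ahab": {"Moby Dick"},
    "Whaling": {"Moby Dick"},
    "Breakfast": set(),
}
print(keep_top_ranked(example_links, min_score=1))
# → ['Moby Dick', 'Captain Ahab', 'Whaling']
```

In this sketch "Breakfast" has an entry but no links to or from the other candidates, so it falls below the threshold and is removed.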
Opening claim text (preview).
The invention claimed is:
1. One or more non-transitory computer-readable media maintaining instructions executable by one or more processors to perform operations comprising: extracting text from a digital work; identifying a plurality of noun phrases from the text extracted from the digital work; searching a network accessible resource having a plurality of entries to identify a set of one or more entries in the network accessible resource that contain information related to at least one noun phrase of the plurality of noun phrases, wherein each noun phrase corresponding to an entry in the set of one or more entries is a candidate topic in a set of candidate topics; ranking the candidate topics based, at least in part, on at least one of a number of incoming links or a number of outgoing links between each of the entries corresponding to the candidate topics; excluding, from the set of candidate topics, one or more candidate topics ranked below a first threshold; comparing a first term frequency-inverse document frequency (tf-idf) value with a second tf-idf value, wherein the first tf-idf value is determined with respect to the digital work for each candidate topic remaining in the set of candidate topics, and wherein the second tf-idf value is determined for the candidate topics with respect to a corpus of works; excluding, from the set of candidate topics, one or more candidate topics for which a difference between the first tf-idf value and the second tf-idf value is less than a second threshold; generating a digital supplemental information file comprising at least one reference to supplemental information relating to at least one candidate topic remaining in the set of candidate topics; receiving a request for the digital supplemental information file from an electronic device; and transmitting the digital supplemental information file to the electronic device, the digital supplemental information file to cause the digital work to include at least one selectable portion that enables display of the at least one reference to supplemental information and a visual representation of at least a location in the digital work of each occurrence of the at least one candidate topic remaining in the set of candidate topics, wherein the visual representation comprises an object with markings corresponding to each occurrence.
2. The one or more computer-readable media as recited in claim 1, wherein the network accessible resource is at least one of: an online wiki-type site; an online encyclopedia site; an online dictionary site; a reference site; or a crowd-sourced information site.
3. The one or more computer-readable media as recited in claim 1, prior to ranking the candidate topics, the operations further comprising, for each entry in the set of one or more entries performing at least one of: determining if one or more other entries in the set of one or more entries link to the entry; and determining if the entry links to one or more other entries in the set of one or more entries.
4. The one or more computer-readable media as recited in claim 1, wherein the supplemental information comprises at least a location in the digital work of each occurrence of each candidate topic in the set of candidate topics.
5. The one or more computer-readable media as recited in claim 4, wherein the supplemental information further comprises content related to at least one candidate topic obtained from the set of one or more entries in the network accessible resource.
6. A method comprising: under control of one or more processors configured with executable instructions, searching a network accessible resource for at least one entry corresponding to at least one noun phrase obtained from a digital work; identifying the at least one entry; generating a set of candidate topics from the at least one noun phrase corresponding to the at least one entry identified; for at least one candidate topic of the set of candidate topics: comparing a first indication of a frequency of the at least one candidate topic in the digital work with a second indication of a frequency of the at least one candidate topic in a corpus of digital works, and removing the at least one candidate topic from the set of candidate topics based, at least partly, on a difference between the first indication and the second indication being less than a threshold amount; generating a digital supplemental information file comprising at least one reference to supplemental information relating to at least one candidate topic remaining in the set of candidate topics; receiving a request for the digital supplemental information file from an electronic device; and transmitting the digital supplemental information file to the electronic device, the digital supplemental information file to cause the digital work to include at least one selectable portion that enables display of the at least one reference to supplemental information and a visual representation of at least a location in the digital work of each occurrence of the at least one candidate topic remaining in the set of candidate topics wherein the visual representation comprises an object with markings corresponding to each occurrence.
7. The method as recited in claim 6, wherein the at least one entry in the network accessible resource contains information related to the at least one noun phrase.
8. The method as recited in claim 6, wherein the network accessible resource is at least one of: an online wiki-type site; an online encyclopedia site; an online dictionary site; a reference site; or a crowd-sourced information site.
9. The method as recited in claim 6, further comprising, for a plurality of noun phrases obtained from the digital work, identifying a plurality of entries in the network accessible resource, each entry corresponding to at least one noun phrase of the plurality of noun phrases.
10. The method as recited in claim 9, wherein the respective noun phrases are each a respective candidate topic in the set of candidate topics, the method further comprising, for a particular entry of the network accessible resource that corresponds to a particular candidate topic performing at least one of: determining that one or more other entries corresponding to one or more other candidate topics link to the particular entry; and determining that the particular entry links to one or more other entries that correspond to one or more other candidate topics; and ranking the candidate topics in the set of candidate topics based, at least in part, on the links to and from the entries corresponding to the candidate topics.
11. The method as recited in claim 10, wherein ranking the candidate topics is based, at least in part, on a page ranking of each of the entries corresponding to the candidate topics, wherein the page ranking is based, at least in part, on a probability of being directed to a particular entry of the entries corresponding to the candidate topics by selecting a random link in another entry corresponding to the candidate topics.
12. The method as recited in claim 10, wherein ranking the candidate topics is based, at least in part, on at least one of: a number of incoming links or a number of outgoing links between each of the entries corresponding to the candidate topics.
13. The method as recited in claim 10, further comprising, applying a ranking threshold to remove, from the set of candidate topics, one or more candidate topics ranked below the ranking threshold.
14. The method as recited in claim 13, further comprising retaining or rejoining, in the set of candidate topics,
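The frequency comparison recited in claims 1 and 6 can be sketched in Python as follows. This is one plausible reading under stated assumptions, not the claimed method itself: topics are treated as single lowercase tokens, the second value is the mean term frequency across the corpus scaled by the same smoothed idf weight, and the difference is taken as a signed value (work minus corpus). The names `tf`, `idf`, `filter_by_tfidf_gap`, and `min_gap` are hypothetical.

```python
import math


def tf(term, tokens):
    """Relative frequency of `term` in a tokenized document."""
    return tokens.count(term) / len(tokens) if tokens else 0.0


def idf(term, corpus):
    """Smoothed inverse document frequency over a list of token lists."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log((1 + len(corpus)) / (1 + containing)) + 1.0


def filter_by_tfidf_gap(candidates, work, corpus, min_gap):
    """Keep candidates whose tf-idf in the work exceeds the corpus value.

    Candidates whose signed difference is below `min_gap` are excluded,
    which drops terms that are about as common in the work as anywhere.
    """
    kept = []
    for term in candidates:
        weight = idf(term, corpus)
        work_value = tf(term, work) * weight
        mean_tf = sum(tf(term, doc) for doc in corpus) / len(corpus) if corpus else 0.0
        corpus_value = mean_tf * weight
        if work_value - corpus_value >= min_gap:
            kept.append(term)
    return kept


# Illustrative data only, not taken from the patent.
work = ["ahab", "chased", "the", "whale", "and", "ahab", "lost"]
corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["a", "whale", "swam"]]
print(filter_by_tfidf_gap(["ahab", "the", "whale"], work, corpus, min_gap=0.1))
# → ['ahab']
```

Here "the" is more frequent in the corpus than in the work and "whale" is only slightly more frequent in the work, so both fall under the threshold, while "ahab" is distinctive to the work and is retained.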
Marketing; Price estimation or determination; Fundraising · CPC title
Formatting, i.e. changing of presentation of documents (automatic justification G06F40/189; automatic line break hyphenation G06F40/191) · CPC title
Parsing markup language streams (streaming G06F40/149) · CPC title
using geographical or spatial information, e.g. location · CPC title
Presentation of query results · CPC title