Compiling Documents Into A Timeline Per Event

US2018067910A1 · US · A1

Patent metadata
Publication number: US-2018067910-A1
Application number: US-201615256924-A
Country: US
Kind code: A1
Filing date: Sep 6, 2016
Priority date: Sep 6, 2016
Publication date: Mar 8, 2018
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Representative embodiments disclose mechanisms to compile documents into a timeline document that tracks the evolution of a topic over time. Social media documents can be used to identify importance or popularity of linked documents (i.e., documents shared by social media in a post, tweet, etc.). A collection of social media documents is analyzed and used to identify a series of n-grams and a ranked list of linked documents. A subset of the ranked list is selected based upon similarity to the series of n-grams. The subset is then summarized and captured, along with underlying supporting data, into an entry of a timeline document. Related entries in different timeline documents can be linked to create a pivot point that allows a user to jump from one timeline to another. Timeline documents can be made available as part of a search performed by a query system.
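
The first two steps named in the abstract (identifying a series of n-grams and producing a ranked list of linked documents) can be illustrated with a minimal sketch. It is not taken from the patent: the `Post` class, both function names, the use of word bigrams, and ranking by share count are assumptions chosen for illustration, and the abstract does not prescribe any of them.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Post:
    """A social media document (hypothetical shape, not defined by the patent)."""
    text: str
    links: list[str] = field(default_factory=list)


def representative_ngrams(posts, n=2, top=3):
    """Return the `top` most frequent word n-grams across all posts.

    Frequency counting is an assumption; the abstract only says that
    a series of n-grams is identified.
    """
    counts = Counter()
    for post in posts:
        words = post.text.lower().split()
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return [gram for gram, _ in counts.most_common(top)]


def rank_linked_documents(posts):
    """Rank linked URLs by the number of posts that share them.

    Share count is one possible ranking criterion, assumed here.
    """
    shares = Counter(url for post in posts for url in set(post.links))
    return [url for url, _ in shares.most_common()]


posts = [
    Post("city council vote delayed", ["https://example.com/a"]),
    Post("city council vote tonight",
         ["https://example.com/a", "https://example.com/b"]),
    Post("council vote results", ["https://example.com/a"]),
]

ngrams = representative_ngrams(posts)
ranked = rank_linked_documents(posts)
print(ngrams[0])  # ('council', 'vote'), the bigram present in all three posts
print(ranked)     # the URL shared three times comes before the one shared once
```

In this toy data the bigram "council vote" appears in every post, so it leads the n-gram list, and the link shared by all three posts ranks first.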

First claim

Opening claim text (preview).

What is claimed is:

1. A method for capturing evolution of a topic over time as described by a set of documents, comprising: accessing a set of first documents collected over a period of time from at least one social media service; extracting, from the set of first documents, a contextual vector comprising a set of representative n-grams, the set of representative n-grams describing aspects of the first set of documents; identifying a set of second documents linked to by the set of first documents; ranking the set of second documents according to a selected criterion; selecting a subset of the set of second documents based on similarity to at least a portion of the contextual vector; and creating an entry into a timeline document, the entry comprising one or more of: the subset of the set of second documents; a point of view associated with at least a portion of the subset of the set of second documents; a title; a description; and original documents from the first set of documents.

2. The method of claim 1, wherein the contextual vector comprises a topic.

3. The method of claim 1 further comprising calculating the point of view for each document in the subset of the set of second documents.

4. The method of claim 3 wherein calculating the point of view comprises: identifying a sender of a document of the set of first documents; accessing a profile associated with the sender; identifying a point of view associated with the sender based on the profile; and associating the point of view with at least one document of the set of second documents.

5. The method of claim 4, further comprising: identifying a user that has interacted with the document of with the at least one document; identifying a point of view associated with the user; and associating the point of view with the at least one document.

6. The method of claim 1, further comprising: identify an entry point in the timeline document; selecting a second entry point in a target timeline document; calculating a similarity score based on metadata associated with the entry point and metadata associated with the second entry point; and adding a link between the entry point and the second entry point when the similarity score exceeds a threshold.

7. The method of claim 1, wherein selecting a subset of the set of second documents based on similarity to at least a portion of the contextual vector comprises: calculating a selection score for at least K documents of the second set of documents; and selecting as the subset of the set of second documents N documents having the highest selection scores of the second set of documents.

8. The method of claim 1, wherein extracting a contextual vector comprises: clustering the set of first documents according to at least one subject matter; identifying those clusters that have a number of documents over a threshold and, for each cluster so identified: extracting an n-gram for the cluster and storing the n-gram as part of the contextual vector; identifying a set of documents in the identified cluster from a cluster of individuals; clustering the set of documents to identify sub-topics within the set of documents, the clustering defining a set of sub-clusters; identifying sub-clusters of the set of sub-clusters that have a second number of documents over a second threshold and extracting a sub-cluster n-gram for each identified sub-cluster; and storing each sub-cluster n-gram as part of the contextual vector.

9. A computing system comprising: a processor and executable instructions accessible on a machine-readable medium that, when executed, cause the system to perform operations comprising: accessing a set of first documents collected over a period of time from at least one social media service; extracting, from the set of first documents, a contextual vector comprising a set of representative n-grams, the set of representative n-grams describing aspects of the first set of documents; identifying a set of second documents linked to by the set of first documents; ranking the set of second documents according to a selected criterion; selecting the top K documents of the ranked set of second documents as a subset of the set of second documents; for each of the top K documents, calculating a selection score based on similarity to at least a portion of the contextual vector; selecting N documents from the top K documents based on the calculated selection score; and creating an entry into a timeline document, the entry comprising one or more of: at least a portion of the selected N documents; a point of view associated with at least a portion of the selected N documents; a title; a description; and original documents from the first set of documents.

10. The system of claim 9, wherein the contextual vector comprises a topic.

11. The system of claim 9, further comprising calculating the point of view for each of the selected N documents.

12. The system of claim 11, wherein calculating the point of view comprises: identifying a sender of a document of the set of first documents; access a profile associated with the sender; identify a point of view associated with the sender based on the profile; and associate the point of view with at least one document of the set of second documents.

13. The system of claim 11, wherein calculating the point of view comprises: identifying a user that has interacted with a document of the set of first documents; identify a point of view associated with the user, based on a profile associated with the user; and associate the point of view with at least one document of the set of second documents.

14. The system of claim 11, wherein calculating the point of view comprises: analyzing content of a document of the second set of documents; and based on keywords identified from the analyzing, identifying at least one point of view to associated with the document.

15. The system of claim 9, further comprising: identify an entry point in the timeline document; select a second entry point in a target timeline document; calculate a similarity score based on metadata associated with the entry point and metadata associated with the second entry point; and add a link between the entry point and the second entry point when the similarity score exceeds a threshold.

16. The system of claim 9, wherein extracting a contextual vector comprises: clustering the set of first documents according to at least one subject matter; identifying those clusters that have a number of documents over a threshold and, for each cluster so identified: extracting an n-gram for the cluster and storing the n-gram as part of the contextual vector; identifying a set of documents in the identified cluster from a cluster of individuals; clustering the set of documents to identify sub-topics within the set of documents, the clustering defining a set of sub-clusters; identify sub-clusters of the set of sub-clusters that have a second number of documents over a second threshold and extracting a sub-cluster n-gram for each identified sub-cluster; and storing each sub-cluster n-gram as part of the contextual vector.

17. A machine-readable medium having executable instructions encoded thereon, which, when executed by at least one processor of a machine, cause the machine to perform operations comprising: access a set of first documents collected over a period of time from at least one social media service; extract, from the set of first
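
Two steps recited in the claims lend themselves to a short illustration: the two-stage selection of claims 7 and 9 (take the top K ranked documents, score each against the contextual vector, keep the N best) and the threshold-based linking of timeline entries in claims 6 and 15. The sketch below is only one possible reading. Jaccard overlap as the similarity measure, single-word terms standing in for n-grams, and the names `jaccard`, `select_documents`, and `link_entries` are all assumptions that the claims do not specify.

```python
def jaccard(a, b):
    """Jaccard overlap of two term collections (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_documents(ranked_docs, context_terms, k, n):
    """Pick N of the top K ranked documents by similarity to the context terms.

    `ranked_docs` is a list of (doc_id, text) pairs already ordered by the
    ranking criterion. Single words stand in for n-grams here (assumption).
    """
    top_k = ranked_docs[:k]
    scored = [
        (jaccard(text.lower().split(), context_terms), doc_id)
        for doc_id, text in top_k
    ]
    # Stable sort: documents with equal scores keep their ranked order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:n]]


def link_entries(entry_meta, target_meta, threshold):
    """Return True when metadata similarity exceeds the threshold."""
    return jaccard(entry_meta, target_meta) > threshold


ranked_docs = [
    ("d1", "council vote delayed again"),
    ("d2", "sports scores"),
    ("d3", "city council vote"),
    ("d4", "council vote city budget"),
]
context_terms = {"city", "council", "vote"}

selected = select_documents(ranked_docs, context_terms, k=3, n=2)
print(selected)  # ['d3', 'd1']; d4 is never scored because it is outside the top K

linked = link_entries({"council", "vote", "budget"},
                      {"council", "vote", "election"},
                      threshold=0.4)
print(linked)  # True: overlap is 2 of 4 distinct terms, i.e. 0.5 > 0.4
```

Scoring only the top K documents keeps the similarity computation bounded, which appears to be the point of ranking before selecting. The strict comparison in `link_entries` mirrors the claim wording "exceeds a threshold".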

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • User profiles · CPC title

  • using ranking · CPC title

  • Creation of semantic tools, e.g. ontology or thesauri · CPC title

  • Clustering or classification · CPC title

  • Recording time for administrative or management purposes · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018067910A1 cover?
It covers mechanisms for compiling documents into a timeline document that tracks how a topic evolves over time. Social media documents are analyzed to identify a series of n-grams and a ranked list of linked documents; a subset of that list, chosen by similarity to the n-grams, is summarized and captured as an entry in the timeline document.
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F17/24. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 8, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).