Method and apparatus for extracting portions of text from long social media documents
US-2015052120-A1 · Feb 19, 2015 · US
US2016171111A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016171111-A1 |
| Application number | US-201414572339-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 16, 2014 |
| Priority date | Dec 16, 2014 |
| Publication date | Jun 16, 2016 |
| Grant date | — |
Official abstract text for this publication.
The present teaching relates to providing structured text. In one example, a document is obtained. One or more keywords are identified in the document. One or more topics are determined based on the one or more keywords. Each of the one or more topics is related to at least one of the one or more keywords residing in one or more portions of the document. A snippet is generated for each of the portions associated with a corresponding topic based on content in the portion of the document.
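The flow in the abstract (keywords, then topics, then one snippet per topical portion) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `TOPIC_KEYWORDS` lexicon, the `Snippet` record, splitting portions on blank lines, and truncating a portion to form its snippet are not taken from the publication, which does not specify these details in the text shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical topic lexicon (not from the patent): topic -> signalling keywords.
TOPIC_KEYWORDS: Dict[str, Set[str]] = {
    "battery": {"battery", "charge", "charging"},
    "camera": {"camera", "photo", "lens"},
}


@dataclass
class Snippet:
    topic: str          # topic the snippet is associated with
    portion_index: int  # which portion of the document it came from
    text: str           # snippet text derived from that portion


def identify_keywords(portion: str) -> Set[str]:
    """Return the lexicon keywords that occur in one portion of the document."""
    words = {w.strip(".,!?;:").lower() for w in portion.split()}
    vocabulary = set().union(*TOPIC_KEYWORDS.values())
    return words & vocabulary


def generate_snippets(document: str, max_chars: int = 80) -> List[Snippet]:
    """Emit one snippet per (portion, topic) pair whose keywords match."""
    # Assumption: a "portion" is a paragraph separated by a blank line.
    portions = [p.strip() for p in document.split("\n\n") if p.strip()]
    snippets: List[Snippet] = []
    for index, portion in enumerate(portions):
        keywords = identify_keywords(portion)
        for topic, topic_words in TOPIC_KEYWORDS.items():
            if keywords & topic_words:
                # Assumption: the snippet is simply the truncated portion.
                snippets.append(Snippet(topic, index, portion[:max_chars]))
    return snippets


if __name__ == "__main__":
    review = (
        "The battery lasts two days on one charge.\n\n"
        "The camera is sharp in daylight."
    )
    for snippet in generate_snippets(review):
        print(snippet)
```

A real system would likely replace the exact-word lookup with entity detection and a learned topic model; the sketch only shows how the three steps chain together.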
Opening claim text (preview).
We claim:

1. A method, implemented on a machine having at least one processor, storage, and a communication platform connected to a network for generating a snippet, the method comprising: obtaining a document; identifying one or more keywords in the document; determining one or more topics based on the one or more keywords, wherein each of the one or more topics is related to at least one of the one or more keywords residing in one or more portions of the document; and generating a snippet for each of the portions associated with a corresponding topic based on content in the portion of the document.
2. The method of claim 1, further comprising generating an index for each of the snippets based on the corresponding topic associated with the snippet.
3. The method of claim 1, wherein the determining comprises: matching each of the one or more keywords with the one or more topics; generating a score for each of the one or more topics based on the matching; and ranking the one or more topics based on their respective scores.
4. The method of claim 1, wherein the generating comprises: obtaining one or more parameters associated with the corresponding topic; extracting information from the portion of the document according to the one or more parameters; and generating the snippet based on the extracted information.
5. The method of claim 1, further comprising storing the snippet associated with the corresponding topic and the portion of the document in a database.
6. The method of claim 1, wherein the snippet is also associated with at least one of: the document; a URL (uniform resource locator) associated with the document; the portion of the document; one or more parameters representing a structure of the snippet; and a confidence score indicating how likely the snippet can represent the portion of the document.
7. A method, implemented on a machine having at least one processor, storage, and a communication platform connected to a network for providing a search result, the method comprising: receiving a query; identifying one or more keywords from the query; determining one or more topics associated with the query based on the one or more keywords; retrieving one or more snippets based on the one or more topics, wherein each of the snippets corresponds to a portion of a corresponding document that is related to a topic associated with the snippet; and providing the one or more snippets in response to the query.
8. The method of claim 7, further comprising providing a representation of a corresponding document associated with each of the one or more snippets in response to the query.
9. The method of claim 7, wherein the determining comprises: matching each of the one or more keywords with the one or more topics; generating a score for each of the one or more topics based on the matching; and ranking the one or more topics based on their respective scores.
10. The method of claim 7, wherein the providing comprises: ranking the one or more snippets; and providing the ranked one or more snippets in response to the query.
11. The method of claim 7, wherein at least one of the one or more snippets is associated with at least one of: the corresponding document; a URL associated with the corresponding document; the portion of the corresponding document; one or more parameters representing a structure of the snippet; and a confidence score indicating how likely the snippet can represent the portion of the corresponding document.
12. A system having at least one processor, storage, and a communication platform connected to a network for generating a snippet, comprising: a document obtaining unit configured for obtaining a document; an entity detector configured for identifying one or more keywords in the document; a use case matching unit configured for determining one or more topics based on the one or more keywords, wherein each of the one or more topics is related to at least one of the one or more keywords residing in one or more portions of the document; an indexed snippet generator configured for generating a snippet for each of the portions associated with a corresponding topic based on content in the portion of the document.
13. The system of claim 12, wherein the indexed snippet generator comprises an index generator configured for generating an index for each of the snippets based on the corresponding topic associated with the snippet.
14. The system of claim 12, wherein the use case matching unit is further configured for: matching each of the one or more keywords with the one or more topics; and generating a score for each of the one or more topics based on the matching, wherein the one or more topics are ranked based on their respective scores.
15. The system of claim 12, wherein the indexed snippet generator comprises: a snippet parameter determiner configured for obtaining one or more parameters associated with the corresponding topic; a structured text extractor configured for extracting information from the portion of the document according to the one or more parameters; and a snippet generator/updater configured for generating the snippet based on the extracted information.
16. The system of claim 12, wherein the snippet is also associated with at least one of: the document; a URL associated with the document; the portion of the document; one or more parameters representing a structure of the snippet; and a confidence score indicating how likely the snippet can represent the portion of the document.
17. A system having at least one processor, storage, and a communication platform connected to a network for providing a search result, comprising: a search request analyzer configured for receiving a query; an entity type identifier configured for identifying one or more keywords from the query; a use case determiner configured for determining one or more topics associated with the query based on the one or more keywords; a snippet retriever configured for retrieving one or more snippets based on the one or more topics, wherein each of the snippets corresponds to a portion of a corresponding document that is related to a topic associated with the snippet; and a search result provider configured for providing the one or more snippets in response to the query.
18. The system of claim 17, wherein the search result provider is further configured for providing a representation of a corresponding document associated with each of the one or more snippets in response to the query.
19. The system of claim 17, wherein the use case determiner is further configured for: matching each of the one or more keywords with the one or more topics; generating a score for each of the one or more topics based on the matching; and ranking the one or more topics based on their respective scores.
20. The system of claim 17, further comprising a snippet ranking unit configured for ranking the one or more snippets.
21. The system of claim 17, wherein at least one of the one or more snippets is associated with at least one of: the corresponding document; a URL associated with the corresponding document; the portion of the corresponding document; one or more parameters representing a structure of the snippet; and a confidence score indicating how likely the snippet can represent the portion of the corresponding document.
22.
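
The query-side steps recited in claims 7, 9, and 10 (match query keywords to topics, score and rank the topics, then retrieve and rank snippets) could look roughly like the following sketch. The lexicon, the in-memory `SNIPPET_INDEX`, the count-based topic score, and ordering snippets by a stored confidence value are all illustrative assumptions; the claims do not prescribe a particular scoring or ranking function.

```python
from collections import defaultdict

# Hypothetical topic lexicon and snippet index; neither comes from the patent.
TOPIC_KEYWORDS = {
    "battery": {"battery", "charge", "charging"},
    "camera": {"camera", "photo", "lens"},
}
# topic -> list of (document id, snippet text, confidence score)
SNIPPET_INDEX = {
    "battery": [("doc-1", "Battery lasts two days on one charge.", 0.9)],
    "camera": [
        ("doc-1", "Low-light photos are noisy.", 0.6),
        ("doc-2", "The lens focuses quickly.", 0.8),
    ],
}


def rank_topics(keywords):
    """Score each topic by the number of matching keywords, highest first."""
    scores = defaultdict(int)
    for keyword in keywords:
        for topic, topic_words in TOPIC_KEYWORDS.items():
            if keyword in topic_words:
                scores[topic] += 1
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def search(query):
    """Return (topic, document id, snippet) tuples for a free-text query."""
    keywords = [w.strip(".,!?").lower() for w in query.split()]
    results = []
    for topic, _score in rank_topics(keywords):
        # Assumption: within a topic, snippets are ordered by confidence.
        ranked = sorted(SNIPPET_INDEX.get(topic, []), key=lambda s: -s[2])
        results.extend((topic, doc_id, text) for doc_id, text, _conf in ranked)
    return results


if __name__ == "__main__":
    for row in search("camera lens or battery"):
        print(row)
```

Returning the document id alongside each snippet loosely mirrors claim 8, where a representation of the source document accompanies the snippet.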
Presentation of query results · CPC title
Search customisation based on user profiles and personalisation · CPC title
Indexing; Web crawling techniques · CPC title
Physics · mapped topic