Linguistic extraction of temporal and location information for a recommender system
US-2024169375-A1 · May 23, 2024 · US
US10114889B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10114889-B2 |
| Application number | US-201314411465-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 15, 2013 |
| Priority date | Jun 27, 2012 |
| Publication date | Oct 30, 2018 |
| Grant date | Oct 30, 2018 |
Official abstract text for this publication.
Techniques for filtering information are described herein. In accordance with the present disclosure, a text acquisition module is configured to acquire text content to be filtered and a scanning module is configured to scan the text content to be filtered. The disclosed techniques scan the text content through a preset keyword dictionary, record a position of each keyword in the text content and acquire character pitch between keywords in the text content according to the position of each keyword in the text content. A pitch judgment module is configured to judge whether the character pitch exceeds a preset character pitch and filter the keyword(s) in the text content in response to a determination that the character pitch does not exceed the preset character pitch.
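The scan, position, and pitch steps described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of keyword start offsets as "positions", the single-character mask, and the sample dictionary are all assumptions made for the example.

```python
import re


def find_keyword_positions(text, dictionary):
    """Return sorted (start_offset, keyword) pairs for each dictionary hit.

    Assumption: a keyword's "position" is its start offset in the text.
    """
    hits = []
    for keyword in dictionary:
        for match in re.finditer(re.escape(keyword), text):
            hits.append((match.start(), keyword))
    return sorted(hits)


def filter_by_pitch(text, dictionary, max_pitch, mask="*"):
    """Mask keywords that occur within max_pitch characters of another keyword.

    Assumption: "character pitch" is the difference between the start
    offsets of two distinct keywords, and "filtering" means masking.
    """
    hits = find_keyword_positions(text, dictionary)
    to_mask = set()
    for i in range(len(hits)):
        for j in range(i + 1, len(hits)):
            (pos_a, kw_a), (pos_b, kw_b) = hits[i], hits[j]
            # Keywords close together are treated as one sensitive phrase.
            if kw_a != kw_b and pos_b - pos_a <= max_pitch:
                to_mask.add(hits[i])
                to_mask.add(hits[j])
    chars = list(text)
    for pos, keyword in to_mask:
        chars[pos:pos + len(keyword)] = mask * len(keyword)
    return "".join(chars)


if __name__ == "__main__":
    sample_dictionary = ["cheap", "gold"]  # hypothetical keyword dictionary
    print(filter_by_pitch("buy cheap gold now", sample_dictionary, max_pitch=10))
```

Keywords that are far apart are left untouched in this sketch, which reflects the idea that two dictionary words only form sensitive information when they appear near each other.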
Opening claim text (preview).
The invention claimed is:
1. An improved information filtering system for filtering out sensitive information from content, which comprises: a processor; and a memory communicatively coupled to the processor and storing instructions that upon execution by the processor cause the system to: acquire text content; scan the text content through a preset keyword dictionary; in response to a determination that the text content contains a plurality of keywords stored in the preset keyword dictionary, determine a position of each of the plurality of keywords in the text content; determine at least one character pitch between any two of the plurality of keywords in the text content based on the position of each keyword among the plurality of keywords, wherein the at least one character pitch is a difference between positions of any two of the plurality of keywords in the text content; determine whether the at least one character pitch does not exceed a preset character pitch; in response to a determination that the at least one character pitch does not exceed the preset character pitch, filter out the plurality of keywords from the text content; wherein the preset keyword dictionary further stores a preset order of at least two keywords among all of the keywords that need to be filtered out; and wherein the memory further stores instructions that upon execution by the processor cause the system to: determine the order of the plurality of keywords according to the position of each keyword among the plurality of keywords in the text content, compare the order of the plurality of keywords in the text content with the preset order of corresponding keywords stored in the keyword dictionary, and when the order of the plurality of keywords in the text content matches the preset order of the corresponding keywords stored in the keyword dictionary, determine that the plurality of keywords satisfy the preset order.
2. The system according to claim 1, wherein the plurality of keywords are words constituting sensitive information and the preset keyword dictionary stores all of keywords that need to be filtered out.
3. The system according to claim 1, wherein the memory further stores instructions that upon execution by the processor cause the system to use a network spider to capture a web page to acquire the text content.
4. The system according to claim 1, wherein the memory further stores instructions that upon execution by the processor cause the system to acquire the text content by means of receiving the text content.
5. A method for improving sensitive information filtering, which comprises steps of: acquiring text content; scanning the text content through a preset keyword dictionary; in response to a determination that the text content contains a plurality of keywords stored in the preset keyword dictionary, determining a position of each of the plurality of keywords in the text content; determining at least one character pitch between any two of the plurality of keywords in the text content based on the position of each keyword among the plurality of keywords, wherein the at least one character pitch is a difference between positions of any two of the plurality of keywords in the text content; determining whether the at least one character pitch does not exceed a preset character pitch; in response to a determination that the at least one character pitch does not exceed the preset character pitch, filtering out the plurality of keywords from the text content; wherein the preset keyword dictionary further stores a preset order of at least two keywords among all of the keywords that need to be filtered out; and wherein the method further comprises: determining the order of the plurality of keywords according to the position of each keyword among the plurality of keywords in the text content, comparing the order of the plurality of keywords in the text content with the preset order of corresponding keywords stored in the keyword dictionary, and when the order of the plurality of keywords in the text content matches the preset order of the corresponding keywords stored in the keyword dictionary, determining that the plurality of keywords satisfy the preset order.
6. The method according to claim 5, wherein the plurality of keywords are words constituting sensitive information and the preset keyword dictionary stores all of keywords that need to be filtered out.
7. The method according to claim 5, wherein using a network spider to capture a web page to acquire the text content.
8. The method according to claim 5, wherein acquiring the text content by means of receiving the text content.
9. A non-transitory computer readable medium having instructions stored thereon that, when executed by at least one processor, cause the at least one processor to perform operations for filtering keywords, the operations comprising: acquiring text content; scanning the text content through a preset keyword dictionary; in response to a determination that the text content contains a plurality of keywords stored in the preset keyword dictionary, determining a position of each of the plurality of keywords in the text content; determining at least one character pitch between any two of the plurality of keywords in the text content based on the position of each keyword among the plurality of keywords, wherein the at least one character pitch is a difference between positions of any two of the plurality of keywords in the text content; determining whether the at least one character pitch does not exceed a preset character pitch; in response to a determination that the at least one character pitch does not exceed the preset character pitch, filtering out the plurality of keywords from the text content; wherein the preset keyword dictionary further stores a preset order of at least two keywords among all of the keywords that need to be filtered out; and wherein the operations further comprises: determining the order of the plurality of keywords according to the position of each keyword among the plurality of keywords in the text content, comparing the order of the plurality of keywords in the text content with the preset order of corresponding keywords stored in the keyword dictionary, and when the order of the plurality of keywords in the text content matches the preset order of the corresponding keywords stored in the keyword dictionary, determining that the plurality of keywords satisfy the preset order.
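The preset-order comparison recited in the independent claims can be illustrated with a short sketch. It is only one possible reading: the function name is hypothetical, each keyword's position is assumed to be the offset of its first occurrence, and the preset order is assumed to be stored as a simple sequence of keywords.

```python
def keywords_in_preset_order(text, preset_order):
    """Return True if the keywords appear in the text in the preset order.

    Assumptions: every keyword in preset_order must be present, and a
    keyword's position is the offset of its first occurrence.
    """
    positions = []
    for keyword in preset_order:
        pos = text.find(keyword)
        if pos == -1:
            return False  # a required keyword is missing
        positions.append((pos, keyword))
    # Order observed in the text, derived from keyword positions.
    observed_order = [keyword for _, keyword in sorted(positions)]
    return observed_order == list(preset_order)


if __name__ == "__main__":
    preset = ("wire", "money")  # hypothetical dictionary entry
    print(keywords_in_preset_order("please wire the money today", preset))
    print(keywords_in_preset_order("send money by wire", preset))
```

In a fuller pipeline this check would presumably be combined with the character pitch test, so that keywords are only filtered when they are both close together and in the stored order; how the two conditions are combined is not spelled out here and is left as an assumption.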
Filtering based on additional data, e.g. user or group profiles (filtering in web context G06F16/9535, G06F16/9536) · CPC title
Indexing; Web crawling techniques · CPC title
Physics · mapped topic