US2016140231A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016140231-A1 |
| Application number | US-201414546340-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 18, 2014 |
| Priority date | Nov 18, 2014 |
| Publication date | May 19, 2016 |
| Grant date | — |
Official abstract text for this publication.
Methods, devices, and systems are described for creating and implementing search query vectors for knowledge base articles or other formal articles, the query vectors being automatically created from informal correspondence such as a service request email to an information technology (IT) department. Term frequency-inverse document frequency (TF-IDF) scores are calculated for rare words in the correspondence with respect to a corpus of other service requests. High scoring terms with the same neighbors as those in the corpus of formal articles are added to the search query vector, while high scoring terms that do not share the same neighbors are thrown out. The query vector is then used to run a search of the knowledge base for relevant articles.
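The scoring step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names (`tokenize`, `tfidf_scores`, `select_tokens`), the regex tokenizer standing in for root-word extraction, the smoothed document-frequency formula, and the `keep_fraction` cutoff are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic words stand in for "candidate root words".
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(document, corpus):
    """Score each term of `document` against `corpus`, a list of other documents."""
    terms = tokenize(document)
    tf = Counter(terms)
    n_docs = len(corpus) + 1  # the corpus plus the document being scored
    corpus_sets = [set(tokenize(d)) for d in corpus]
    scores = {}
    for term, count in tf.items():
        # Document frequency counts the scored document itself, so df >= 1.
        df = 1 + sum(term in s for s in corpus_sets)
        idf = math.log(n_docs / df)
        scores[term] = (count / len(terms)) * idf
    return scores


def select_tokens(scores, keep_fraction=0.25):
    # Assumption: keep the top quarter of terms by score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, math.ceil(len(ranked) * keep_fraction))
    return ranked[:k]
```

Terms that appear in every other service request, such as greetings, receive an inverse document frequency of zero and drop out, while terms that are rare across the corpus rise to the top.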
Opening claim text (preview).
What is claimed is:
1. A method for searching using term selection from a document to find similar content, the method comprising: providing formally written articles; selecting one or more tokens in each article by: identifying candidate root words; calculating, using a processor operatively coupled with a memory, a term frequency-inverse document frequency (TF-IDF) score for each of the candidate root words; and selecting the candidate root words as tokens based on the TF-IDF scores; cataloging neighboring tokens for each selected token into a data structure for each article, where neighboring tokens include tokens that are within a threshold number of words to the selected token in an article; merging the data structures for the articles into a merged data structure; providing a written correspondence; selecting one or more tokens in the correspondence by: identifying candidate root words from the correspondence; computing a TF-IDF score for each of the candidate root words in the correspondence with respect to a corpus of other correspondence; and selecting the candidate root words as tokens based on the TF-IDF scores; ascertaining neighboring tokens for each selected token in the correspondence; finding a match between a token in the correspondence and in the merged data structure; for the matched token, counting how many neighboring tokens in the merged data structure are also neighboring tokens in the correspondence; and adding the matched token to a query vector based on the counting; and performing a search of the formally written articles using the query vector.
2. The method of claim 1 wherein the matched token is added to the query vector based on having a minimum threshold number of neighboring tokens in the merged data structure also being neighboring tokens in the correspondence, thereby excluding from the query vector high scoring terms in the correspondence that are specific to correspondence but not correlated among substantive, technical terms in formal written articles.
3. The method of claim 1 further comprising: inserting a neighboring token from the merged data structure that is not a token in the correspondence, thereby expanding terms in the query vector beyond those that are in the correspondence.
4. The method of claim 1 further comprising: returning search results based on the search.
5. The method of claim 1 further comprising: building a data structure for the neighboring tokens in the correspondence, wherein the data structure for the neighboring tokens in the correspondence is of a same data type as the merged data structure.
6. The method of claim 1 further comprising: tracking a minimum number of words between two tokens as a weight; and merging the data structures using the minimum number of words.
7. The method of claim 1 further comprising: retaining a minimum number of words between two tokens when merging as a weight.
8. The method of claim 1 wherein the selecting of tokens in each article, cataloging, and merging are performed before the written correspondence is provided.
9. The method of claim 1 further comprising: calculating a logarithm of how many neighboring tokens in the data structure are also neighboring tokens in the correspondence; and adding the matched token to the query vector only if the logarithm is above a threshold value.
10. The method of claim 1 wherein the neighboring tokens include tokens that are within 50 to 100 words of the selected token in an article.
11. The method of claim 1 wherein the candidate root words are selected as tokens if they are above a transition point.
12. The method of claim 1 wherein the candidate root words are selected as tokens if they are in a fourth quartile of scores.
13. The method of claim 1 wherein the data structure includes an inverted index.
14. The method of claim 1 wherein the correspondence includes an informal email.
15. The method of claim 14 wherein the correspondence includes a service request for technical assistance.
16. The method of claim 1 wherein the formally written articles include a knowledge base article.
17. A machine-readable non-transitory medium embodying information indicative of instructions for causing one or more machines to perform operations for searching using term selection from a document to find similar content, the operations comprising: providing formally written articles; selecting one or more tokens in each article by: identifying candidate root words; calculating a term frequency-inverse document frequency (TF-IDF) score for each of the candidate root words; and selecting the candidate root words as tokens based on the TF-IDF scores; cataloging neighboring tokens for each selected token into a data structure for each article, where neighboring tokens include tokens that are within a threshold number of words to the selected token in an article; merging the data structures for the articles into a merged data structure; providing a written correspondence; selecting one or more tokens in the correspondence by: identifying candidate root words from the correspondence; computing a TF-IDF score for each of the candidate root words in the correspondence with respect to a corpus of other correspondence; and selecting the candidate root words as tokens based on the TF-IDF scores; ascertaining neighboring tokens for each selected token in the correspondence; finding a match between a token in the correspondence and in the merged data structure; for the matched token, counting how many neighboring tokens in the merged data structure are also neighboring tokens in the correspondence; and adding the matched token to a query vector based on the counting; and performing a search of the formally written articles using the query vector.
18. The medium of claim 17 wherein the matched token is added to the query vector based on having a minimum threshold number of neighboring tokens in the merged data structure also being neighboring tokens in the correspondence, thereby excluding from the query vector high scoring terms in the correspondence that are specific to correspondence but not correlated among substantive, technical terms in formal written articles.
19. A computer system executing instructions in a computer program for searching using term selection from a document to find similar content, the system comprising: a processor; and a memory operatively coupled with the processor, the processor executing instructions stored in the memory including: program code for providing formally written articles; program code for selecting one or more tokens in each article by: program code for identifying candidate root words; program code for calculating a term frequency-inverse document frequency (TF-IDF) score for each of the candidate root words; and program code for selecting the candidate root words as tokens based on the TF-IDF scores; program code for cataloging neighboring tokens for each selected token into a data structure for each article, where neighboring tokens include tokens that are within a threshold number of words to the selected token in an article; program code for merging the data structures for the articles into a common data structure; program code for providing a written correspondence; program code for selecting one or more tokens in the correspondence by: program code for identifying candidate root words from the correspondence; prog
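
The neighbor cataloging, merging, and counting steps recited in claim 1 can be sketched roughly as below. This is an illustrative reading under stated assumptions, not the claimed implementation: the function names, the dictionary-of-dictionaries layout, and the `min_shared` parameter are hypothetical. The default window of 50 words follows the range given in claim 10, and keeping the minimum word distance as a weight follows claims 6 and 7.

```python
def catalog_neighbors(words, tokens, window=50):
    """Map each selected token to {neighboring token: minimum word distance}."""
    positions = [(i, w) for i, w in enumerate(words) if w in tokens]
    neighbors = {w: {} for _, w in positions}
    for i, a in positions:
        for j, b in positions:
            dist = abs(i - j)
            if a != b and dist <= window:
                prev = neighbors[a].get(b)
                neighbors[a][b] = dist if prev is None else min(prev, dist)
    return neighbors


def merge_catalogs(catalogs):
    """Merge per-article catalogs, retaining the minimum distance as the weight."""
    merged = {}
    for catalog in catalogs:
        for token, nbrs in catalog.items():
            target = merged.setdefault(token, {})
            for nbr, dist in nbrs.items():
                prev = target.get(nbr)
                target[nbr] = dist if prev is None else min(prev, dist)
    return merged


def build_query_vector(corr_catalog, merged, min_shared=1):
    """Keep correspondence tokens whose neighbors also appear in the merged catalog."""
    query = []
    for token, corr_nbrs in corr_catalog.items():
        if token not in merged:
            continue  # no match in the formal articles
        shared = set(corr_nbrs) & set(merged[token])
        if len(shared) >= min_shared:  # hypothetical threshold on the count
            query.append(token)
    return query
```

A token such as "thanks" may score highly within the correspondence, but because it never matches a token in the merged article catalog it is left out of the query vector, whereas "vpn" and "client" are kept when they share neighbors in both sources.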
Physics · mapped topic
Query execution (filtering based on additional data G06F16/335) · CPC title