Real-time content searching in social network
US-2016371388-A1 · Dec 22, 2016 · US
US2017103123A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2017103123-A1 |
| Application number | US-201615287257-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 6, 2016 |
| Priority date | Oct 9, 2015 |
| Publication date | Apr 13, 2017 |
| Grant date | — |
Official abstract text for this publication.
A non-transitory computer-readable recording medium stores an index generating program that causes a computer to execute a process including: generating presence information of a plurality of pieces of text data, the presence information including whether each of a plurality of elements, included in at least one of the plurality of pieces of text data, is present for each of the plurality of pieces of text data, the presence information including a first axis for the plurality of elements and a second axis for the plurality of pieces of text data; detecting collision data for hashed index information when generating the hashed index information, the collision data corresponding to data elements that are independent in the presence information; and setting additional values to each of a plurality of specific collision data, respectively, for one of the plurality of hashed axes.
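The abstract describes presence information with one axis for elements and one for pieces of text data, a hashed version of the text-data axis, and detection of collisions. The following is a minimal sketch of that idea, not the method disclosed in the publication. It assumes words as elements, integer document ids as the second axis, and a simple modulo hash; the names `build_presence`, `fold`, and `build_hashed_index` and the moduli `(3, 5)` are hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_presence(docs):
    """Presence information: word (first axis) -> set of document ids (second axis)."""
    presence = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            presence[word].add(doc_id)
    return dict(presence)


def fold(doc_ids, modulus):
    """Hash the document axis with the assumed hash h(d) = d % modulus.

    Returns (bits, collisions): bits is an int used as a bit string of
    length `modulus`; collisions lists pairs of distinct document ids
    that landed on the same hashed position.
    """
    bits = 0
    first_in_slot = {}
    collisions = []
    for d in sorted(doc_ids):
        slot = d % modulus
        if slot in first_in_slot:
            # Two independent bits of the presence information now share one bit.
            collisions.append((first_in_slot[slot], d))
        else:
            first_in_slot[slot] = d
        bits |= 1 << slot
    return bits, collisions


def build_hashed_index(presence, moduli=(3, 5)):
    """For every word, produce one hashed axis per modulus (assumed hash family)."""
    return {
        word: {m: fold(doc_ids, m) for m in moduli}
        for word, doc_ids in presence.items()
    }


docs = ["a b", "b c", "c a", "b"]
presence = build_presence(docs)
hashed = build_hashed_index(presence)
```

In this sketch the word `b` appears in documents 0, 1, and 3. Under modulus 3, documents 0 and 3 share a hashed position and are reported as a collision; under modulus 5 they do not. What counts as a "specific condition" and which "additional values" are set for such collisions is defined by the claims and is not modelled here.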
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-readable recording medium storing therein an index generating program that causes a computer to execute a process comprising: generating presence information of a plurality of pieces of text data, the presence information including whether each of a plurality of elements, included in at least one of the plurality of pieces of text data, is present for each of the plurality of pieces of text data, the presence information including a first axis for the plurality of elements and a second axis for the plurality of pieces of text data; detecting collision data for hashed index information when generating the hashed index information, the hashed index information being generated from the presence information and including a plurality of hashed axes, the plurality of hashed axes being generated by applying a plurality of hash functions to the second axis of the presence information, the collision data corresponding to data elements that are independent in the presence information with the first axis and the second axis and duplicating in the hashed index information with the first axis and the plurality of hashed axes; and setting additional values to each of a plurality of specific collision data, respectively, for one of the plurality of hashed axes, the plurality of specific collision data being the detected collision data and satisfying a specific condition.
2. The non-transitory computer-readable recording medium according to claim 1, wherein the linking includes aggregating, when a collision continuously occurs in one of the plurality of hashed axes, the presence/absence ratio by using the presence information related to the element that is associated with the hashed axis in which the collisions have occurred, dividing, when the presence ratio of the aggregated presence/absence ratio is greater than a threshold, the presence information related to the element, and setting and linking a division destination to one of the plurality of hashed axes.
3. The non-transitory computer-readable recording medium according to claim 2, wherein the division destination used when the presence information related to the element is divided is an area of a low frequency word of the element.
4. The non-transitory computer-readable recording medium according to claim 1, wherein the size of the hashed axis is a number of bits matched with the size of a register.
5. The non-transitory computer-readable recording medium according to claim 1, wherein a unit of the plurality of elements is a unit of words.
6. The non-transitory computer-readable recording medium according to claim 1, wherein a unit of the plurality of elements is a unit of characters with an N-gram (N is 2 or more).
7. An index generating method comprising: generating presence information of a plurality of pieces of text data, the presence information including whether each of a plurality of elements, included in at least one of the plurality of pieces of text data, is present for each of the plurality of pieces of text data, the presence information including a first axis for the plurality of elements and a second axis for the plurality of pieces of text data, by a processor; detecting collision data for hashed index information when generating the hashed index information, the hashed index information being generated from the presence information and including a plurality of hashed axes, the plurality of hashed axes being generated by applying a plurality of hash functions to the second axis of the presence information, the collision data corresponding to data elements that are independent in the presence information with the first axis and the second axis and duplicating in the hashed index information with the first axis and the plurality of hashed axes, by the processor; and setting additional values to each of a plurality of specific collision data, respectively, for one of the plurality of hashed axes, the plurality of specific collision data being the detected collision data and satisfying a specific condition, by the processor.
8. An index generating device comprising: a processor that executes a process including: generating presence information of a plurality of pieces of text data, the presence information including whether each of a plurality of elements, included in at least one of the plurality of pieces of text data, is present for each of the plurality of pieces of text data, the presence information including a first axis for the plurality of elements and a second axis for the plurality of pieces of text data; detecting collision data for hashed index information when generating the hashed index information, the hashed index information being generated from the presence information and including a plurality of hashed axes, the plurality of hashed axes being generated by applying a plurality of hash functions to the second axis of the presence information, the collision data corresponding to data elements that are independent in the presence information with the first axis and the second axis and duplicating in the hashed index information with the first axis and the plurality of hashed axes; and setting additional values to each of a plurality of specific collision data, respectively, for one of the plurality of hashed axes, the plurality of specific collision data being the detected collision data and satisfying a specific condition.
9. A non-transitory computer-readable recording medium storing a search program that causes a computer to execute a process comprising: restoring, when receiving an element formed by two or more characters and identification information on text data, each of a plurality of hashed axes related to the received element; and searching for, based on presence information that is related to the element in each of a plurality of pieces of text data and that is indicated by each of bits in restored bit strings, the presence information on the element associated with the received identification information on the text data.
10. A search method comprising: restoring, when receiving an element formed by two or more characters and identification information on text data, each of a plurality of hashed axes related to the received element, by a processor; and searching for, based on presence information that is related to the element in each of a plurality of pieces of text data and that is indicated by each of bits in restored bit strings, the presence information on the element associated with the received identification information on the text data, by the processor.
11. A search device comprising: a processor that executes a process including: restoring, when receiving an element formed by two or more characters and identification information on text data, each of a plurality of hashed axes related to the received element; and searching, based on presence information that is related to the element in each of a plurality of pieces of text data and that is indicated by each of bits in bit strings restored at the restoring, the presence information on the element associated with the received identification information on the text data.
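Claims 9 to 11 describe a search that restores each hashed axis for a received element and then reads the bit for the received text-data identifier. The sketch below shows one plausible reading of that flow; it is an illustration under stated assumptions, not the claimed implementation. It assumes a modulo hash on integer document ids, restoration by repeating the hashed bit string across the full document range, and a bitwise AND to combine the restored strings. The names `fold`, `restore`, `search`, the moduli 3 and 5, and the example data are all hypothetical.

```python
def fold(doc_ids, modulus):
    """Hash the document axis with the assumed hash h(d) = d % modulus."""
    bits = 0
    for d in doc_ids:
        bits |= 1 << (d % modulus)
    return bits


def restore(bits, modulus, n_docs):
    """Unfold a hashed bit string back onto n_docs document positions."""
    restored = 0
    for d in range(n_docs):
        if (bits >> (d % modulus)) & 1:
            restored |= 1 << d
    return restored


def search(index, word, doc_id, n_docs):
    """Restore every hashed axis for `word`, AND them, and test the bit for doc_id."""
    if word not in index:
        return False
    combined = (1 << n_docs) - 1  # start with all document bits set
    for modulus, bits in index[word].items():
        combined &= restore(bits, modulus, n_docs)
    return bool((combined >> doc_id) & 1)


# Example data (assumed): the word "b" is present in documents 0, 1, and 3 of 7.
n_docs = 7
true_docs = {0, 1, 3}
index = {"b": {m: fold(true_docs, m) for m in (3, 5)}}

hits = [d for d in range(n_docs) if search(index, "b", d, n_docs)]
```

With this example data, `hits` contains documents 0, 1, and 3 and also document 6, which does not contain the word: document 6 shares a hashed position with document 0 under modulus 3 and with document 1 under modulus 5. Duplicates of this kind are what the collision handling in the index generating claims is aimed at; this sketch deliberately leaves them uncorrected.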
Hash tables · CPC title
Physics · mapped topic