US11281645B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11281645-B2 |
| Application number | US-201815927124-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 21, 2018 |
| Priority date | Oct 28, 2015 |
| Publication date | Mar 22, 2022 |
| Grant date | Mar 22, 2022 |
Official abstract text for this publication.
According to an embodiment, a data management system includes an index building unit and a searching unit. The index building unit generates a peripheral vector similar to a case example vector representing a feature vector of data to be stored, and builds index information of enabling identification of the case example vector corresponding to the generated peripheral vector. The searching unit refers to the index information in response to a search request in which a query vector representing an arbitrary feature vector is specified, identifies the case example vector corresponding to the peripheral vector that exactly matches with the query vector, and outputs a search result based on the identified case example vector.
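The idea in the abstract can be illustrated with a small sketch: instead of searching for neighbors at query time, the index stores every "peripheral" vector near each case example vector, so a search reduces to one exact-match lookup. The sketch below is only an illustration under stated assumptions, not the patented implementation: vectors are assumed to be short binary vectors held in Python integers, "similar" is assumed to mean "within a fixed Hamming distance", and the names `peripheral_vectors`, `build_index`, and `search` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def peripheral_vectors(vec, n_bits, radius):
    """Yield every n_bits-wide binary vector within `radius` bit flips of vec.

    Assumption: similarity is Hamming distance on binary vectors.
    """
    for dist in range(radius + 1):
        for positions in combinations(range(n_bits), dist):
            flipped = vec
            for p in positions:
                flipped ^= 1 << p  # flip one bit per chosen position
            yield flipped


def build_index(cases, n_bits, radius):
    """Map each peripheral vector to the ids of the case example vectors it came from."""
    index = defaultdict(list)
    for case_id, vec in cases.items():
        for pv in peripheral_vectors(vec, n_bits, radius):
            index[pv].append(case_id)
    return index


def search(index, query):
    """Exact-match lookup: return ids of case vectors whose peripheral set contains query."""
    return sorted(index.get(query, []))


# Hypothetical 4-bit case example vectors.
cases = {"a": 0b0000, "b": 0b1111, "c": 0b0011}
index = build_index(cases, n_bits=4, radius=1)
print(search(index, 0b0001))  # one bit away from both "a" and "c"
```

The trade-off in this sketch is index size: the number of stored peripheral vectors grows combinatorially with the radius and the vector width, which is one reason a contracted (shorter) vector space is attractive.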
Opening claim text (preview).
What is claimed is:

1. A data management system comprising: circuitry configured to: perform index building by generating a peripheral vector similar to a case example vector representing a feature vector of data to be stored, and building index information of enabling identification of the case example vector corresponding to the generated peripheral vector; and performing searching in response to a search request in which a query vector representing an arbitrary feature vector is specified, by referring to the index information, by identifying the case example vector corresponding to the peripheral vector that exactly matches with the query vector, and by outputting a search result based on the identified case example vector, wherein the circuitry performs the index building by building the index information that contains a table which, as column elements, at least has a first column for storing the peripheral vector and a second column for storing information related to the case example vector corresponding to the peripheral vector, and an index with respect to the first column in the table, the circuitry performs the searching by using the index, obtaining such a record in the table which corresponds to the peripheral vector exactly matching with the query vector, and identifying the case example vector based on information stored in the second column of the obtained record, the circuitry performs the index building by building the index information that contains a first table which, as column elements, has a first column for storing the peripheral vector and a second column for storing degree of similarity of the peripheral vector with respect to the case example vector, a second table which, as column elements, has a first column for storing a row ID of a record of the first table and a second column for storing information related to the case example vector corresponding to the peripheral vector of the record, and a composite index with respect to the first column and the second column in the first table, and the circuitry performs the searching by using the composite index, obtaining, as a link, a row ID of such a record in the first table which corresponds to the peripheral vector exactly matching with the query vector and having the degree of similarity satisfying a condition, and identifying the case example vector based on information stored in the second column of such a record in the second table in which the obtained row ID is stored.

2. The data management system according to claim 1, wherein, as data structure of the table, an associative array or a continuous memory arrangement array is used in which the peripheral vector stored in the first column is treated as a key and information stored in the second column is treated as a value.

3. The data management system according to claim 1, wherein the circuitry performs the index building by building the index information that contains the table which, as column elements, has, in addition to having the first column and the second column, a third column for storing degree of similarity of the peripheral vector with respect to the case example vector, and a composite index with respect to the first column and the third column in the table, and the circuitry performs the searching by using the composite index, obtaining such a record in the table which corresponds to the peripheral vector exactly matching with the query vector and having the degree of similarity satisfying a condition, and identifying the case example vector based on information stored in the second column of the obtained record.

4. The data management system according to claim 3, wherein, as data structure of the table, an associative array or a continuous memory arrangement array is used in which the peripheral vector stored in the first column and the degree of similarity stored in the third column are treated as keys and information stored in the second column are treated as a value.

5. The data management system according to claim 1, wherein the circuitry performs the index building by building, as the index information, an index meant for searching for information related to the case example vector corresponding to the peripheral vector according to a value of the peripheral vector exactly matching with the query vector, and the circuitry performs the searching by using the index and identifying the case example vector corresponding to the peripheral vector that exactly matches with the query vector.

6. The data management system according to claim 1, wherein the circuitry performs the index building by building, as the index information, a composite index meant for searching for information related to the case example vector corresponding to the peripheral vector according to a value of the peripheral vector exactly matching with the query vector and according to a degree of similarity of the peripheral vector with respect to the case example vector, and the circuitry performs the searching by using the composite index and identifying the case example vector corresponding to the peripheral vector that exactly matches with the query vector and that has the degree of similarity satisfying a condition.

7. The data management system according to claim 1, wherein, when the search request includes specification of output count, the circuitry performs the searching by performing, while varying a condition of degree of similarity of the peripheral vector with respect to the case example vector in a phased manner from strict side, an operation of identifying the case example vector corresponding to the peripheral vector exactly matching with the query vector in a repeated manner until total number of the identified case example vector becomes equal to or greater than the output count, stopping the operation when total number of the identified case example vector becomes equal to or greater than the output count, and outputting a search result equal to a count close to the output count based on the identified case example vectors.

8. The data management system according to claim 1, wherein the circuitry performs the index building by generating a contracted peripheral vector that is similar to a contracted case example vector formed by mapping the case example vector in a contracted vector space, and building the index information of enabling identification of the case example vector corresponding to the contracted peripheral vector that has been generated, and the circuitry performs the searching by identifying the case example vector corresponding to the contracted peripheral vector that exactly matches with a contracted query vector formed by mapping the query vector in a contracted vector space which is common with the index building.

9. The data management system according to claim 8, wherein locality-sensitive hashing (LSH) technology is used in mapping of the case example vector to the contracted case example vector and in mapping of the query vector to the contracted query vector.

10. The data management system according to claim 9, wherein, as the LSH technology, bitwise LSH is used in which the case example vector is converted into the contracted case example vector representing a binary vector with each dimension taking only binary value and in which the query vector is converted into the contracted query vector representing a binary vector with each dimension taking only binary value.

11. The data management system according to claim 10, wherein the circuitry performs the index building by generating the contracted peripheral vector representing a binary vector and building the index inf
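The composite index of claim 6 and the phased search of claim 7 can be sketched together. This is a hedged illustration, not the claimed implementation: it assumes binary vectors stored as Python integers, uses Hamming distance as the "degree of similarity" (smaller means stricter), and collapses the two-table layout of claim 1 into a single dictionary keyed by `(peripheral vector, distance)`. The names `build_composite_index` and `phased_search` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_composite_index(cases, n_bits, radius):
    """Map (peripheral vector, Hamming distance) to ids of case example vectors."""
    index = defaultdict(list)
    for case_id, vec in cases.items():
        for dist in range(radius + 1):
            # Flipping `dist` distinct bits gives a vector at exactly that distance.
            for positions in combinations(range(n_bits), dist):
                pv = vec
                for p in positions:
                    pv ^= 1 << p
                index[(pv, dist)].append(case_id)
    return index


def phased_search(index, query, radius, output_count):
    """Relax the similarity condition step by step, starting from the strict side.

    Stops as soon as at least `output_count` case vectors have been identified,
    so the result may hold slightly more than `output_count` entries.
    """
    found = []
    for dist in range(radius + 1):  # dist 0 is the strictest condition
        hits = sorted(index.get((query, dist), []))
        found.extend((case_id, dist) for case_id in hits)
        if len(found) >= output_count:
            break
    return found


# Hypothetical 4-bit case example vectors.
cases = {"a": 0b0000, "b": 0b0001, "c": 0b0011, "d": 0b1111}
index = build_composite_index(cases, n_bits=4, radius=2)
print(phased_search(index, 0b0001, radius=2, output_count=2))
```

In this sketch each phase is a single exact-match lookup on the composite key, so results arrive in order of increasing distance without any sorting step. A caller that needs exactly `output_count` results could truncate the returned list.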
Trees, e.g. B+trees · CPC title
Hash tables · CPC title
Column-oriented storage; Management thereof · CPC title
Vectors, bitmaps or matrices · CPC title
Updating · CPC title