Data management system, data management method, and computer program product

US11281645B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11281645-B2
Application number: US-201815927124-A
Country: US
Kind code: B2
Filing date: Mar 21, 2018
Priority date: Oct 28, 2015
Publication date: Mar 22, 2022
Grant date: Mar 22, 2022

Abstract

Official abstract text for this publication.

According to an embodiment, a data management system includes an index building unit and a searching unit. The index building unit generates a peripheral vector similar to a case example vector representing a feature vector of data to be stored, and builds index information of enabling identification of the case example vector corresponding to the generated peripheral vector. The searching unit refers to the index information in response to a search request in which a query vector representing an arbitrary feature vector is specified, identifies the case example vector corresponding to the peripheral vector that exactly matches with the query vector, and outputs a search result based on the identified case example vector.
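
The idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes feature vectors are short bit strings, treats every bit string within a small Hamming distance of a case example vector as a "peripheral vector", and uses a plain Python dictionary as the index. The names `peripheral_vectors`, `build_index`, `search`, and the sample data are hypothetical.

```python
from itertools import combinations

def peripheral_vectors(bits, max_distance):
    """Yield (vector, distance) for every bit string within max_distance flips of bits.

    Assumption: 'similar' means small Hamming distance between bit strings.
    """
    for d in range(max_distance + 1):
        for positions in combinations(range(len(bits)), d):
            flipped = list(bits)
            for p in positions:
                flipped[p] = "1" if flipped[p] == "0" else "0"
            yield "".join(flipped), d

def build_index(cases, max_distance=1):
    """Map each peripheral vector to the IDs of the case example vectors it was generated from."""
    index = {}
    for case_id, bits in cases.items():
        for vec, _ in peripheral_vectors(bits, max_distance):
            index.setdefault(vec, []).append(case_id)
    return index

def search(index, query):
    """Exact-match lookup: no distance computation happens at query time."""
    return sorted(index.get(query, []))

# Hypothetical stored data: case ID -> case example vector.
cases = {"a": "0000", "b": "0011", "c": "1111"}
index = build_index(cases, max_distance=1)

print(search(index, "0001"))  # ['a', 'b']
print(search(index, "1010"))  # []
```

In this sketch the similarity search cost is moved to index-building time: the index grows with the number of peripheral vectors per case, while each query is a single exact-match lookup.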

First claim

Opening claim text (preview).

What is claimed is:

1. A data management system comprising: circuitry configured to: perform index building by generating a peripheral vector similar to a case example vector representing a feature vector of data to be stored, and building index information of enabling identification of the case example vector corresponding to the generated peripheral vector; and performing searching in response to a search request in which a query vector representing an arbitrary feature vector is specified, by referring to the index information, by identifying the case example vector corresponding to the peripheral vector that exactly matches with the query vector, and by outputting a search result based on the identified case example vector, wherein the circuitry performs the index building by building the index information that contains a table which, as column elements, at least has a first column for storing the peripheral vector and a second column for storing information related to the case example vector corresponding to the peripheral vector, and an index with respect to the first column in the table, the circuitry performs the searching by using the index, obtaining such a record in the table which corresponds to the peripheral vector exactly matching with the query vector, and identifying the case example vector based on information stored in the second column of the obtained record, the circuitry performs the index building by building the index information that contains a first table which, as column elements, has a first column for storing the peripheral vector and a second column for storing degree of similarity of the peripheral vector with respect to the case example vector, a second table which, as column elements, has a first column for storing a row ID of a record of the first table and a second column for storing information related to the case example vector corresponding to the peripheral vector of the record, and a composite index with respect to the first column and the second column in the first table, and the circuitry performs the searching by using the composite index, obtaining, as a link, a row ID of such a record in the first table which corresponds to the peripheral vector exactly matching with the query vector and having the degree of similarity satisfying a condition, and identifying the case example vector based on information stored in the second column of such a record in the second table in which the obtained row ID is stored.

2. The data management system according to claim 1, wherein, as data structure of the table, an associative array or a continuous memory arrangement array is used in which the peripheral vector stored in the first column is treated as a key and information stored in the second column is treated as a value.

3. The data management system according to claim 1, wherein the circuitry performs the index building by building the index information that contains the table which, as column elements, has, in addition to having the first column and the second column, a third column for storing degree of similarity of the peripheral vector with respect to the case example vector, and a composite index with respect to the first column and the third column in the table, and the circuitry performs the searching by using the composite index, obtaining such a record in the table which corresponds to the peripheral vector exactly matching with the query vector and having the degree of similarity satisfying a condition, and identifying the case example vector based on information stored in the second column of the obtained record.

4. The data management system according to claim 3, wherein, as data structure of the table, an associative array or a continuous memory arrangement array is used in which the peripheral vector stored in the first column and the degree of similarity stored in the third column are treated as keys and information stored in the second column are treated as a value.

5. The data management system according to claim 1, wherein the circuitry performs the index building by building, as the index information, an index meant for searching for information related to the case example vector corresponding to the peripheral vector according to a value of the peripheral vector exactly matching with the query vector, and the circuitry performs the searching by using the index and identifying the case example vector corresponding to the peripheral vector that exactly matches with the query vector.

6. The data management system according to claim 1, wherein the circuitry performs the index building by building, as the index information, a composite index meant for searching for information related to the case example vector corresponding to the peripheral vector according to a value of the peripheral vector exactly matching with the query vector and according to a degree of similarity of the peripheral vector with respect to the case example vector, and the circuitry performs the searching by using the composite index and identifying the case example vector corresponding to the peripheral vector that exactly matches with the query vector and that has the degree of similarity satisfying a condition.

7. The data management system according to claim 1, wherein, when the search request includes specification of output count, the circuitry performs the searching by performing, while varying a condition of degree of similarity of the peripheral vector with respect to the case example vector in a phased manner from strict side, an operation of identifying the case example vector corresponding to the peripheral vector exactly matching with the query vector in a repeated manner until total number of the identified case example vector becomes equal to or greater than the output count, stopping the operation when total number of the identified case example vector becomes equal to or greater than the output count, and outputting a search result equal to a count close to the output count based on the identified case example vectors.

8. The data management system according to claim 1, wherein the circuitry performs the index building by generating a contracted peripheral vector that is similar to a contracted case example vector formed by mapping the case example vector in a contracted vector space, and building the index information of enabling identification of the case example vector corresponding to the contracted peripheral vector that has been generated, and the circuitry performs the searching by identifying the case example vector corresponding to the contracted peripheral vector that exactly matches with a contracted query vector formed by mapping the query vector in a contracted vector space which is common with the index building.

9. The data management system according to claim 8, wherein locality-sensitive hashing (LSH) technology is used in mapping of the case example vector to the contracted case example vector and in mapping of the query vector to the contracted query vector.

10. The data management system according to claim 9, wherein, as the LSH technology, bitwise LSH is used in which the case example vector is converted into the contracted case example vector representing a binary vector with each dimension taking only binary value and in which the query vector is converted into the contracted query vector representing a binary vector with each dimension taking only binary value.

11. The data management system according to claim 10, wherein the circuitry performs the index building by generating the contracted peripheral vector representing a binary vector and building the index inf
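
The single-table layout with a composite index (claim 3) and the phased relaxation of the similarity condition (claim 7) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: it uses SQLite from the Python standard library, represents vectors as bit strings, and stores Hamming distance as a stand-in for "degree of similarity" (smaller distance means more similar). The table name `peripheral`, its column names, and the helper functions are hypothetical.

```python
import sqlite3
from itertools import combinations

def neighbours(bits, max_distance):
    """Yield (vector, distance) for all bit strings within max_distance flips of bits."""
    for d in range(max_distance + 1):
        for positions in combinations(range(len(bits)), d):
            flipped = list(bits)
            for p in positions:
                flipped[p] = "1" if flipped[p] == "0" else "0"
            yield "".join(flipped), d

def build(conn, cases, max_distance):
    """Create one table (peripheral vector, case ID, distance) plus a composite index."""
    conn.execute("CREATE TABLE peripheral (vec TEXT, case_id TEXT, distance INTEGER)")
    # Composite index over the peripheral vector and the similarity column.
    conn.execute("CREATE INDEX idx_vec_distance ON peripheral (vec, distance)")
    rows = [(vec, case_id, d)
            for case_id, bits in cases.items()
            for vec, d in neighbours(bits, max_distance)]
    conn.executemany("INSERT INTO peripheral VALUES (?, ?, ?)", rows)

def search(conn, query, output_count, max_distance):
    """Relax the distance limit step by step, starting from the strictest (0)."""
    found = []
    for limit in range(max_distance + 1):
        found = conn.execute(
            "SELECT case_id, distance FROM peripheral "
            "WHERE vec = ? AND distance <= ? ORDER BY distance, case_id",
            (query, limit),
        ).fetchall()
        if len(found) >= output_count:
            break  # enough results: stop relaxing the condition
    # Design choice of this sketch: truncate to exactly output_count results.
    return found[:output_count]

# Hypothetical stored data: case ID -> case example vector.
conn = sqlite3.connect(":memory:")
build(conn, {"a": "0000", "b": "0011", "c": "1111"}, max_distance=2)

print(search(conn, "0001", output_count=2, max_distance=2))  # [('a', 1), ('b', 1)]
print(search(conn, "0011", output_count=3, max_distance=2))  # [('b', 0), ('a', 2), ('c', 2)]
```

Each pass is still an exact-match lookup on the peripheral vector; only the bound on the similarity column changes between passes, which is the kind of query a composite index on both columns is meant to serve.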

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11281645B2 cover?
According to an embodiment, a data management system includes an index building unit and a searching unit. The index building unit generates a peripheral vector similar to a case example vector representing a feature vector of data to be stored, and builds index information of enabling identification of the case example vector corresponding to the generated peripheral vector. The searching unit…
Who is the assignee on this patent?
Toshiba Kk, Toshiba Digital Solutions Corp
What technology area does this patent fall under?
Primary CPC classification G06F16/2246. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 22, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).