Generating information models in an in-memory database system
US-9519701-B2 · Dec 13, 2016 · US
US9378263B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9378263-B2 |
| Application number | US-201313860220-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 10, 2013 |
| Priority date | Jun 19, 2012 |
| Publication date | Jun 28, 2016 |
| Grant date | Jun 28, 2016 |
Official abstract text for this publication.
Systems and methods are provided for creating indices and loading key-value pairs for NoSQL databases. Attributes are created that correspond to records in a NoSQL database based on corresponding record fields. An index is created based on the attributes. A memory is loaded with attributes that correspond to a subset of the index as keys in a key-value pair and identifiers that correspond to records that correspond to the attributes as values in the key-value pair. The attributes that correspond to the subset of the index are sorted in the memory. Any duplicate attributes are identified from the sorted attributes in the memory. Any identifiers that correspond to any duplicate attributes also identify records in the NoSQL database to be evaluated as potential duplicate records.
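The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the sample records, the field names, the `|` separator, and the lowercase normalization are all assumptions made for illustration, and a plain dictionary stands in for the NoSQL database. It builds one attribute per record from several fields, loads attribute/identifier pairs into memory, sorts them, and groups equal neighbouring attributes as duplicate candidates.

```python
from itertools import groupby

# Hypothetical sample records keyed by record identifier (stand-in for a NoSQL store).
RECORDS = {
    101: {"first": "Ann", "last": "Lee", "company": "Acme"},
    102: {"first": "Bob", "last": "Ray", "company": "Initech"},
    103: {"first": "ann", "last": "LEE", "company": "acme"},
}


def make_attribute(record, fields=("first", "last", "company")):
    """Combine several record fields into one attribute (normalization is assumed)."""
    return "|".join(str(record[f]).strip().lower() for f in fields)


def build_index(records):
    """Build the index as a list of (attribute, record identifier) pairs."""
    return [(make_attribute(rec), rec_id) for rec_id, rec in records.items()]


def find_duplicate_candidates(index_subset):
    """Sort the in-memory pairs by attribute and collect attributes seen more than once."""
    pairs = sorted(index_subset)
    candidates = {}
    for attribute, group in groupby(pairs, key=lambda pair: pair[0]):
        ids = [rec_id for _, rec_id in group]
        if len(ids) > 1:
            candidates[attribute] = ids
    return candidates


index = build_index(RECORDS)
duplicates = find_duplicate_candidates(index)
print(duplicates)  # {'ann|lee|acme': [101, 103]}
```

The identifiers returned for a repeated attribute only mark records to be evaluated as potential duplicates; deciding whether they really are duplicates is a separate step.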
Opening claim text (preview).
The invention claimed is:

1. An apparatus for creating indices and loading key-value pairs for NoSQL databases, the apparatus comprising: a processor; and one or more stored sequences of instructions which, when executed by the processor, cause the processor to carry out the steps of: creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields; creating an index based on the plurality of attributes; loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair; sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.

2. The apparatus of claim 1, wherein the plurality of attributes that correspond to the plurality of records in a NoSQL database is based on an alphanumeric combination of the corresponding plurality of record fields and the index is based on an alphanumeric subset of the alphanumeric combination.

3. The apparatus of claim 1, wherein the steps further comprise: determining whether to delete a record that is associated with a duplicate attribute; and deleting the record from the memory in response to a determination to delete the record associated with the duplicate attribute.

4. The apparatus of claim 1, wherein the steps further comprise: determining whether to merge a plurality of records that are associated with a plurality of duplicate attributes; and merging the plurality of records in the memory in response to a determination to merge the plurality of records associated with the plurality of duplicate attributes.

5. The apparatus of claim 1, wherein the steps further comprise: loading, in the memory, a plurality of attributes that correspond to the index as the keys in the key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as the values in the key-value pair; receiving a request for at least one record that corresponds to an attribute of the plurality of attributes; and loading, in the memory, the at least one record based on an identifier that corresponds to the attribute of the plurality of attributes.

6. A non-transitory machine-readable medium carrying one or more sequences of instructions for creating indices and loading key-value pairs for NoSQL databases, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of: creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields; creating an index based on the plurality of attributes; loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair; sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.

7. The non-transitory machine-readable medium of claim 6, wherein the plurality of attributes that correspond to the plurality of records in a NoSQL database is based on an alphanumeric combination of the corresponding plurality of record fields and the index is based on an alphanumeric subset of the alphanumeric combination.

8. The non-transitory machine-readable medium of claim 6, wherein the steps further comprise: determining whether to delete a record that is associated with a duplicate attribute; and deleting the record from the memory in response to a determination to delete the record associated with the duplicate attribute.

9. The non-transitory machine-readable medium of claim 6, wherein the steps further comprise: determining whether to merge a plurality of records that are associated with a plurality of duplicate attributes; and merging the plurality of records in the memory in response to a determination to merge the plurality of records associated with the plurality of duplicate attributes.

10. The non-transitory machine-readable medium of claim 6, wherein the steps further comprise: loading, in the memory, a plurality of attributes that correspond to the index as the keys in the key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as the values in the key-value pair; receiving a request for at least one record that corresponds to an attribute of the plurality of attributes; and loading, in the memory, the at least one record based on an identifier that corresponds to the attribute of the plurality of attributes.

11. A method for creating indices and loading key-value pairs for NoSQL databases, the method comprising: creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields; creating an index based on the plurality of attributes; loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair; sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.

12. The method of claim 11, wherein the plurality of attributes that correspond to the plurality of records in a NoSQL database is based on an alphanumeric combination of the corresponding plurality of record fields and the index is based on an alphanumeric subset of the alphanumeric combination.

13. The method of claim 11, wherein the method further comprises: determining whether to delete a record that is associated with a duplicate attribute; and deleting the record from the memory in response to a determination to delete the record associated with the duplicate attribute.

14. The method of claim 11, wherein the method further comprises: determining whether to merge a plurality of records that are associated with a plurality of duplicate attributes; and merging the plurality of records in the memory in response to a determination to merge the plurality of records associated with the plurality of duplicate attributes.

15. The method of claim 11, wherein the method further comprises: loading
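
The dependent claims add an index built from an alphanumeric subset of the combined fields, merging of records that share a duplicate attribute, and loading records on request through the attribute-to-identifier mapping. The sketch below shows one possible reading of those steps in Python. Everything named here is hypothetical: the fixed-length prefix used as the "alphanumeric subset", the first-non-empty-value merge policy, the sample records, and the in-memory dictionaries are illustrative assumptions, not details taken from the patent.

```python
import re


def alphanumeric_attribute(record, fields):
    """Concatenate the chosen fields and keep only lowercase letters and digits."""
    combined = "".join(str(record[f]) for f in fields)
    return re.sub(r"[^0-9a-z]", "", combined.lower())


def index_key(attribute, length=8):
    """Assumed 'alphanumeric subset': a fixed-length prefix of the attribute."""
    return attribute[:length]


def merge_records(records):
    """Merge records; the first non-empty value per field wins (assumed policy)."""
    merged = {}
    for rec in records:
        for field, value in rec.items():
            if value and not merged.get(field):
                merged[field] = value
    return merged


def lookup(key_to_ids, store, attribute_key):
    """Resolve an attribute key to identifiers, then load the matching records."""
    return [store[rec_id] for rec_id in key_to_ids.get(attribute_key, [])]


# Hypothetical in-memory store standing in for the NoSQL database.
store = {
    1: {"name": "Ann Lee", "phone": "", "city": "Oslo"},
    2: {"name": "Ann Lee", "phone": "555-0100", "city": ""},
}

# Key-value pairs: index key -> list of record identifiers.
key_to_ids = {}
for rec_id, rec in store.items():
    key = index_key(alphanumeric_attribute(rec, ("name",)))
    key_to_ids.setdefault(key, []).append(rec_id)

matches = lookup(key_to_ids, store, "annlee")
merged = merge_records(matches)
print(merged)
```

Keeping only identifiers as values means the full records are loaded only when a request names an attribute, which is the behaviour the lookup claims describe.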
Physics · mapped topic
Indexing structures · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title