Method and system for creating indices and loading key-value pairs for NoSQL databases

US9378263B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9378263-B2
Application number  US-201313860220-A
Country             US
Kind code           B2
Filing date         Apr 10, 2013
Priority date       Jun 19, 2012
Publication date    Jun 28, 2016
Grant date          Jun 28, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided for creating indices and loading key-value pairs for NoSQL databases. Attributes are created that correspond to records in a NoSQL database based on corresponding record fields. An index is created based on the attributes. A memory is loaded with attributes that correspond to a subset of the index as keys in a key-value pair and identifiers that correspond to records that correspond to the attributes as values in the key-value pair. The attributes that correspond to the subset of the index are sorted in the memory. Any duplicate attributes are identified from the sorted attributes in the memory. Any identifiers that correspond to any duplicate attributes also identify records in the NoSQL database to be evaluated as potential duplicate records.
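
The flow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout (`name`, `company`, `email`), the helper names `make_attribute`, `build_index`, and `find_duplicate_candidates`, the lowercase normalization, and the use of the attribute's first character as the index subset are all hypothetical choices made for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records keyed by record identifier; field names are assumptions.
RECORDS = {
    "r1": {"name": "Ann Lee", "company": "Acme", "email": "ann@acme.com"},
    "r2": {"name": "ann lee", "company": "ACME", "email": "ann@acme.com"},
    "r3": {"name": "Bob Ray", "company": "Birch", "email": "bob@birch.io"},
    "r4": {"name": "Al Poe", "company": "Apex", "email": "al@apex.org"},
}


def make_attribute(record, fields=("name", "company", "email")):
    """Combine several record fields into one normalized alphanumeric string."""
    joined = "".join(str(record[f]) for f in fields)
    return "".join(ch for ch in joined.lower() if ch.isalnum())


def build_index(records):
    """Map an index key (assumed: first character of the attribute)
    to the (attribute, record identifier) pairs that fall under it."""
    index = {}
    for record_id, record in records.items():
        attribute = make_attribute(record)
        index.setdefault(attribute[:1], []).append((attribute, record_id))
    return index


def find_duplicate_candidates(index, index_key):
    """Load one subset of the index into memory as key-value pairs,
    sort by attribute, and group identifiers that share an attribute."""
    pairs = sorted(index.get(index_key, []), key=itemgetter(0))
    candidates = {}
    for attribute, group in groupby(pairs, key=itemgetter(0)):
        ids = [record_id for _, record_id in group]
        if len(ids) > 1:  # equal attributes flag records for evaluation
            candidates[attribute] = ids
    return candidates


index = build_index(RECORDS)
duplicates = find_duplicate_candidates(index, "a")
```

Here `duplicates` maps each repeated attribute to the identifiers of the records that should be evaluated as potential duplicates. Working on one index subset at a time keeps the sorted key-value pairs small enough to hold in memory, which appears to be the motivation for loading only a subset.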

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus for creating indices and loading key-value pairs for NoSQL databases, the apparatus comprising: a processor; and one or more stored sequences of instructions which, when executed by the processor, cause the processor to carry out the steps of: creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields; creating an index based on the plurality of attributes; loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair; sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.

2. The apparatus of claim 1, wherein the plurality of attributes that correspond to the plurality of records in a NoSQL database is based on an alphanumeric combination of the corresponding plurality of record fields and the index is based on an alphanumeric subset of the alphanumeric combination.

3. The apparatus of claim 1, wherein the steps further comprise: determining whether to delete a record that is associated with a duplicate attribute; and deleting the record from the memory in response to a determination to delete the record associated with the duplicate attribute.

4. The apparatus of claim 1, wherein the steps further comprise: determining whether to merge a plurality of records that are associated with a plurality of duplicate attributes; and merging the plurality of records in the memory in response to a determination to merge the plurality of records associated with the plurality of duplicate attributes.

5. The apparatus of claim 1, wherein the steps further comprise: loading, in the memory, a plurality of attributes that correspond to the index as the keys in the key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as the values in the key-value pair; receiving a request for at least one record that corresponds to an attribute of the plurality of attributes; and loading, in the memory, the at least one record based on an identifier that corresponds to the attribute of the plurality of attributes.

6. A non-transitory machine-readable medium carrying one or more sequences of instructions for creating indices and loading key-value pairs for NoSQL databases, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of: creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields; creating an index based on the plurality of attributes; loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair; sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.

7. The non-transitory machine-readable medium of claim 6, wherein the plurality of attributes that correspond to the plurality of records in a NoSQL database is based on an alphanumeric combination of the corresponding plurality of record fields and the index is based on an alphanumeric subset of the alphanumeric combination.

8. The non-transitory machine-readable medium of claim 6, wherein the steps further comprise: determining whether to delete a record that is associated with a duplicate attribute; and deleting the record from the memory in response to a determination to delete the record associated with the duplicate attribute.

9. The non-transitory machine-readable medium of claim 6, wherein the steps further comprise: determining whether to merge a plurality of records that are associated with a plurality of duplicate attributes; and merging the plurality of records in the memory in response to a determination to merge the plurality of records associated with the plurality of duplicate attributes.

10. The non-transitory machine-readable medium of claim 6, wherein the steps further comprise: loading, in the memory, a plurality of attributes that correspond to the index as the keys in the key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as the values in the key-value pair; receiving a request for at least one record that corresponds to an attribute of the plurality of attributes; and loading, in the memory, the at least one record based on an identifier that corresponds to the attribute of the plurality of attributes.

11. A method for creating indices and loading key-value pairs for NoSQL databases, the method comprising: creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields; creating an index based on the plurality of attributes; loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair; sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.

12. The method of claim 11, wherein the plurality of attributes that correspond to the plurality of records in a NoSQL database is based on an alphanumeric combination of the corresponding plurality of record fields and the index is based on an alphanumeric subset of the alphanumeric combination.

13. The method of claim 11, wherein the method further comprises: determining whether to delete a record that is associated with a duplicate attribute; and deleting the record from the memory in response to a determination to delete the record associated with the duplicate attribute.

14. The method of claim 11, wherein the method further comprises: determining whether to merge a plurality of records that are associated with a plurality of duplicate attributes; and merging the plurality of records in the memory in response to a determination to merge the plurality of records associated with the plurality of duplicate attributes.

15. The method of claim 11, wherein the method further comprises: loading
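
The dependent claims add follow-on steps: basing the index on an alphanumeric subset of the attribute (claims 2, 7, 12), deleting or merging records flagged by duplicate attributes (claims 3, 4, 8, 9, 13, 14), and serving a record request through the attribute-to-identifier pairs (claims 5, 10). A rough Python sketch of those steps follows. Everything named here is an assumption made for illustration: the functions `index_key`, `merge_records`, and `lookup`, the prefix length of 3, the sample data, and the merge rule (first non-empty value per field wins). The claims do not state how the decision to merge or delete is made, so the sketch simply merges whatever it is given.

```python
def index_key(attribute, prefix_len=3):
    """Index on an alphanumeric subset of the attribute (assumed: a short prefix)."""
    return attribute[:prefix_len]


def merge_records(records, ids):
    """Merge records in memory and delete the ones merged away.
    Assumed rule: the first non-empty value seen for a field is kept."""
    merged = {}
    for record_id in ids:
        for field, value in records[record_id].items():
            if not merged.get(field):
                merged[field] = value
    survivor, *removed = ids
    records[survivor] = merged
    for record_id in removed:
        del records[record_id]
    return survivor


def lookup(key_values, records, attribute):
    """Resolve an attribute to its identifiers, then load the matching records."""
    return [records[i] for i in key_values.get(attribute, []) if i in records]


# Hypothetical in-memory state: records plus attribute -> identifiers pairs.
records = {
    "r1": {"name": "Ann Lee", "phone": ""},
    "r2": {"name": "Ann Lee", "phone": "555-0100"},
    "r3": {"name": "Bob Ray", "phone": "555-0199"},
}
key_values = {"annlee": ["r1", "r2"], "bobray": ["r3"]}

# Merge the records flagged by the duplicate attribute, then update the pair.
survivor = merge_records(records, key_values["annlee"])
key_values["annlee"] = [survivor]
```

After the merge, `lookup(key_values, records, "annlee")` returns the single surviving record, so a request by attribute only loads the records whose identifiers are stored under that key.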

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • Indexing structures · CPC title

  • G06F16/215 · Primary

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9378263B2 cover?
Systems and methods are provided for creating indices and loading key-value pairs for NoSQL databases. Attributes are created that correspond to records in a NoSQL database based on corresponding record fields. An index is created based on the attributes. A memory is loaded with attributes that correspond to a subset of the index as keys in a key-value pair and identifiers that correspond to re…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30587. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 28, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).