Optimized record placement in graph database

US11354348B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11354348-B2
Application numberUS-201715636742-A
CountryUS
Kind codeB2
Filing dateJun 29, 2017
Priority dateJun 29, 2017
Publication dateJun 7, 2022
Grant dateJun 7, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems are disclosed for optimizing record placement in a graph by minimizing fragmentation when writing data. Issues with fragmented data within a graph database are addressed on the record level by placing data that is frequently accessed together contiguously within memory. For example, a dynamic rule set may be developed based on dynamically analyzing access patterns of the graph database, policies, system characteristics and/or other heuristics. Based on statistics regarding normal query patterns, the systems and methods may identify an optimal position for certain types of edges that are often traversed with respect to particular types of nodes.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method for storing data in a graph database, comprising: receiving a request to write data to the graph database, wherein the graph database comprises a plurality of nodes having different node types, a plurality of edges having different edge types, and a plurality of properties; parsing the request to identify at least one entity, wherein the at least one entity corresponds to a first node of a first node type, and wherein the at least one entity is related to at least a second node of a second node type or an edge of an edge type within the graph database; retrieving a dynamic ruleset for record placement within a memory space associated with the graph database, the dynamic ruleset being automatically updated based, at least in part, on a statistical analysis of traversals across the edge types with respect to the node types in the graph database in response to queries; based at least in part on the dynamic ruleset, determining (i) a number of memory records for storing the at least one entity as the first node in the graph database and (ii) a storage location in the memory space for the number of memory records, wherein the dynamic ruleset indicates that a memory record for the first node should be stored in the memory space near at least one of a memory record for the second node of the second node type or a memory record for the edge of the edge type based on a frequency the second node or the edge is accessed within a context of the first node; allocating the number of memory records to the storage location in a contiguous block of available memory records; and storing the at least one entity and data associated with one of the second node or the edge in the allocated contiguous block of available memory records. 2. The computer-implemented method of claim 1 , wherein the dynamic ruleset is derived based on one or more of: statistical access patterns associated with the graph database, one or more policies, system configurations, system characteristics, or heuristics. 3. The computer-implemented method of claim 2 , wherein the access patterns are evaluated during query execution. 4. The computer-implemented method of claim 2 , wherein the access patterns statistically characterize the traversal of edges from one node to another within the graph database for a plurality of queries. 5. The computer-implemented method of claim 4 , wherein statistically characterizing the traversal of edges comprises one or more of: identifying one or more edge types corresponding to the edges traversed; identifying one or more node types traversed from; or identifying one or more node types traversed to. 6. The computer-implemented method of claim 5 , further comprising: logging a number of each of the one or more node types traversed from; and logging a number of each of the one or more node types traversed to. 7. The computer-implemented method of claim 1 , wherein the dynamic ruleset for record placement comprise one or more of: arranging records corresponding to a property contiguously; placing property data that is more frequently used within a context of the at least one entity in closer proximity to the at least one entity than to at least one other entity; allocating contiguous available records for storing edges with corresponding property records of a node; sorting records storing edges by edge type; or placing an edge record adjacent to a node that is most likely to be the node traversed from. 8. The computer-implemented method of claim 1 , wherein a first rule of the dynamic ruleset is used to determine the number of memory records for storing the at least one entity and a second rule of the dynamic ruleset is used to determine the storage location in the memory space for the number of memory records. 9. The computer-implemented method of claim 8 , wherein the location is adjacent to a record storing at least one of the related node and the related edge. 10. The computer-implemented method of claim 8 , wherein the second rule and the first rule are different. 11. The computer-implemented method of claim 8 , wherein the second rule and the first rule are the same. 12. The computer-implemented method of claim 8 , further comprising: allocating the contiguous block of available records at the location in memory. 13. The computer-implemented method of claim 8 , further comprising: determining that the number of records is greater than a number of available records at the location in memory; and marking the at least one entity as dirty. 14. A computing device, comprising: at least one processing unit; and at least one memory storing computer executable instructions for storing data to a graph database, the instructions when executed by the at least one processing unit causing the computing device to: receive a request to write data to the graph database, wherein the graph database comprises a plurality of nodes having different node types and a plurality of edges; parse the request to identify at least one entity, wherein the at least one entity corresponds to a first node of a first node type, and wherein the at least one entity is related to at least a second node of a second node type or an edge of an edge type within the graph database; retrieve a dynamic ruleset for record placement within a memory space associated with the graph database, the dynamic ruleset being automatically updated based, at least in part, on a statistical analysis of traversals across the edge types with respect to the node types in the graph database in response to queries; based at least in part on the dynamic ruleset, determine (i) a number of memory records for storing the at least one entity as the first node in the graph database and (ii) a storage location in the memory space for the number of memory records, wherein the dynamic ruleset indicates that a memory record for the first node should be stored in the memory space near at least one of a memory record for the second node of the second node type or a memory record for the edge of the edge type based on a frequency the second node or the edge is accessed within a context of the first node; determine a location in memory for storing the at least one entity; allocate the number of memory records to the storage location in a contiguous block of available memory records at the location in memory; and store the at least one entity and data associated with one of the second node or the edge in the allocated contiguous block of available memory records. 15. The computing system of claim 14 , wherein the location is in proximity to a record storing at least one of the node and the edge. 16. The computing system of claim 14 , wherein the dynamic ruleset is derived based on one or more of: statistical access patterns associated with the graph database, one or more policies, system configurations, system characteristics or heuristics. 17. The computing system of claim 16 , wherein the access patterns statistically characterize the traversal of edges from one node to another within the graph database for a plurality of queries. 18. The computing system of claim 14 , the computer executable instructions further causing the computing device to: determine that the number of records is greater than a number of available records at the location in memory; allocate the contiguous block of available records at a different location in memory; and mark the at least one entity as dirty. 19. A computer storage medium storing computer execut

Assignees

Inventors

Classifications

  • Details of de-fragmentation performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title

  • Interfaces specially adapted for storage systems · CPC title

  • Improving I/O performance · CPC title

  • Management of blocks · CPC title

  • Details of free space management performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11354348B2 cover?
Methods and systems are disclosed for optimizing record placement in a graph by minimizing fragmentation when writing data. Issues with fragmented data within a graph database are addressed on the record level by placing data that is frequently accessed together contiguously within memory. For example, a dynamic rule set may be developed based on dynamically analyzing access patterns of the gra…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/532. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jun 07 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).