Optimized record placement in graph database

US12038967B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-12038967-B2
Application numberUS-202217662875-A
CountryUS
Kind codeB2
Filing dateMay 11, 2022
Priority dateJun 29, 2017
Publication dateJul 16, 2024
Grant dateJul 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems are disclosed for optimizing record placement in a graph by minimizing fragmentation when writing data. Issues with fragmented data within a graph database are addressed on the record level by placing data that is frequently accessed together contiguously within memory. For example, a dynamic rule set may be developed based on dynamically analyzing access patterns of the graph database, policies, system characteristics and/or other heuristics. Based on statistics regarding normal query patterns, the systems and methods may identify an optimal position for certain types of edges that are often traversed with respect to particular types of nodes.

First claim

Opening claim text (preview).

What is claimed is: 1. A system comprising: a processing unit; and memory coupled to the processing unit, the memory comprising computer executable instructions that, when executed by the processing unit, perform operations comprising: identifying a dirty entity in a graph database; determining occurrence of a condition triggering defragmentation of the graph database, the condition including at least one of: determining a number of dirty entities in the graph database meets a threshold number; determining a deletion of data from at least one record in the graph database; or determining expiration of a periodic time period; in response to the occurrence of the condition, identifying a ruleset for record placement within a memory space associated with the graph database; based on the ruleset, determining a number of memory records for storing the dirty entity as a first node in the graph database and a storage location in the memory space for the number of memory records, wherein the ruleset indicates that a memory record for the first node is to be stored in the memory space near at least one of: a memory record for a second node that is related to the first node based on an access frequency of the second node; or a memory record for an edge that is related to the first node based on an access frequency of the edge; allocating the number of memory records to the storage location in a contiguous block of memory records; and storing the dirty entity in the number of memory records allocated in the contiguous block of memory records. 2. The system of claim 1 , wherein the dirty entity is an entity for which a property of the entity has been altered or added. 3. The system of claim 2 , wherein identifying the dirty entity comprises determining data associated with the property is not placed adjacent in the memory space to other data of the dirty entity. 4. The system of claim 3 , wherein, in response to determining the data associated with the property is not placed adjacent in the memory space to other data of the dirty entity, the dirty entity is placed on a defragmentation queue. 5. The system of claim 1 , wherein the dirty entity is an entity for which an edge has been added to the first node. 6. The system of claim 5 , wherein identifying the dirty entity comprises determining the edge is not collocated in the memory space with the first node. 7. The system of claim 1 , wherein ruleset is derived based on access patterns associated with the graph database. 8. The system of claim 7 , wherein the access patterns statistically characterize edges traversed between nodes within the graph database for queries of the graph database. 9. The system of claim 8 , wherein statistically characterizing the edges traversed comprises at least one of: identifying one or more edge types corresponding to the edges traversed; or identifying one or more node types traversed to or from via the edges traversed. 10. The system of claim 1 , wherein the ruleset comprises rules for defragmenting memory records stored in the graph database based on at least one of node type or edge type. 11. The system of claim 1 , wherein storing the dirty entity in the number of memory records comprises: relocating the dirty entity from a first memory record that is not within the contiguous block of memory records to the contiguous block of memory records. 12. The system of claim 11 , wherein storing the dirty entity in the number of memory records further comprises: relocating the memory record for the second node or the memory record for the edge to the contiguous block of memory records. 13. The system of claim 11 , wherein relocating the dirty entity from the first memory record to the contiguous block of memory records defragments the graph database. 14. A method comprising: identifying a dirty entity in a graph database; determining occurrence of a condition triggering defragmentation of the graph database, the condition including at least one of: determining a defragmentation queue length has been met; determining at least one new record has been added to the graph database; or determining at least one existing record in the graph database has been modified; in response to the occurrence of the condition, identifying a ruleset for record placement within a memory space associated with the graph database; based on the ruleset, determining a number of memory records for storing the dirty entity as a first node in the graph database and a storage location in the memory space for the number of memory records, wherein the ruleset indicates that a memory record for the first node is to be stored in the memory space near a memory record for a second node that is related to the first node or a memory record for an edge that is related to the first node in accordance with at least one of: arranging records corresponding to a property contiguously; placing property data frequently used within a context of the dirty entity in closer proximity to the dirty entity than to a different entity; or sorting records storing edges by edge type; allocating the number of memory records to the storage location in a contiguous block of memory records; and storing the dirty entity in the number of memory records allocated in the contiguous block of memory records. 15. The method of claim 14 , further comprising: receiving a request to defragment the graph database; in response to the request, identifying the dirty entity in the graph database, wherein identifying the dirty entity in the graph database comprises: marking the dirty entity as dirty; and adding the dirty entity to a defragmentation queue. 16. The method of claim 15 , wherein identifying the dirty entity in the graph database further comprises: retrieving a property of the dirty entity, wherein the property has been altered or added to an entity to create the dirty entity. 17. The method of claim 15 , wherein items of the defragmentation queue are retrieved and defragmented to defragment the graph database based on a triggering condition. 18. The method of claim 14 , wherein the contiguous block of memory records is adjacent to the memory record for the second node or the memory record for the edge. 19. The method of claim 14 , wherein storing the dirty entity in the number of memory records causes relocation of the second node or the memory record for the edge to a memory record adjacent to the contiguous block of memory records. 20. A device comprising: a processing unit; and memory coupled to the processing unit, the memory comprising computer executable instructions that, when executed by the processing unit, perform operations comprising: identifying an entity in a graph database as a dirty entity based on a property of the entity being added or altered; determining occurrence of a condition triggering defragmentation of the graph database, the condition including at least one of: determining a number of dirty entities in the graph database meets a threshold number; determining a deletion of data from at least one record in the graph database; or determining expiration of a periodic time period; in response to the occurrence of the condition and based on a ruleset for record placement within a memory space associated with the graph database, determining a number of memory records for storing the dirty entity as a first node in the graph database and a storage location in the memory space for the number of memory records; allocating the numb

Assignees

Inventors

Classifications

  • Details of free space management performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title

  • Saving storage space on storage systems · CPC title

  • Management of blocks · CPC title

  • Details of de-fragmentation performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title

  • Defragmentation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12038967B2 cover?
Methods and systems are disclosed for optimizing record placement in a graph by minimizing fragmentation when writing data. Issues with fragmented data within a graph database are addressed on the record level by placing data that is frequently accessed together contiguously within memory. For example, a dynamic rule set may be developed based on dynamically analyzing access patterns of the gra…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/2455. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jul 16 2024 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).