Atomic Object Reads for In-Memory Rack-Scale Computing

US2018173673A1 · US · A1

Patent metadata
Publication number: US-2018173673-A1
Application number: US-201715838514-A
Country: US
Kind code: A1
Filing date: Dec 12, 2017
Priority date: Dec 15, 2016
Publication date: Jun 21, 2018
Grant date: (none listed)

Abstract

Official abstract text for this publication.

A distributed memory system including a plurality of chips, a plurality of nodes that are distributed across the plurality of chips such that each node is comprised within a chip, each node includes a dedicated local memory and a processor core, and each local memory is configured to be accessible over network communication, a network interface for each node, the network interface configured such that a corresponding network interface of each node is integrated in a coherence domain of the chip of the corresponding node, wherein each of the network interfaces are configured to support a one-sided operation, the network interface directly reading or writing in the dedicated local memory of the corresponding node without involving a processor core, and wherein the one-sided operation is configured such that the processor core of a corresponding node uses a protocol to directly inject a remote memory access for read or write request to the network interface of the node, the remote memory access request allowing to read or write an arbitrarily long region of a memory of a remote node,
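
The one-sided operation described in the abstract can be pictured with a toy model. This is only a sketch under stated assumptions: the `Request` and `NetworkInterface` names, the byte-offset addressing, and the `"ack"` reply are hypothetical and do not come from the patent. The point it illustrates is that the servicing side touches local memory from within the network interface alone, with no processor-core code path involved.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical remote memory access request (not the patent's format)."""
    op: str               # "read" or "write"
    addr: int             # byte offset into the target node's local memory
    length: int           # region length in bytes; may be arbitrarily long
    payload: bytes = b""  # data to store, used only for writes


class NetworkInterface:
    """Toy network interface that services requests on its node's memory."""

    def __init__(self, local_memory: bytearray):
        self.memory = local_memory

    def service(self, req: Request):
        # Everything here runs "inside" the network interface: the model has
        # no processor-core object at all on the servicing node.
        end = req.addr + req.length
        if req.addr < 0 or end > len(self.memory):
            raise ValueError("request falls outside local memory")
        if req.op == "read":
            return bytes(self.memory[req.addr:end])  # reply with the data
        if req.op == "write":
            if len(req.payload) != req.length:
                raise ValueError("payload length does not match request")
            self.memory[req.addr:end] = req.payload
            return "ack"                             # write acknowledgement
        raise ValueError("unknown operation: " + req.op)


# Usage: a requester injects a write, then reads the same region back.
target = NetworkInterface(bytearray(256))
target.service(Request("write", 16, 4, b"abcd"))
data = target.service(Request("read", 16, 4))
```

A real network interface would additionally split a long region into cache-block-sized accesses and carry replies over the network; both are omitted here.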

First claim

Opening claim text (preview).

1. A distributed memory system comprising:
    a plurality of chips;
    a plurality of nodes that are distributed across the plurality of chips such that each node is comprised within a chip, each node includes a dedicated local memory and a processor core, and each local memory is configured to be accessible over network communication;
    a network interface for each node, the network interface configured such that a corresponding network interface of each node is integrated in a coherence domain of the chip of the corresponding node;
    wherein each of the network interfaces are configured to support a one-sided operation, the network interface directly reading or writing in the dedicated local memory of the corresponding node without involving a processor core,
    wherein the one-sided operation is configured such that the processor core of a corresponding node uses a protocol to directly inject a remote memory access for read or write request to the network interface of the node, the remote memory access request allowing to read or write an arbitrarily long region of a memory of a remote node,
    wherein a network interface of a requesting node further includes a parser configured to parse the request and send the request to a network interface of a target remote servicing node,
    wherein the network interface of the target remote servicing node is configured to directly operate on an associated local memory according to the received request, without an involvement of a processor core of the target remote servicing node, and to reply to the requesting network interface with the requested data, if the request was a read, or a write acknowledgement, if the request was a write, and
    wherein a plurality of regions of the dedicated local memory of each node are organized by software as a set of data objects, each of the data objects having a standardized layout, including a header having a lock or a version, followed by a data of a data object, the network interface relying on a standardized data object memory layout and the integration of the network interface in a local coherence domain of the network interface, to identify a potential atomicity violation by snooping on coherence messages, thus enabling the network interface to perform one-sided atomic data object read operations.

2. A network interface in a distributed memory system according to claim 1, comprising:
    a plurality of object buffers, wherein each object buffer includes an object address field, followed by a plurality of object buffer entries, and
    wherein each object buffer entry is two bits intended to encode any one of four possible states from the following list: unused, used, pending, done, a default state being unused.

3. A method for implementing a lightweight mechanism configured to extend a network interface to provide atomic reads of arbitrarily long data objects, each object comprising a header followed by data of the object stored in object buffers, the header including a lock or a version, the method comprising:
    integrating the network interface in a local coherence domain of the network interface;
    snooping on the network interface on coherence messages, to identify potential atomicity violations while a data object is being read;
    assigning by the network interface one of the available object buffers to the remote object read request, whenever the network interface receives a remote object read request from the network the network interface stores a base address of the object in an object address field of the object buffer, and the first N entries of the object buffer are marked as used, where N is the total number of cache blocks that the object requested by the remote object read request is comprised of, wherein an object buffer entry (i) corresponds to the cache block (i) of the requested object;
    speculatively sending by the network interface read requests for the cache blocks of the object and assessing a state of the lock or a state of the version, marking corresponding object buffer entries of the object as pending, consequently marking them as done as the data replies from memory arrive and sending the data replies back to the original requester through the network, wherein all data cache block reads completed by the network interface are speculative until the header of the object is retrieved and assessed, wherein when the state of the lock or the state of the version indicates a free object, a cache block read that has been speculatively completed prior to assessing the header of the object qualifies as valid; else, the speculative cache blocks reads fail, and the network interface performs a failure sequence, which involves either one of reading the object again, or sending a failure notification to a network interface that originally sent the remote object read request;
    in case the network interface receives a coherence invalidation message for an address that belongs in an address range of the data object, the network interface
        checks whether the invalidation matches a first entry of the object buffer, which corresponds to a header of the data object, and if this is the case, the object read request fails and the network interface performs a failure sequence for that object read request;
        checks whether the invalidation matches an entry of the object buffer that is not unused and is not the first, and if that object buffer's first entry is in the done state, the invalidation is ignored; otherwise, the reception of an invalidation for an entry of the object buffer that is in the done state results in the network interface performing a failure sequence for the object read request corresponding to that object buffer;
        in any other case, the reception of an invalidation is ignored;
    wherein a one-sided atomic data object read request successfully completes when all the cache blocks comprising the requested object have been read from the memory, and
    wherein the network interface frees the object buffer used for the request by resetting all of the entries of the object buffer to unused.

4. The method of claim 3, further comprising checking a number of entries in an object buffer,
    if the object buffer features fewer entries than a total of cache lines of the requested data object, binding the number of speculative cache block reads that the network interface can issue for that data object, by the available entries;
    if the access to the header of the data object has not completed, the network interface stalls the processing of the atomic read request of the data object until the first access completes, thereby only having a negative impact on performance, but not introducing a functionality limitation, such that the maximum size of the data object that the network interface can read atomically is not limited by the number of entries of the object buffers of the network interface.
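
The object buffer of claim 2 and the invalidation handling of claim 3 can be sketched as a small state machine. This is one reading of the claim language, not the patented hardware design: the `ObjectBuffer` class, its method names, the string results `"valid"`, `"fail"` and `"ignore"`, and the 64-byte cache block size are all assumptions made for illustration. The entry-limited case of claim 4 is not modeled, so the sketch rejects objects larger than the buffer.

```python
from enum import IntEnum

CACHE_BLOCK = 64  # bytes; assumed block size, not stated in the claims


class State(IntEnum):
    """The four two-bit entry states of claim 2; UNUSED is the default."""
    UNUSED = 0
    USED = 1
    PENDING = 2
    DONE = 3


class ObjectBuffer:
    """Hypothetical model of one object buffer (entry 0 tracks the header)."""

    def __init__(self, num_entries):
        self.base = None                       # object address field
        self.entries = [State.UNUSED] * num_entries

    def assign(self, base, num_blocks):
        """Bind the buffer to an object and mark its first N entries used."""
        if num_blocks > len(self.entries):
            raise ValueError("claim 4 windowing is not modeled here")
        self.base = base
        for i in range(num_blocks):
            self.entries[i] = State.USED

    def issue(self, i):
        """A speculative read for cache block i was sent to memory."""
        self.entries[i] = State.PENDING

    def complete(self, i):
        """The data reply for cache block i arrived from memory."""
        self.entries[i] = State.DONE

    def on_header(self, header_free):
        """Assess the retrieved header: a free object validates the reads."""
        self.complete(0)
        return "valid" if header_free else "fail"

    def on_invalidation(self, addr):
        """Apply the claim 3 rules to a snooped coherence invalidation."""
        if self.base is None or addr < self.base:
            return "ignore"
        i = (addr - self.base) // CACHE_BLOCK
        if i >= len(self.entries) or self.entries[i] == State.UNUSED:
            return "ignore"                    # outside the tracked object
        if i == 0:
            return "fail"                      # header block invalidated
        if self.entries[0] == State.DONE:
            return "ignore"                    # header already read
        if self.entries[i] == State.DONE:
            return "fail"                      # completed speculative read hit
        return "ignore"

    def finished(self):
        """True once every tracked cache block has been read from memory."""
        used = [e for e in self.entries if e != State.UNUSED]
        return bool(used) and all(e == State.DONE for e in used)

    def free(self):
        """Release the buffer by resetting every entry to UNUSED."""
        self.base = None
        self.entries = [State.UNUSED] * len(self.entries)
```

In this model a `"fail"` result stands in for the failure sequence of claim 3, which either reads the object again or notifies the network interface that sent the request.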

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018173673A1 cover?
A distributed memory system in which each node's network interface is integrated in the coherence domain of its chip and services one-sided remote reads and writes directly on local memory, without involving a processor core. By relying on a standardized data object layout (a header with a lock or a version, followed by the data) and by snooping on coherence messages, the network interface can detect potential atomicity violations and perform one-sided atomic data object reads.
Who is the assignee on this patent?
Ecole Polytechnique Fédérale de Lausanne (EPFL)
What technology area does this patent fall under?
Primary CPC classification G06F15/17331. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 21, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).