Method to share a coherent accelerator context inside the kernel
US-2017109290-A1 · Apr 20, 2017 · US
US10157144B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10157144-B2 |
| Application number | US-201815861219-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 3, 2018 |
| Priority date | Oct 16, 2015 |
| Publication date | Dec 18, 2018 |
| Grant date | Dec 18, 2018 |
Official abstract text for this publication.
Embodiments disclose techniques for sharing a context for a coherent accelerator in a kernel of a computer system. A request is received from a first application to perform an I/O operation within a kernel context. The request specifies a first effective address distinct to the first application. The first effective address specifies a location in a first effective address space and a first effective segment identifier. The first effective address is remapped to a second effective address. The second effective address specifies a location in a second effective address space of the kernel context and a second effective segment identifier. A virtual address mapping to a virtual address space within the kernel context is determined. The virtual address is translated to a physical memory address.
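The translation chain in the abstract (local effective address, then shared effective address, then virtual address, then physical address) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the bit widths (12-bit byte offset, 16-bit page number), the dictionary-based tables, and the names `SharedKernelContext`, `remap_address`, and `translate` are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed layout (not from the patent): | segment id | 16-bit page | 12-bit offset |
PAGE_BITS = 12
PAGE_NO_BITS = 16
SEG_SHIFT = PAGE_BITS + PAGE_NO_BITS


def split(ea):
    """Split an effective address into (segment id, page number, byte offset)."""
    esid = ea >> SEG_SHIFT
    page = (ea >> PAGE_BITS) & ((1 << PAGE_NO_BITS) - 1)
    offset = ea & ((1 << PAGE_BITS) - 1)
    return esid, page, offset


@dataclass
class SharedKernelContext:
    """Hypothetical model of one kernel context shared by several processes."""
    next_esid: int = 0x100                              # arbitrary first global segment id
    remap: dict = field(default_factory=dict)           # (pid, local ESID) -> global ESID
    segment_table: dict = field(default_factory=dict)   # global ESID -> virtual segment id
    page_table: dict = field(default_factory=dict)      # (VSID, page) -> physical frame

    def remap_address(self, pid, local_ea):
        """Map a process-local effective address into the shared address space."""
        esid, page, offset = split(local_ea)
        key = (pid, esid)
        if key not in self.remap:
            # Give each (process, local segment) pair its own global segment id,
            # so identical local addresses from different processes cannot collide.
            self.remap[key] = self.next_esid
            self.next_esid += 1
        return (self.remap[key] << SEG_SHIFT) | (page << PAGE_BITS) | offset

    def translate(self, global_ea):
        """Resolve a shared effective address to a physical address."""
        esid, page, offset = split(global_ea)
        vsid = self.segment_table[esid]          # segment table lookup
        frame = self.page_table[(vsid, page)]    # page table lookup
        return (frame << PAGE_BITS) | offset


ctx = SharedKernelContext()
ea_a = ctx.remap_address(pid=1, local_ea=0x10002034)
ea_b = ctx.remap_address(pid=2, local_ea=0x10002034)  # same local address, other process
ctx.segment_table.update({ea_a >> SEG_SHIFT: 7, ea_b >> SEG_SHIFT: 9})
ctx.page_table.update({(7, 2): 0x40, (9, 2): 0x41})
print(hex(ctx.translate(ea_a)), hex(ctx.translate(ea_b)))
```

In this sketch the two processes issue the same local effective address, yet the remapping step places them in different shared segments, so each resolves to its own physical page.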
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method to provide improved process scalability via kernel-context sharing by multiple user-space processes, the computer-implemented method comprising: receiving a request from a first user-space process to perform an I/O operation within a kernel context, wherein the request specifies a first effective address of a local effective address space distinct to the first user-space process, wherein the first effective address specifies a location in the local effective address space of the kernel context and a first effective segment identifier; remapping, by operation of one or more computer processors, the first effective address to a second effective address in a global effective address space shared by the first user-space process and at least a second user-space process and in order to avoid conflicting addresses therebetween, wherein the second effective address specifies a location in the global effective address space of the kernel context and a second effective segment identifier; and upon determining, by a lookup using the second effective segment identifier on a page table and a shared segment table, a virtual address that maps to a virtual address space within the kernel context, translating the virtual address to a physical memory address, whereafter the I/O operation is performed based on the physical memory address, thereby providing improved process scalability via kernel-context sharing by multiple user-space processes including the first and second user-space processes.
2. The computer-implemented method of claim 1, wherein the kernel context and segment table are shared with the first and second user-space processes.
3. The computer-implemented method of claim 1, wherein the first and second addresses further specify a page number and a byte offset.
4. The computer-implemented method of claim 3, wherein determining the virtual address comprises: determining, via the shared segment table based on the second effective segment identifier, a virtual segment identifier; and performing a lookup operation in the page table using the virtual segment identifier, page number, and the byte offset.
5. The computer-implemented method of claim 1, further comprising: inserting the I/O operation into a command queue.
6. The computer-implemented method of claim 5, wherein the first user-space process blocks other I/O operations until the I/O operation is completed.
7. The computer-implemented method of claim 1, wherein the I/O operation is performed via a coherent accelerator.
8. The computer-implemented method of claim 1, wherein the request is to a coherent accelerator to perform the I/O operation within the kernel context, wherein the coherent accelerator shares virtual memory with the one or more computer processors, wherein the computer-implemented method is performed by a kernel device driver associated with the coherent accelerator, wherein the computer-implemented method further comprises providing an operating system that includes a kernel space in which an operating system kernel and the kernel device driver execute.
9. The computer-implemented method of claim 8, wherein the kernel device driver interfaces between the first and second user-space processes and the coherent accelerator, wherein the kernel device driver controls an accelerator function unit of the coherent accelerator via: (i) attaching and detaching contexts to the coherent accelerator on behalf of application memory; (ii) performing memory-mapped I/O to the coherent accelerator; and (iii) registering a kernel context in a storage device.
10. The computer-implemented method of claim 9, wherein the improved process scalability comprises a process scalability characterized by a total count of processes to which the coherent accelerator is exploitable, wherein the coherent accelerator comprises a field-programmable gate array (FPGA)-based coherent accelerator, wherein the first and second user-space processes are of distinct, first and second applications, wherein the computer-implemented method further comprises outputting an indication that the I/O operation has been performed.
11. The computer-implemented method of claim 10, wherein the kernel context and segment table are each shared between the first and second user-space processes, wherein the first and second addresses further specify a page number and a byte offset, wherein the computer-implemented method further comprises: subsequent to the I/O operation being performed, deleting the second effective address from the shared segment table.
12. The computer-implemented method of claim 11, wherein the first effective address is remapped to the second effective address in a manner transparent to the first user-space process, wherein determining the virtual address comprises: determining, via the shared segment table based on the second effective segment identifier, a virtual segment identifier; and performing a lookup operation in the page table using the virtual segment identifier, page number, and the byte offset.
13. The computer-implemented method of claim 12, further comprising: inserting the I/O operation into a command queue, wherein the first user-space process blocks other I/O operations until the I/O operation is completed, wherein the I/O operation is performed via the coherent accelerator.
14. The computer-implemented method of claim 1, wherein the request is to a coherent accelerator to perform the I/O operation within the kernel context.
15. The computer-implemented method of claim 14, wherein the coherent accelerator shares virtual memory with the one or more computer processors.
16. The computer-implemented method of claim 1, wherein the computer-implemented method is performed by a device driver associated with a coherent accelerator.
17. The computer-implemented method of claim 1, wherein the improved process scalability comprises a process scalability characterized by a total count of processes to which a coherent accelerator is exploitable.
18. The computer-implemented method of claim 1, wherein the first and second user-space processes are of distinct, first and second applications.
19. The computer-implemented method of claim 1, subsequent to the I/O operation being performed, deleting the second effective address from the shared segment table.
20. The computer-implemented method of claim 19, wherein the second effective address is deleted by a device driver associated with a coherent accelerator.
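The request lifecycle described in claims 5, 6, and 19 (queue the I/O operation, complete it, then delete the shared segment-table entry) can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented driver: the class `AcceleratorDriver`, its methods, and the use of a Python callable as a stand-in for the accelerator are hypothetical, and the blocking behavior of claim 6 is only approximated by running each command synchronously.

```python
from collections import deque


class AcceleratorDriver:
    """Hypothetical driver model: one shared segment table, one command queue."""

    def __init__(self):
        self.segment_table = {}       # global ESID -> virtual segment id (shared)
        self.command_queue = deque()  # pending I/O operations, oldest first
        self.completed = []           # (pid, result) pairs, in completion order

    def submit(self, pid, global_esid, vsid, op):
        """Register the remapped segment and queue the I/O operation."""
        self.segment_table[global_esid] = vsid
        self.command_queue.append((pid, global_esid, op))

    def run_next(self):
        """Run the oldest queued operation to completion, then clean up."""
        pid, esid, op = self.command_queue.popleft()
        vsid = self.segment_table[esid]
        result = op(vsid)             # stand-in for the accelerator performing the I/O
        del self.segment_table[esid]  # drop the shared entry once the I/O is done
        self.completed.append((pid, result))
        return result


driver = AcceleratorDriver()
driver.submit(pid=1, global_esid=0x100, vsid=7, op=lambda vsid: f"read via VSID {vsid}")
print(driver.run_next())
print(driver.segment_table)
```

Removing the entry after completion keeps the shared segment table from filling up with stale mappings as many processes reuse the same kernel context.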
using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title
in relation to response time · CPC title
Single storage device · CPC title
Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title
Latency reduction · CPC title