Process and Framework For Facilitating Information Sharing Using a Distributed Hypergraph
US-2019278760-A1 · Sep 12, 2019 · US
US12339775B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12339775-B2 |
| Application number | US-202218145565-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2022 |
| Priority date | Aug 17, 2022 |
| Publication date | Jun 24, 2025 |
| Grant date | Jun 24, 2025 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The present invention relates to a hardware accelerator for hypergraph processing and its operating method, the hardware accelerator comprising: a data loader: for, in the presence of a data-centric load-trigger-reduce execution model, reading hypergraph partition data from an off-chip memory successively according to hypergraph data structure and an order of hypergraph partitions; an address translator, for deploying the hypergraph data into a private register of a processor and/or into a buffer memory according to a priority level of loaded data, and recording corresponding offset information; a task trigger, for generating computing tasks according to the loaded data, and scheduling the computing tasks into the processor; the processor, for receiving and executing the computing tasks; a reducer, for scheduling intermediate results into a first-priority-data reducer unit or a second-priority-data reducer unit depending on the priority level of the data so as to execute a reducing operation for the intermediate results. In view of the shortcomings of task-centric hardware accelerators, the present invention can prevent any possible data conflict during parallel execution of multiple processing units.
Opening claim text (preview).
What is claimed is: 1. A hardware accelerator for hypergraph processing, at least comprising: a data loader, configured to, in the presence of a data-centric load-trigger-reduce execution model, read hypergraph partitioning data from an off-chip memory successively according to hypergraph data structure and an order of hypergraph partitions; an address translator, configured to deploy the hypergraph partitioning data into at least one of a private register of a processor and a buffer memory according to a priority level of loaded data, and to record corresponding offset information into a Hash table; a task trigger, configured to generate computing tasks according to the loaded data, and to schedule the computing tasks into the processor; the processor, comprising at least one processing unit, which receives and executes the computing tasks; a reducer, configured to schedule intermediate results into a first-priority-data reducer unit in response to determining the priority level of the loaded data is a first-priority level or a second-priority-data reducer unit in response to determining the priority level of the loaded data is a second-priority level so as to execute a reducing operation for the intermediate results. 2. The hardware accelerator for hypergraph processing of claim 1 , wherein the reducer further comprises an activating unit, wherein in response to determining the intermediate results are scheduled into the first-priority-data reducer unit, the first-priority-data reducer unit reads data from the processing unit, and reduces the intermediate results based on a reducing tree, wherein the reducing operation is not executed until a present hypergraph partition has been completely processed; and in response to determining the intermediate results are scheduled into the second-priority-data reducer unit, the second-priority-data reducer unit acquires intermediate data from the processing unit, and reduces the intermediate results based on a sorting-reducing network; and the activating unit alters the activation states of vertices or hyperedges according to changes of vertex data or hyperedge data, wherein the reducing operation and the computing task are executed simultaneously. 3. The hardware accelerator for hypergraph processing of claim 2 , wherein the address translator has at least one Hash table that is configured to store mapping relationship and an offsets distributor that is configured to generate the mapping relationship, wherein after receiving a vertex index value in the first-priority-level data, the address translator searches the Hash table for a corresponding table entry, and in response to determining the table entry has not been initialized, the offsets distributor generates a new private register address and assigns it to a first-priority-level vertex, so that vertex data in the first-priority-level data are stored at a corresponding address in the private register, and in response to determining the table entry has been initialized, the address translator directly returns the found mapping relationship. 4. The hardware accelerator for hypergraph processing of claim 3 , wherein the hardware accelerator further comprises the buffer memory, wherein the buffer memory comprises a vertex data memory, a hyperedge data memory, and bipartite edge FIFO queue; and the data loader loads selected activated partition from the off-chip memory to the on-chip memory, and sends the hypergraph partitioning data to the buffer memory for storage based on data attributes, wherein the data loader loads the vertex data to the vertex memory, loads the hyperedge data to the hyperedge memory, and stream-loads the bipartite edge data to the bipartite edge FIFO queue, wherein the data are filtered with activation information during loading. 5. The hardware accelerator for hypergraph processing of claim 4 , wherein the task trigger acquires bipartite edges from the bipartite edge FIFO queue and generates corresponding computing tasks. 6. The hardware accelerator for hypergraph processing of claim 5 , wherein after acquiring the bipartite edges, the task trigger acquires at least one of a storage offset and addresses of corresponding vertices and hyperedges from the address translator, and packages and distributes the bipartite edges and the acquired addresses to currently idle processing units. 7. The hardware accelerator for hypergraph processing of claim 6 , wherein the first-priority-data reducer unit at least comprises a reducing tree, a result buffer queue, and a result writing-back unit, wherein after all the computing tasks of the present hypergraph partition have been executed and completed, the first-priority-data reducer unit reads the data from the private registers of the processing units one by one and inputs the data into the reducing tree; a final reduction result of the intermediate data is stored in the result buffer queue, and the result buffer queue provides the data to the result writing-back units so that the data are written back into an on-chip shared memory; and for the result buffer queue, the data are provided to the result writing-back unit one by one. 8. The hardware accelerator for hypergraph processing of claim 7 , wherein the second-priority-data reducer unit at least comprises a sorting-reducing network, an input buffer queue, an output buffer queue, and a result writing-back unit, wherein outputs generated as a result of that the processing unit processes the second-priority-level data are stored into an input buffer queue of the second-priority-data reducer unit, so that the processing unit provides data to the sorting-reducing network; the input buffer queue inputs intermediate results to the reducing network at a predetermined amount every time according to a scale of the sorting-reducing network, for the sorting-reducing network to sort and reduce the input intermediate results in the form of pipeline, so that the output results are ordered and values having the same index are reduced; and the output results of the sorting-reducing network are stored into the output buffer queue, so that the buffer queue provides data to the result writing-back unit and the data are written back into the on-chip shared memory. 9. The hardware accelerator for hypergraph processing of claim 8 , wherein the data loader is wired to the address translator for data information transmission, the data loader is wired to the activating unit in the reducer, so that the data loader reads information from the partitioning table in the reducer, the data loader is wired to and therefore has data transmitting relationship with the vertex data memory, the hyperedge data memory, and the bipartite edge FIFO queue, the vertex data memory, the hyperedge data memory, and the bipartite edge FIFO queue are wired to the task trigger, the address translator is wired to the task trigger, the address translator is wired to and therefore has information transmitting relationship with the first-priority-data reducer unit in the reducer, the reducer is wired to and therefore has data transmitting relationship with the vertex data memory and the hyperedge data memory, respectively. 10. The hardware accelerator for hypergraph processing of claim 9 , wherein the processor is wired to and therefore has information transmitting relationship with the vertex data memory, the hyperedge data memory, the task trigger, and the reducer, respectively, the data loader is connected to the data input port of the hardware accelerator, so as to receive the hypergraph partitioning data. 11. An operating method of hardware accelerator for hypergraph processing, the method at least comprising: i
Performance improvement · CPC title
Address translation · CPC title
hash tables · CPC title
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
Multiuser, multiprocessor or multiprocessing cache systems · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.