Hardware accelerator for hypergraph processing and operating method thereof

US12339775B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-12339775-B2
Application numberUS-202218145565-A
CountryUS
Kind codeB2
Filing dateDec 22, 2022
Priority dateAug 17, 2022
Publication dateJun 24, 2025
Grant dateJun 24, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention relates to a hardware accelerator for hypergraph processing and its operating method, the hardware accelerator comprising: a data loader: for, in the presence of a data-centric load-trigger-reduce execution model, reading hypergraph partition data from an off-chip memory successively according to hypergraph data structure and an order of hypergraph partitions; an address translator, for deploying the hypergraph data into a private register of a processor and/or into a buffer memory according to a priority level of loaded data, and recording corresponding offset information; a task trigger, for generating computing tasks according to the loaded data, and scheduling the computing tasks into the processor; the processor, for receiving and executing the computing tasks; a reducer, for scheduling intermediate results into a first-priority-data reducer unit or a second-priority-data reducer unit depending on the priority level of the data so as to execute a reducing operation for the intermediate results. In view of the shortcomings of task-centric hardware accelerators, the present invention can prevent any possible data conflict during parallel execution of multiple processing units.

First claim

Opening claim text (preview).

What is claimed is: 1. A hardware accelerator for hypergraph processing, at least comprising: a data loader, configured to, in the presence of a data-centric load-trigger-reduce execution model, read hypergraph partitioning data from an off-chip memory successively according to hypergraph data structure and an order of hypergraph partitions; an address translator, configured to deploy the hypergraph partitioning data into at least one of a private register of a processor and a buffer memory according to a priority level of loaded data, and to record corresponding offset information into a Hash table; a task trigger, configured to generate computing tasks according to the loaded data, and to schedule the computing tasks into the processor; the processor, comprising at least one processing unit, which receives and executes the computing tasks; a reducer, configured to schedule intermediate results into a first-priority-data reducer unit in response to determining the priority level of the loaded data is a first-priority level or a second-priority-data reducer unit in response to determining the priority level of the loaded data is a second-priority level so as to execute a reducing operation for the intermediate results. 2. The hardware accelerator for hypergraph processing of claim 1 , wherein the reducer further comprises an activating unit, wherein in response to determining the intermediate results are scheduled into the first-priority-data reducer unit, the first-priority-data reducer unit reads data from the processing unit, and reduces the intermediate results based on a reducing tree, wherein the reducing operation is not executed until a present hypergraph partition has been completely processed; and in response to determining the intermediate results are scheduled into the second-priority-data reducer unit, the second-priority-data reducer unit acquires intermediate data from the processing unit, and reduces the intermediate results based on a sorting-reducing network; and the activating unit alters the activation states of vertices or hyperedges according to changes of vertex data or hyperedge data, wherein the reducing operation and the computing task are executed simultaneously. 3. The hardware accelerator for hypergraph processing of claim 2 , wherein the address translator has at least one Hash table that is configured to store mapping relationship and an offsets distributor that is configured to generate the mapping relationship, wherein after receiving a vertex index value in the first-priority-level data, the address translator searches the Hash table for a corresponding table entry, and in response to determining the table entry has not been initialized, the offsets distributor generates a new private register address and assigns it to a first-priority-level vertex, so that vertex data in the first-priority-level data are stored at a corresponding address in the private register, and in response to determining the table entry has been initialized, the address translator directly returns the found mapping relationship. 4. The hardware accelerator for hypergraph processing of claim 3 , wherein the hardware accelerator further comprises the buffer memory, wherein the buffer memory comprises a vertex data memory, a hyperedge data memory, and bipartite edge FIFO queue; and the data loader loads selected activated partition from the off-chip memory to the on-chip memory, and sends the hypergraph partitioning data to the buffer memory for storage based on data attributes, wherein the data loader loads the vertex data to the vertex memory, loads the hyperedge data to the hyperedge memory, and stream-loads the bipartite edge data to the bipartite edge FIFO queue, wherein the data are filtered with activation information during loading. 5. The hardware accelerator for hypergraph processing of claim 4 , wherein the task trigger acquires bipartite edges from the bipartite edge FIFO queue and generates corresponding computing tasks. 6. The hardware accelerator for hypergraph processing of claim 5 , wherein after acquiring the bipartite edges, the task trigger acquires at least one of a storage offset and addresses of corresponding vertices and hyperedges from the address translator, and packages and distributes the bipartite edges and the acquired addresses to currently idle processing units. 7. The hardware accelerator for hypergraph processing of claim 6 , wherein the first-priority-data reducer unit at least comprises a reducing tree, a result buffer queue, and a result writing-back unit, wherein after all the computing tasks of the present hypergraph partition have been executed and completed, the first-priority-data reducer unit reads the data from the private registers of the processing units one by one and inputs the data into the reducing tree; a final reduction result of the intermediate data is stored in the result buffer queue, and the result buffer queue provides the data to the result writing-back units so that the data are written back into an on-chip shared memory; and for the result buffer queue, the data are provided to the result writing-back unit one by one. 8. The hardware accelerator for hypergraph processing of claim 7 , wherein the second-priority-data reducer unit at least comprises a sorting-reducing network, an input buffer queue, an output buffer queue, and a result writing-back unit, wherein outputs generated as a result of that the processing unit processes the second-priority-level data are stored into an input buffer queue of the second-priority-data reducer unit, so that the processing unit provides data to the sorting-reducing network; the input buffer queue inputs intermediate results to the reducing network at a predetermined amount every time according to a scale of the sorting-reducing network, for the sorting-reducing network to sort and reduce the input intermediate results in the form of pipeline, so that the output results are ordered and values having the same index are reduced; and the output results of the sorting-reducing network are stored into the output buffer queue, so that the buffer queue provides data to the result writing-back unit and the data are written back into the on-chip shared memory. 9. The hardware accelerator for hypergraph processing of claim 8 , wherein the data loader is wired to the address translator for data information transmission, the data loader is wired to the activating unit in the reducer, so that the data loader reads information from the partitioning table in the reducer, the data loader is wired to and therefore has data transmitting relationship with the vertex data memory, the hyperedge data memory, and the bipartite edge FIFO queue, the vertex data memory, the hyperedge data memory, and the bipartite edge FIFO queue are wired to the task trigger, the address translator is wired to the task trigger, the address translator is wired to and therefore has information transmitting relationship with the first-priority-data reducer unit in the reducer, the reducer is wired to and therefore has data transmitting relationship with the vertex data memory and the hyperedge data memory, respectively. 10. The hardware accelerator for hypergraph processing of claim 9 , wherein the processor is wired to and therefore has information transmitting relationship with the vertex data memory, the hyperedge data memory, the task trigger, and the reducer, respectively, the data loader is connected to the data input port of the hardware accelerator, so as to receive the hypergraph partitioning data. 11. An operating method of hardware accelerator for hypergraph processing, the method at least comprising: i

Assignees

Inventors

Classifications

  • Performance improvement · CPC title

  • Address translation · CPC title

  • hash tables · CPC title

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • Multiuser, multiprocessor or multiprocessing cache systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12339775B2 cover?
The present invention relates to a hardware accelerator for hypergraph processing and its operating method, the hardware accelerator comprising: a data loader: for, in the presence of a data-centric load-trigger-reduce execution model, reading hypergraph partition data from an off-chip memory successively according to hypergraph data structure and an order of hypergraph partitions; an address t…
Who is the assignee on this patent?
Univ Huazhong Science Tech, Zhejiang Lab
What technology area does this patent fall under?
Primary CPC classification G06F12/0806. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jun 24 2025 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).