Data processing system, computing node, and data processing method

US10567494B2 · US · B2

Patent metadata
Field | Value
Publication number | US-10567494-B2
Application number | US-201715667634-A
Country | US
Kind code | B2
Filing date | Aug 3, 2017
Priority date | Feb 6, 2015
Publication date | Feb 18, 2020
Grant date | Feb 18, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data processing system, a computing node, and a data processing method are provided. The data processing system includes a management node and a first class of computing nodes. The management node is configured to allocate first processing tasks to the first class of computing nodes. At least two computing nodes in the first class of computing nodes concurrently perform the first processing tasks allocated by the management node. A computing node performs a combine2 operation and a reduce2 operation on a data block Mx and a data block V1x, to obtain a first intermediate result. Then, the management node obtains a processing result for a to-be-processed dataset according to first intermediate results obtained by the first class of computing nodes. According to the data processing system, when a combine operation and a reduce operation are being performed on data blocks, memory space occupied by computation can be reduced.
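The memory claim in the abstract is easier to see in code. The following is a minimal sketch, not the patented implementation: threads stand in for the first class of computing nodes, and the names `first_class_task` and `management_node` are hypothetical. It assumes the first combined term seeds the accumulator, since the abstract does not state an initial value. Each combine2 result is folded into the running value immediately, so a row never holds a list of n intermediate products.

```python
from concurrent.futures import ThreadPoolExecutor
import operator


def first_class_task(block, vec, combine2, reduce2):
    """Compute a first intermediate result for one block Mx and vector V1x."""
    result = []
    for row in block:
        # Assumption: the first combined term seeds the accumulator.
        acc = combine2(row[0], vec[0])
        for m_ij, v_j in zip(row[1:], vec[1:]):
            # Fold each combined term in at once; nothing is buffered per row.
            acc = reduce2(acc, combine2(m_ij, v_j))
        result.append(acc)
    return result


def management_node(tasks, combine2, reduce2):
    """Dispatch (Mx, V1x) pairs concurrently and collect the results in order."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(first_class_task, block, vec, combine2, reduce2)
            for block, vec in tasks
        ]
        return [f.result() for f in futures]


# A 2x4 matrix split into two 2x2 column blocks, with the vector split to match.
tasks = [
    ([[1, 2], [5, 6]], [1, 1]),
    ([[3, 4], [7, 8]], [1, 1]),
]
partials = management_node(tasks, operator.mul, operator.add)

# Hypothetical merge step: both blocks come from the same block-row, so their
# intermediate vectors are reduced element by element.
final = [operator.add(a, b) for a, b in zip(*partials)]
```

With multiplication as combine2 and addition as reduce2 this reproduces an ordinary matrix-vector product. Passing `operator.add` and `min` instead would give a min-plus product of the kind used in shortest-path computations.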

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing system, comprising a management node including a processor and a plurality of computing nodes that form a first class of computing nodes, each of the first class of computing nodes including a processor, wherein the processor of the management node is configured to: allocate a first processing task to each of at least two computing nodes in the first class of computing nodes, wherein a computing node $FC_x$ is the $x$th computing node in the at least two computing nodes, and $x$ is a positive integer, and wherein the at least two computing nodes in the first class of computing nodes concurrently perform the first processing tasks allocated by the management node; the processor of the computing node $FC_x$ is configured to: obtain, according to the first processing task allocated by the management node, a data block $M_x$ and a data block $V_{1x}$ in a to-be-processed dataset, wherein the data block $M_x$ is a matrix comprising $m$ rows and $n$ columns of data, the data block $V_{1x}$ is a vector comprising $n$-dimensional data, $m$ and $n$ are positive integers, and $n$ is not less than 2; and perform a combine2 operation and a reduce2 operation on the data block $M_x$ and the data block $V_{1x}$, to obtain a first intermediate result $V'_x$, wherein the first intermediate result $V'_x$ is a vector comprising $m$-dimensional data; and the first intermediate result $V'_x$ has an element $v'_i$, wherein $i$ is a variant, the value of $i$ ranges from 1 to $m$, $v'_i = v'_{i,n}$, $v'_{i,n}$ is obtained according to $v'_{i,j} = \mathrm{reduce2}(v'_{i,j-1}, \mathrm{combine2}(m_{i,j}, v_j))$, $m_{i,j}$ is an element in the data block $M_x$, $v_j$ is an element in the data block $V_{1x}$, $j$ is a variant, and the value of $j$ ranges from 1 to $n$; and the processor of the management node is further configured to: obtain a first processing result for the to-be-processed dataset according to first intermediate results obtained by the at least two computing nodes in the first class of computing nodes.

2. The data processing system according to claim 1, wherein the data processing system further comprises at least one computing node that form a second class of computing nodes, each of the second class of computing nodes comprising a processor, and the processor of the management node is further configured to: allocate, according to the first intermediate results obtained by the at least two computing nodes in the first class of computing nodes, a second processing task to each of the at least one computing node in the second class of computing nodes, wherein a computing node $SC_y$ is the $y$th computing node in the at least one computing node, and $y$ is a positive integer; the processor of the computing node $SC_y$ is configured to: obtain, according to the second processing task allocated by the management node, the first intermediate results, wherein the first intermediate results obtained by the $SC_y$ are first intermediate results obtained according to data blocks in one row in the to-be-processed dataset; and perform a reduce2 operation on the first intermediate results, to obtain a second intermediate result $V''_y$, wherein the second intermediate result $V''_y$ is a vector comprising $m$-dimensional data; and the processor of the management node is further configured to: obtain a second processing result for the to-be-processed dataset according to the second intermediate result obtained by the at least one computing node in the second class of computing nodes.

3. The data processing system according to claim 2, wherein the to-be-processed dataset further comprises a data block $V_{2x}$, and the data block $V_{2x}$ is a vector comprising $m$-dimensional data; and the processor of the management node is further configured to: allocate, according to the second intermediate result obtained by the at least one computing node in the second class of computing nodes, a third processing task to the at least one computing node in the second class of computing nodes, wherein the at least one computing node in the second class of computing nodes comprises the computing node $SC_y$; and the processor of the computing node $SC_y$ is further configured to: obtain the data block $V_{2x}$ in the to-be-processed dataset according to the third processing task; and perform an assign operation on the second intermediate result $V''_y$ and the data block $V_{2x}$, to obtain a third processing result for the to-be-processed dataset.

4. The data processing system according to claim 3, wherein $m = n$, and the data block $V_{1x}$ and the data block $V_{2x}$ are a same data block.

5. The data processing system according to claim 2, wherein when the second class of computing nodes comprises at least two computing nodes, the processors of the at least two computing nodes concurrently perform the second processing tasks allocated by the processor of the management node.

6. The data processing system according to claim 2, wherein the management node is a physical computing device, a virtual machine or a central processing unit, wherein the first class of computing nodes comprises a plurality of physical computing devices or a plurality of virtual machines formed on a physical computing device, and wherein the second class of computing nodes comprises a plurality of physical computing devices or a plurality of virtual machines formed on a physical computing device.

7. A data processing system, comprising a management node including a processor and a plurality of computing nodes that form a first class of computing nodes, each of the first class of computing nodes including a processor, wherein the processor of the management node is configured to: allocate a first processing task to each of at least two computing nodes in the first class of computing nodes, wherein a computing node $FC_x$ is the $x$th computing node in the at least two computing nodes, and $x$ is a positive integer, and wherein the at least two computing nodes in the first class of computing nodes concurrently perform the first processing tasks allocated by the management node; the processor of the computing node $FC_x$ is configured to: obtain, according to the first processing task allocated by the management node, a data block $M_{1x}$ and a data block $M_{2x}$ that are in a to-be-processed dataset, wherein the data block $M_{1x}$ is a matrix comprising $m$ rows and $n$ columns of data, the data block $M_{2x}$ is a matrix comprising $n$ rows and $p$ columns of data, wherein $m$, $n$, and $p$ are positive integers, and $n$ is not less than 2; and perform a combine2 operation and a reduce2 operation on the data block $M_{1x}$ and the data block $M_{2x}$, to obtain a first intermediate result $M'_x$, wherein the first intermediate result $M'_x$ is a matrix comprising $m$ rows and $p$ columns of data; and the first intermediate result $M'_x$ has an element $m'_{i,j}$, wherein $i$ and $j$ are variants, the value of $i$ ranges from 1 to $m$, the value of $j$ ranges from 1 to $p$, $m'_{i,j} = m'_{i,j,n}$, $m'_{i,j,n}$ is obtained according to $m'_{i,j,k} = \mathrm{reduce2}(m'_{i,j,k-1}, \mathrm{combine2}(m_{1[i,k]}, m_{2[k,j]}))$, $m_{1[i,k]}$ is an element in the $i$th row and the $k$th column of the data block $M_{1x}$, $m_{2[k,j]}$ is an element in the $k$th row and the $j$th column of the data block $M_{2x}$, $k$ is a variant, and the value of $k$ ranges from 1 to $n$; and the processor of the management node is further configured to: obtain a first processing result for the to-be-processed dataset according to first intermediate results obtained by the at least two computing nodes in the first class of computing nodes.

8. The data processing system according to claim 7, wherein the data processing system further comprises at least one computing node that form a second class of computing nodes, each of the second class of computing nodes comprising a processor, and
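The matrix-matrix recurrence in claim 7 can be sketched the same way. This is an illustrative reading of the claim language, not code from the patent: the function name `block_product` is hypothetical, and seeding the accumulator with the k = 1 term is an assumption, because the claim does not define the value at k = 0.

```python
import operator


def block_product(m1, m2, combine2, reduce2):
    """Generalized product of an m x n block and an n x p block.

    Each output element is built by folding combine2(m1[i][k], m2[k][j])
    into a single accumulator with reduce2, for k over the shared dimension.
    """
    m, n, p = len(m1), len(m2), len(m2[0])
    out = []
    for i in range(m):
        out_row = []
        for j in range(p):
            # Assumption: the k = 1 term (index 0 here) seeds the accumulator.
            acc = combine2(m1[i][0], m2[0][j])
            for k in range(1, n):
                acc = reduce2(acc, combine2(m1[i][k], m2[k][j]))
            out_row.append(acc)
        out.append(out_row)
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# Ordinary matrix product: combine2 is multiplication, reduce2 is addition.
ordinary = block_product(a, b, operator.mul, operator.add)

# Min-plus product: combine2 is addition, reduce2 is min.
min_plus = block_product(a, b, operator.add, min)
```

Only one accumulator per output element is live at a time, which is the property the abstract credits for the reduced memory footprint. Swapping the two operators changes the algebra without changing the loop structure.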

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • Neural networks · CPC title

  • based on compliance of requirements or conditions with available server resources · CPC title

  • Query processing support for facilitating data mining operations in structured databases · CPC title

  • Techniques for rebalancing the load in a distributed system · CPC title

  • G06F17/16 · Primary

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10567494B2 cover?
A data processing system, a computing node, and a data processing method are provided. The data processing system includes a management node and a first class of computing nodes. The management node is configured to allocate first processing tasks to the first class of computing nodes. At least two computing nodes in the first class of computing nodes concurrently perform the first processing t…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 18, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).