Distributed processing architecture

US11580388B2 · US · B2

Patent metadata
Publication number: US-11580388-B2
Application number: US-202016734092-A
Country: US
Kind code: B2
Filing date: Jan 3, 2020
Priority date: Jan 3, 2020
Publication date: Feb 14, 2023
Grant date: Feb 14, 2023

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure include techniques for processing neural networks. Various forms of parallelism may be implemented using topology that combines sequences of processors. In one embodiment, the present disclosure includes a computer system comprising a plurality of processor groups, the processor groups each comprising a plurality of processors. A plurality of network switches are coupled to subsets of the plurality of processor groups. A subset of the processors in the processor groups may be configurable to form sequences, and the network switches are configurable to form at least one sequence across one or more of the plurality of processor groups to perform neural network computations. Various alternative configurations for creating Hamiltonian cycles are disclosed to support data parallelism, pipeline parallelism, layer parallelism, or combinations thereof.
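The topology described in the abstract can be pictured with a small sketch. It assumes the simplest case: each processor group is a 1-dimensional series of processors, and a network switch links the last (edge) processor of one group to the first (edge) processor of the next, closing into a ring. The names `build_ring`, `neighbors`, and `switch_hops` are hypothetical and do not come from the patent; this illustrates the idea of a Hamiltonian cycle over processors and is not the disclosed implementation.

```python
from typing import List, Tuple

# Hypothetical identifier: (group index, position within the group).
Processor = Tuple[int, int]


def build_ring(num_groups: int, procs_per_group: int) -> List[Processor]:
    """Return a closed sequence that visits every processor exactly once.

    Assumption: processors inside a group are wired in series, and the
    switches connect the edge processor of one group to the edge
    processor of the next group (wrapping around at the end).
    """
    return [(g, p) for g in range(num_groups) for p in range(procs_per_group)]


def neighbors(ring: List[Processor], idx: int) -> Tuple[Processor, Processor]:
    """Each processor talks only to its predecessor and successor."""
    n = len(ring)
    return ring[(idx - 1) % n], ring[(idx + 1) % n]


def switch_hops(ring: List[Processor]) -> List[Tuple[Processor, Processor]]:
    """Links that cross a group boundary and so would traverse a switch."""
    n = len(ring)
    return [
        (ring[i], ring[(i + 1) % n])
        for i in range(n)
        if ring[i][0] != ring[(i + 1) % n][0]
    ]


ring = build_ring(num_groups=3, procs_per_group=4)
print(len(ring))           # 12 processors, each visited once
print(neighbors(ring, 0))  # ((2, 3), (0, 1))
print(switch_hops(ring))   # three inter-group links, all between edge processors
```

In this sketch only the edge processors of each group ever use a switch, which matches the abstract's description of switches coupling processor groups while the processors inside a group form their own sequence.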

First claim

Opening claim text (preview).

What is claimed is:

1. A computer system comprising: a plurality of processor groups, the processor groups comprising a plurality of series configured processors to process a partitioned neural network, wherein different processors in the plurality of processor groups process one or more different layers, stages, or model instances of the partitioned neural network; and a plurality of network switches coupled to subsets of the plurality of processor groups through edge processors of the series configured processors in each processor group, wherein at least a subset of the processors in the processor groups are configured to form sequences such that each processor communicates data for said layers, stages, or model instances with at least two adjacent processor in the sequence, and wherein the network switches are configurable to form at least one sequence across one or more of the plurality of processor groups to perform neural network computations.
2. The computer system of claim 1 wherein, during performance of the neural network computations, at least a first subset of processors in a first processor group and at least a second subset of processors in at least a second processor group are coupled together to form a sequence comprising a string of processors, wherein each processor communicates with an adjacent processor in the string of processors.
3. The computer system of claim 1 wherein one or more processor groups comprise processors configured in series to form a 1-dimensional processor array.
4. The computer system of claim 1 wherein one or more processor groups comprise processors configured as rows and columns to form a 2-dimensional processor array.
5. The computer system of claim 1 wherein the plurality of processors in the processor groups are configured in an N-dimensional processor array.
6. The computer system of claim 1 wherein: edge processor ports on opposite sides of a row of processors across a first plurality of processor groups are coupled to a first network switch; and edge processor ports on opposite sides of a column of processors across a second plurality of processor groups are coupled to second network switch.
7. The computer system of claim 1 wherein the processor groups are coupled to a plurality of row network switches and a plurality of column network switches across a plurality of switching planes corresponding to rows and columns of processors in each processor group.
8. The computer system of claim 7 wherein one or more of the row network switches are coupled to one or more of the column network switches.
9. The computer system of claim 1 wherein the processor groups are configured in a multi-dimensional cluster array.
10. The computer system of claim 9 wherein processor groups along a particular dimension of the cluster array have edge processor ports coupled to corresponding same dimension network switches.
11. The computer system of claim 10 wherein a plurality of network switches along a particular dimension are coupled together through a plurality of intermediate network switches.
12. The computer system of claim 9 wherein: rows of processor groups in the cluster array have edge processor ports coupled to corresponding same row network switches, and columns of processor groups in the cluster array have edge processor ports coupled to corresponding same column network switches.
13. The computer system of claim 12 wherein: row network switches are coupled together through one or more intermediate row network switches, and column network switches are coupled together through one or more intermediate column network switches.
14. The computer system of claim 12 wherein one or more intermediate row network switches are coupled to one or more intermediate column network switches.
15. The computer system of claim 1 wherein the plurality of network switches are directly coupled together.
16. The computer system of claim 1 wherein the plurality of network switches are coupled to one or more intermediate network switches.
17. The computer system of claim 16 wherein the plurality of network switches and the one or more intermediate network switches form a two-tier switching network.
18. The computer system of claim 16 wherein each of the plurality of network switches are coupled to a plurality of intermediate network switches to couple processors in the subsets of the plurality of processor groups in series.
19. The computer system of claim 16 wherein processors along a first dimension of the plurality of processor groups are coupled to the plurality of the network switches.
20. The computer system of claim 19 wherein the first dimension is a row of processors.
21. The computer system of claim 19 wherein the first dimension is a column of processors.
22. The computer system of claim 1 wherein the neural network computations are computations for training a neural network.
23. The computer system of claim 1 wherein processors configured to form the sequence are configured to process a plurality of partial layers of the partitioned neural network.
24. The computer system of claim 1 wherein processors configured to form the sequence are configured to process one or more pipeline stages of the partitioned neural network.
25. The computer system of claim 1 wherein processors configured to form the sequence are configured to adjust weights across a plurality of instances of the neural network.
26. A computer system comprising: a plurality of processor groups, the processor groups comprising a plurality of series configured processors to process a partitioned neural network, wherein different processors in the plurality of processor groups process one or more different layers, stages, or model instances of the partitioned neural network; a plurality of network switches, wherein the plurality of the network switches are coupled to subsets of the plurality of processor groups through edge processors of the series configured processors in each processor group; and a plurality of intermediate network switches coupled to subsets of the plurality of network switches, wherein at least a subset of the processors in the processor groups, one or more of the plurality of network switches, and one or more of the intermediate network switches are configured in sequences such that each processor communicates data for said layers, stages, or model instances with at least two adjacent processor in the sequence to perform Hamiltonian cycles for one or more of: data parallelism neural network computations, pipeline parallelism neural network computations, and layer parallelism neural network computations.
27. A method for processing a neural network, the method comprising: configuring a plurality of processors arranged in a plurality of processor groups to perform neural network computations on a partitioned neural network, wherein at least a subset of the processors in the processor groups are configured in series to form sequences of processors, and wherein different processors in the plurality of processor groups process one or more different layers, stages, or model instances of the partitioned neural network; configuring a plurality of network switches to coupled subsets of the plurality of processor groups together through edge processors of the series configured processors in each processor group to form at least one sequence of pr…
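The claims require that each processor in a sequence exchange data with adjacent processors, and claims 25 and 26 tie such sequences to adjusting weights across model instances (data parallelism). The text on this page does not name a specific collective algorithm. As an illustration only, the sketch below simulates a ring all-reduce, a widely used technique that needs nothing beyond neighbor-to-neighbor links on a closed ring. The function name `ring_allreduce` and the one-value-per-chunk simplification are assumptions made for brevity.

```python
from typing import List


def ring_allreduce(grads: List[List[float]]) -> List[List[float]]:
    """Simulate a ring all-reduce; every worker ends with the element-wise sum.

    grads[i] is worker i's local gradient, already split into n chunks,
    where n is the number of workers on the ring (one value per chunk in
    this sketch). Workers only ever send to their right-hand neighbor.
    """
    n = len(grads)
    assert all(len(g) == n for g in grads), "this sketch uses one chunk per worker"
    data = [list(g) for g in grads]

    # Phase 1, reduce-scatter: in step s, worker i sends chunk (i - s) mod n
    # to worker i + 1, which accumulates it. Sends are collected first so
    # that all workers act on the values held at the start of the step.
    for s in range(n - 1):
        sends = [(i, (i - s) % n, data[i][(i - s) % n]) for i in range(n)]
        for i, chunk, value in sends:
            data[(i + 1) % n][chunk] += value

    # Worker i now holds the complete sum for chunk (i + 1) mod n.
    # Phase 2, all-gather: circulate the completed chunks around the ring.
    for s in range(n - 1):
        sends = [(i, (i + 1 - s) % n, data[i][(i + 1 - s) % n]) for i in range(n)]
        for i, chunk, value in sends:
            data[(i + 1) % n][chunk] = value

    return data


# Three model instances, each with a three-chunk gradient.
result = ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
print(result)  # [[111, 222, 333], [111, 222, 333], [111, 222, 333]]
```

The whole exchange takes 2(n - 1) steps and every message travels one hop, which is why a Hamiltonian cycle over the processors is a natural fit for this kind of weight synchronization.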

Assignees

Microsoft Technology Licensing LLC

Classifications

  • Supervised learning

  • Feedforward networks

  • Distributed learning, e.g. federated learning

  • G06N3/063 (primary): using electronic means

  • Three dimensional, e.g. hypercubes

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11580388B2 cover?
Techniques for processing neural networks on a computer system built from processor groups and network switches. The switches are configurable to form sequences of processors across the groups, including Hamiltonian cycles, to support data parallelism, pipeline parallelism, layer parallelism, or combinations thereof.
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 14, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).