Error Correction in Computation

US2019332467A1 · US · A1

Patent metadata
Publication number: US-2019332467-A1
Application number: US-201816475297-A
Country: US
Kind code: A1
Filing date: Jan 10, 2018
Priority date: Jan 11, 2017
Publication date: Oct 31, 2019
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Introduced here is a technique to detect and/or correct errors in computation. The ability to correct errors in computation can increase the speed of the processor, reduce the power consumption of the processor, and reduce the distance between the transistors within the processor because the errors thus generated can be detected and corrected. In one embodiment, an error correcting module, running either in software or in hardware, can detect an error in matrix multiplication, by calculating an expected sum of all elements in the resulting matrix, and an actual sum of all elements in the resulting matrix. When there is a difference between the expected sum and the resulting sum, the error correcting module detects an error. In another embodiment, in addition to detecting the error, the error correcting module can determine the location and the magnitude of the error, thus correcting the erroneous computation.
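
The sum check described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names (`matmul`, `expected_total`, `actual_total`, `has_error`) and the `tol` parameter are assumptions made for this example. It relies on the identity that the sum of all elements of A·B equals the dot product of the column sums of A with the row sums of B, so the expected sum needs only k multiplications for an n×k by k×m product.

```python
def matmul(a, b):
    """Plain triple-loop product of two matrices given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def expected_total(a, b):
    """Sum of all elements of a @ b, computed without forming the product.

    sum_ij C[i][j] = sum_t (column sum t of a) * (row sum t of b).
    """
    col_sums_a = [sum(row[t] for row in a) for t in range(len(b))]
    row_sums_b = [sum(row) for row in b]
    return sum(ca * rb for ca, rb in zip(col_sums_a, row_sums_b))


def actual_total(c):
    """Sum of all elements of the matrix that was actually produced."""
    return sum(sum(row) for row in c)


def has_error(a, b, c, tol=0):
    """Report an error when the expected and actual sums differ by more than tol.

    tol is a hypothetical parameter; use 0 for integers and a small positive
    value for floating-point data, where rounding makes exact equality unsafe.
    """
    return abs(expected_total(a, b) - actual_total(c)) > tol


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)
clean = has_error(A, B, C)    # no fault injected yet
C[1][0] += 3                  # simulate a hardware fault in one element
faulty = has_error(A, B, C)   # the sums no longer agree
```

A single total cannot say where the error is, and two errors of equal size and opposite sign would cancel and go unnoticed, which is presumably why the other embodiment also determines location and magnitude.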

First claim

Opening claim text (preview).

1. An apparatus comprising: a computing device to multiply a first matrix and a second matrix to obtain a resulting matrix; and a non-transitory computer-readable medium storing instructions, the instructions when executed by a processor cause the processor to: detect a location and a magnitude of an error in the resulting matrix by performing a number of multiplication computations, wherein the number of multiplication computations is fewer than a number of multiplication computations involved in multiplying the first matrix and the second matrix; and correct the error in the resulting matrix based on the location and the magnitude of the error.

2. The apparatus of claim 1, wherein the location and the magnitude of the error is detected by: calculating a plurality of expected results for a plurality of items of the resulting matrix based on a corresponding plurality of items of the first matrix and a corresponding plurality of items of the second matrix; calculating a plurality of actual results for the plurality of items of the resulting matrix based on the resulting matrix; and detecting the location and magnitude of the error responsive to an expected result in the plurality of expected results differing from a corresponding actual result in the plurality of actual results.

3. The apparatus of claim 2, wherein the magnitude of the error is determined to be a difference between the expected result in the plurality of expected results and the corresponding actual result in the plurality of actual results.

4. The apparatus of claim 2, wherein the location of the error is detected by detecting a column of the error and detecting a row of the error based on the expected results and the actual results.

5. The apparatus of claim 1, wherein the instructions cause the processor to: monitor an error rate associated with the error; and responsive to the error rate being above a predefined threshold, generate a notification to change the computing device.

6. The apparatus of claim 1, wherein the instructions cause the processor to: determine a computing unit of the computing device producing the error; and increase voltage input into the computing unit.

7. The apparatus of claim 1, wherein the instructions cause the processor to: monitor an error rate associated with the error; and responsive to the error rate being above a predefined threshold, dynamically adjust a voltage input into the computing device.

8. The apparatus of claim 1, wherein the instructions cause the processor to: cause the computing device to repeat multiplication of the first matrix and the second matrix responsive to detection of the error.

9. The apparatus of claim 1, the computing device to, responsive to detection of the error: permute a first group of elements in the first matrix, and a second group of elements in the second matrix; multiply the permuted first matrix and the permuted second matrix to obtain a permuted resulting matrix; and permute a group of elements in the permuted resulting matrix to obtain the resulting matrix.

10. A method comprising: multiplying, by a computing device, a first matrix and a second matrix to obtain a resulting matrix; detecting a location and a magnitude of an error in the resulting matrix by performing a number of multiplication computations, wherein the number of multiplication computations is fewer than a number of multiplication computations required to multiply the first matrix and the second matrix; and correcting the error in the resulting matrix based on the location and the magnitude of the error.

11. The method of claim 10, wherein detecting the location and the magnitude of the error comprises: calculating a plurality of expected results for a plurality of items of the resulting matrix based on a corresponding plurality of items of the first matrix and a corresponding plurality of items of the second matrix; calculating a plurality of actual results for the plurality of items of the resulting matrix based on the resulting matrix; and detecting the location and magnitude of the error responsive to an expected result in the plurality of expected results differing from a corresponding actual result in the plurality of actual results.

12. The method of claim 11, wherein the magnitude of the error is determined to be a difference between the expected result in the plurality of expected results and the corresponding actual result in the plurality of actual results.

13. The method of claim 11, wherein detecting the location of the error comprises detecting a column of the error and detecting a row of the error based on the expected results and the actual results.

14. The method of claim 10, further comprising: monitoring an error rate associated with the error; and responsive to the error rate being above a predefined threshold, generating a notification to change the computing device.

15. The method of claim 10, further comprising: determining a computing unit of the computing device producing the error; and increasing voltage input into the computing unit.

16. The method of claim 10, further comprising: monitoring an error rate associated with the error; and responsive to the error rate being above a predefined threshold, dynamically adjusting a voltage input into the computing device.

17. The method of claim 10, further comprising: repeating multiplication of the first matrix and the second matrix responsive to detection of the error.

18. The method of claim 10, further comprising, responsive to detection of the error: permuting a first group of elements in the first matrix, and a second group of elements in the second matrix; multiplying the permuted first matrix and the permuted second matrix by the computing device to obtain a permuted resulting matrix; and permuting a group of elements in the permuted resulting matrix to obtain the resulting matrix.

19. An apparatus comprising: a computing device to multiply a first matrix and a second matrix to obtain a resulting matrix; and an error correcting circuit to detect a location and a magnitude of an error in the resulting matrix by performing a number of multiplication computations, wherein the number of multiplication computations is fewer than a number of multiplication computations involved in multiplying the first matrix and the second matrix, the error correcting circuit to correct the error in the resulting matrix based on the location and the magnitude of the error.

20. The apparatus of claim 19, wherein the error correcting circuit detects the location of the magnitude of the error by: calculating a plurality of expected results for a plurality of items of the resulting matrix based on a corresponding plurality of items of the first matrix and a corresponding plurality of items of the second matrix; calculating a plurality of actual results for the plurality of items of the resulting matrix based on the resulting matrix; and detecting the location and magnitude of the error responsive to an expected result in the plurality of expected results differing from a corresponding actual result in the plurality of actual results.
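
One way to read the row-and-column detection in the claims is as a comparison of expected and actual row sums and column sums of the resulting matrix. The sketch below is an illustration under stated assumptions, not the claimed implementation: it assumes at most one erroneous element, the names `matmul` and `locate_and_correct` and the `tol` parameter are hypothetical, and the magnitude is taken as actual minus expected. For an n×k by k×m product it uses n·k + k·m multiplications for the expected sums, compared with n·k·m for the full product.

```python
def matmul(a, b):
    """Plain triple-loop product of two matrices given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def locate_and_correct(a, b, c, tol=0):
    """Find and fix a single wrong element of c, where c should equal a @ b.

    Returns (row, col, magnitude), or None when every row and column sum
    matches. Modifies c in place. Assumes at most one erroneous element.
    """
    n, k, m = len(a), len(b), len(b[0])

    # Expected row sums of a @ b: a times the vector of row sums of b.
    b_row_sums = [sum(row) for row in b]
    exp_rows = [sum(a[i][t] * b_row_sums[t] for t in range(k)) for i in range(n)]

    # Expected column sums of a @ b: the vector of column sums of a, times b.
    a_col_sums = [sum(a[i][t] for i in range(n)) for t in range(k)]
    exp_cols = [sum(a_col_sums[t] * b[t][j] for t in range(k)) for j in range(m)]

    # Actual sums need additions only.
    act_rows = [sum(row) for row in c]
    act_cols = [sum(c[i][j] for i in range(n)) for j in range(m)]

    bad_rows = [i for i in range(n) if abs(act_rows[i] - exp_rows[i]) > tol]
    bad_cols = [j for j in range(m) if abs(act_cols[j] - exp_cols[j]) > tol]

    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("single-error assumption violated")

    i, j = bad_rows[0], bad_cols[0]
    magnitude = act_rows[i] - exp_rows[i]   # actual minus expected
    c[i][j] -= magnitude                    # undo the error in place
    return i, j, magnitude


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0, 2], [0, 1, 3], [4, 5, 6]]
good = matmul(A, B)
C = [row[:] for row in good]
C[2][1] += 7                          # simulate a fault at row 2, column 1
found = locate_and_correct(A, B, C)   # locates and repairs the element
again = locate_and_correct(A, B, C)   # nothing left to fix
```

When more than one element is wrong the mismatching row and column are no longer unique, so this sketch raises an error instead of guessing; repeating the multiplication, as some dependent claims describe, would be a natural fallback in that case.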

Assignees

Groq Inc

Inventors

Classifications

  • Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40) · CPC title

  • Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations {, e.g. using difunction pulse trains, STEELE computers, phase computers (conversion of digital data to or from non-denominational form H03M5/00, H03M7/00)} · CPC title

  • G06F7/48 (Primary)

    using non-contact-making devices, e.g. tube, solid state device; using unspecified devices · CPC title

  • Exception handling · CPC title

  • Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019332467A1 cover?
Introduced here is a technique to detect and/or correct errors in computation. The ability to correct errors in computation can increase the speed of the processor, reduce the power consumption of the processor, and reduce the distance between the transistors within the processor because the errors thus generated can be detected and corrected. In one embodiment, an error correcting module, runn…
Who is the assignee on this patent?
Groq Inc
What technology area does this patent fall under?
Primary CPC classification G06F7/48. Mapped technology areas include Physics.
When was this patent published?
It was published on Oct 31, 2019 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).