Hashing using data parallel instructions

US10491377B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10491377-B2
Application numberUS-201715444663-A
CountryUS
Kind codeB2
Filing dateFeb 28, 2017
Priority dateFeb 28, 2017
Publication dateNov 26, 2019
Grant dateNov 26, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods generate reasonably secure hash values at relatively few CPU cycles per byte. An example method includes, for each of a plurality of packets, injecting the packet into an internal state that represents an internal hash sum, mixing the internal state using multiplication, and shuffling the result of the multiplication so that bytes with highest quality are moved to locations that will propagate most widely in a next multiplication operation. Each of the plurality of packets include data from an input to be hashed. In some implementation, a last packet for the input is padded. The method may also include further mixing the internal state using multiplication after processing the plurality of packets and providing, to a requesting process, a portion of the final internal state as a hash of the input.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: for each of a plurality of packets, each packet including data from an input to be hashed: injecting the packet into an internal state that represents an internal hash sum, mixing the internal state using multiplication, and shuffling a result of the multiplication so that bytes with highest quality are moved to locations that will propagate most widely in a next multiplication operation; further mixing the internal state using multiplication after processing the plurality of packets; and providing, to a requesting process, a portion of the further mixed internal state as a hash of the input. 2. The method of claim 1 , wherein further mixing the internal state using multiplication includes: injecting a packet-sized portion of the internal state into the internal state; mixing the internal state using multiplication; and shuffling the result of the multiplication so that bytes with highest quality are permuted to locations that will propagate most widely in a next multiplication operation. 3. The method of claim 2 , wherein further mixing the internal state using multiplication includes repeating the injecting, mixing, and shuffling at least two additional times. 4. The method of claim 2 , wherein further mixing the internal state using multiplication also includes permuting the packet-sized portion of the internal state after shuffling the result. 5. The method of claim 1 , where the shuffling interleaves neighboring vector lanes of a single instruction, multiple data (SIMD) instruction so that at least some highest quality bytes from a first lane of the neighboring lanes and at least some highest quality bytes from a second lane of the neighboring lanes are moved to a location in the first lane. 6. The method of claim 1 , further comprising finalizing the internal state by reducing the size of the internal state by adding four equal-sized portions of the internal state together. 7. The method of claim 1 , the method further comprising: determining that the packet in not a multiple of 32 bytes; and determining a quantity of bytes to be padded; and injecting the quantity with the packet into the internal state. 8. A non-transitory computer-readable medium having code segments stored thereon, the code segments, when executed by a processor cause the processor to: for each of a plurality of packets, each packet including data from an input to be hashed: injecting the packet into an internal state that represents an internal hash sum, mixing the internal state using multiplication, and shuffling a result of the multiplication so that bytes with highest quality are moved to locations that will propagate most widely in a next multiplication operation; further mixing the internal state using multiplication after processing the plurality of packets; and providing, to a requesting process, a portion of the further mixed internal state as a hash of the input. 9. A computer system comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, causes the computer system to perform operations including: reading an input file into an input buffer; blocking the input buffer into packets; for each of the packets, updating an internal state using the packet by: injecting the packet into the internal state, mixing the internal state using multiplication, and shuffling a result of the multiplication so that bits with highest quality are permuted to locations that will propagate most widely in a next multiplication operation; updating the internal state with a packet-sized portion of the internal state by: injecting the packet-sized portion into the internal state, mixing the internal state using multiplication, and shuffling a result of the multiplication so that bytes with highest quality are permuted to locations that will propagate most widely in a next multiplication operation; finalizing the internal state; and storing a portion of the finalized internal state as a hash of the input file. 10. The system of claim 9 , where the mixing by multiplication is performed on 256 bits in parallel using four Single Instruction Multiple Data (SIMD) vector lanes. 11. The system of claim 10 , where the shuffling causes bits with higher quality to be moved to least significant byte positions of a SIMD vector lane and remaining bytes to be moved to most significant bytes positions of the SIMD vector lane. 12. The system of claim 10 , where the shuffling interleaves neighboring vector lanes so that at least some highest quality bytes from a first lane of the neighboring lanes and at least some highest quality bytes from a second lane of the neighboring lanes are moved to a location in the first lane. 13. The system of claim 9 , wherein an optimization algorithm controls the shuffling. 14. The system of claim 9 , where finalizing the internal state includes: reducing a size of the internal state by four by adding together four equal-sized portions of the internal state. 15. The system of claim 9 , wherein the portion stored as the hash is the entire final state. 16. The system of claim 9 , wherein the portion stored as the hash is a portion in a lower lane of a 4-lane Single Instruction Multiple Data (SIMD) processor. 17. The system of claim 9 , wherein the instructions cause the computer system to perform further operations including: determining that the input is not a multiple of 32 bytes; determining a quantity of bytes to be padded; and injecting the quantity with the packet into the internal state. 18. The system of claim 9 , wherein the instructions cause the computer system to perform further operations including: repeating the updating of the internal state with a packet-sized portion at least two additional times. 19. The system of claim 9 , wherein the instructions cause the computer system to perform further operations including, when the input file exceeds a size of the input buffer: repeating the reading, blocking, and updating of the internal state with each of the packets until reaching an end of the input file, wherein updating the internal state with the packet-sized portion occurs after reaching the end of the input file. 20. The system of claim 9 , wherein when the input is not a multiple of 32 bytes, the instructions cause the computer system to perform further operations including: determining a value of a byte at a particular position; and setting a value of each of the bytes to be padded to the value of the byte at the particular position.

Assignees

Inventors

Classifications

  • H04L9/0643Primary

    Hash functions, e.g. MD5, SHA, HMAC or f9 MAC · CPC title

  • Pseudo-random number generators · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10491377B2 cover?
Systems and methods generate reasonably secure hash values at relatively few CPU cycles per byte. An example method includes, for each of a plurality of packets, injecting the packet into an internal state that represents an internal hash sum, mixing the internal state using multiplication, and shuffling the result of the multiplication so that bytes with highest quality are moved to locations …
Who is the assignee on this patent?
Google Inc, Google Llc
What technology area does this patent fall under?
Primary CPC classification H04L9/0643. Mapped technology areas include Electricity.
When was this patent published?
Publication date Tue Nov 26 2019 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).