Selection of data compression technique based on input characteristics
US-2019207624-A1 · Jul 4, 2019 · US
US10680643B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10680643-B2 |
| Application number | US-201916297579-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 8, 2019 |
| Priority date | Dec 14, 2018 |
| Publication date | Jun 9, 2020 |
| Grant date | Jun 9, 2020 |
Official abstract text for this publication.
In connection with compression of an input stream, multiple portions of the input stream are searched against previously received portions of the input stream to find any matches of character strings in the previously received portions of the input stream. In some cases, matches of longer character strings, as opposed to shorter character strings, can be selected for inclusion in an encoded stream that is to be compressed. Delayed selection can occur whereby among multiple matches, a match that is longer can be selected for inclusion in the encoded stream and a non-selected character string match is reverted to a literal. A search engine that is searching an input stream to identify a repeat pattern of characters can cease to search for characters that were included in the selected character string match.
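The delayed selection described in the abstract resembles the lazy matching used in LZ77-style encoders. The following is a minimal Python sketch of that idea, not the patented implementation: the function names (`longest_match`, `encode_lazy`, `decode`), the brute-force history search, the one-position lookahead, and the minimum match length of 3 are all assumptions made for illustration.

```python
def longest_match(data: bytes, pos: int, min_len: int = 3):
    """Brute-force search of the history data[:pos] for the longest string
    that matches the input starting at pos. Returns (distance, length),
    or None when no match of at least min_len characters exists."""
    best_len, best_dist = 0, 0
    for start in range(pos):
        length = 0
        while pos + length < len(data) and data[start + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_dist = length, pos - start
    return (best_dist, best_len) if best_len >= min_len else None


def encode_lazy(data: bytes, min_len: int = 3):
    """Emit literals (ints) and matches ((distance, length) tuples).
    A match found at position i is held until the match at i + 1 is known;
    if the later match is longer, the held match is reverted to a literal."""
    tokens = []
    i = 0
    held = longest_match(data, i, min_len)
    while i < len(data):
        if held is None:
            tokens.append(data[i])            # no match here: emit a literal
            i += 1
            held = longest_match(data, i, min_len)
            continue
        nxt = longest_match(data, i + 1, min_len)
        if nxt is not None and nxt[1] > held[1]:
            tokens.append(data[i])            # held match reverts to a literal
            i += 1
            held = nxt                        # the longer match is now held
        else:
            tokens.append(held)               # held match is selected
            i += held[1]                      # no search inside the selected match
            held = longest_match(data, i, min_len)
    return tokens


def decode(tokens):
    """Rebuild the original bytes from literals and (distance, length) matches."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)
        else:
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)
```

For the hypothetical input `b"abc_bcde_abcde"`, a greedy encoder would take the 3-character match `abc` at position 9 and then emit `d` and `e` as literals. The sketch instead emits `a` as a literal and selects the 4-character match `bcde`, which saves one token.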
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a memory and at least one processor, the at least one processor to: receive an input stream comprising first, second, and third portions; search the third portion of the input stream to find a first match with the first portion of the input stream; indicate a size of the first match between the third portion and the first portion; search the third portion of the input stream to find a second match with the second portion of the input stream; indicate a size of the second match between the third portion and the second portion; and select a longer of the first match and the second match as a candidate for inclusion in an encoded stream.
2. The apparatus of claim 1, wherein the input stream comprises a fourth portion and the at least one processor is to: search the third portion of the input stream to attempt to find a third match with the fourth portion of the input stream; select the first match based on the first match being longer than the second match; and cause the attempt to find the third match to stop based on the fourth portion including a portion of the first match.
3. The apparatus of claim 1, wherein the input stream comprises a fourth portion and the at least one processor is to: search the third portion of the input stream to find a third match with the fourth portion of the input stream; select the first match based on the first match being longer than the second match; and cause the third match to be not considered for inclusion in the encoded stream based on the fourth portion including a portion of the first match.
4. The apparatus of claim 1, wherein: the first portion is associated with a first character and first position in the input stream; the second portion is associated with a second character and second position in the input stream; and the at least one processor is to: select the second match based on the second match being longer than the first match, provide the first character for inclusion in the encoded stream, and hold the second match for comparison with one or more other character string matches.
5. The apparatus of claim 1, wherein: the input stream comprises a fourth portion; the first portion is associated with a first character and first position in the input stream; the second portion is associated with a second character and second position in the input stream; the fourth portion is associated with a third character and third position in the input stream; and the at least one processor is to: search the third portion of the input stream to find a third match with the fourth portion of the input stream; select the third match based on the third match being a longest among the first match, the second match, and the third match; provide the first character for inclusion in the encoded stream; provide the second character for inclusion in the encoded stream; and hold the third match for comparison with one or more other character string matches.
6. The apparatus of claim 1, wherein: the input stream comprises a fourth portion; the first portion is associated with a first character and first position in the input stream; the second portion is associated with a second character and second position in the input stream; the fourth portion is associated with a third character and third position in the input stream; the at least one processor is to: search the third portion of the input stream to find a third match with the fourth portion of the input stream; select the third match based on a length of the third match being at least two longer than a length of the first match and longer than a length of the second match; provide the first character for inclusion in the encoded stream; provide the second character for inclusion in the encoded stream; and hold the third match for comparison with one or more other character string matches.
7. The apparatus of claim 1, wherein the encoded stream comprises an LZ77 compliant stream.
8. The apparatus of claim 1, wherein the at least one processor is to: compress the encoded stream based on properties of the encoded stream.
9. The apparatus of claim 1, comprising one or more of: a network interface, central processing unit, or an offload engine.
10. A method comprising: searching for a first match in a data stream with a first portion of data; searching for a second match in the data stream with a second portion of data; identifying the first match in the data stream with the first portion, wherein the first portion comprises multiple characters; and causing searching for the second match in the data stream to search a portion of the data stream beginning after the first portion.
11. The method of claim 10, comprising halting searching for the second match in the data stream with the second portion of data in response to identifying the first match in the data stream with the first portion.
12. The method of claim 10, comprising: determining whether to include the first match in an encoded stream based on a length of the first match and a length of at least one other match.
13. The method of claim 12, wherein the first match comprises multiple characters and the at least one other match comprises a literal and comprising: selecting the first match for inclusion in the encoded stream based on the first match being longer in length than the at least one other match.
14. The method of claim 12, wherein the first match comprises multiple characters and the at least one other match comprises a third match and a fourth match and comprising: the third match is associated with a first character and first position in the data stream and the third match comprises multiple characters; the fourth match is associated with a second character and second position in the data stream and the fourth match comprises multiple characters; providing the first character for inclusion in the encoded stream; providing the second character for inclusion in the encoded stream; and holding the first match for comparison with at least one other match.
15. The method of claim 10, comprising generating an encoded stream based on the first match and a compression scheme.
16. The method of claim 15, wherein the compression scheme comprises one or more of: LZ4, LZ4s, iLZ77, LZS, Zstandard, DEFLATE, Huffman coding, Snappy standards, or no compression.
17. A system comprising: a network interface; a memory; and at least one processor communicatively coupled to the network interface and the memory, the at least one processor to: receive a first match between a first character string and a portion of an input stream, receive a second match between a second character string and a second portion of the input stream, select a match from a longer of the first match and the second match, and compare the selected match with at least one other match to determine whether to include the selected match in an encoded stream.
18. The system of claim 17, wherein: the second match is selected based on being longer than the first match, the first match is associated with at least one character and a first position in the input stream, and the at least one processor is to provide the at least one character for inclusion in the encoded stream.
19. The system of claim 17, wherein the at least one processor is to discontinue search for a character string that overlaps with the selected match.
20. The system of 17, wherein the n
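One possible reading of the selection rules in claims 1, 5, and 6, together with the search cancellation in claims 2, 3, and 19, can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed apparatus. The names `select_match` and `resolve` are hypothetical. So is the rule that a candidate starting `k - j` positions later must be at least `k - j` characters longer to win, which generalizes the "at least two longer" condition of claim 6. The sketch also assumes at least one candidate is a real match.

```python
from typing import List, Tuple


def select_match(lengths: List[int]) -> int:
    """lengths[k] is the match length reported for the string that starts
    k positions after the current position (one entry per search unit).
    Assumed rule: a later candidate replaces the current best only if it is
    longer by at least the number of positions it starts later."""
    best = 0
    for k in range(1, len(lengths)):
        if lengths[k] - lengths[best] >= k - best:
            best = k
    return best


def resolve(data: bytes, pos: int, lengths: List[int]) -> Tuple[List[int], Tuple[int, int], List[int]]:
    """Return (literals, held_match, cancelled_positions) for one selection.
    literals: characters skipped before the selected match, emitted as-is.
    held_match: (start position, length), kept for comparison with later matches.
    cancelled_positions: start positions covered by the held match, where a
    search would overlap the selection and can therefore stop."""
    k = select_match(lengths)
    literals = list(data[pos:pos + k])
    start = pos + k
    held = (start, lengths[k])
    cancelled = list(range(start + 1, start + lengths[k]))
    return literals, held, cancelled
```

With hypothetical match lengths `[3, 4, 5]` at position 0 of `b"xyzabcde"`, the third candidate is two longer than the first and longer than the second, so it is held. The first two characters are emitted as literals, and searches starting at positions 3 through 6 are cancelled.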
Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code · CPC title
Context modeling · CPC title
employing a sliding window, e.g. LZ77 · CPC title
Conversion of the form of the representation of individual digits · CPC title
Pipelining · CPC title