Adding cooperative file coloring protocols in a data deduplication system

US11048594B2 · US · B2

Patent metadata
Publication number: US-11048594-B2
Application number: US-201715791604-A
Country: US
Kind code: B2
Filing date: Oct 24, 2017
Priority date: Aug 21, 2013
Publication date: Jun 29, 2021
Grant date: Jun 29, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

For adding cooperative file coloring protocols in a data deduplication system using a processor device in a computing environment, a preferred character is represented for file coloring in a file using a code selected from a multiplicity of codes that represent a variety of contexts. The original meaning of the preferred character is retained when representing the preferred character for the file coloring by the code selected from the multiplicity of codes. The file is deduplicated by the data deduplication system according to the file coloring that represents a source file of a backup application.
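The abstract's core idea, a preferred character replaced by a context-specific code and later restored to its original meaning, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and not taken from the patent: the `COLOR_CODES` table, the context names, the choice of single bytes as codes, and the function names. Picking the most used byte as the preferred character follows one of the options listed in the claims.

```python
from collections import Counter

# Hypothetical code table: one stand-in byte per context ("color").
# The patent does not give concrete code values; these are assumptions.
COLOR_CODES = {
    "server-a": 0x1C,
    "server-b": 0x1D,
    "owner-x": 0x1E,
}


def nth_most_used(data: bytes, n: int = 1) -> int:
    """Return the n-th most frequent byte value in data (n=1 is the most used)."""
    ranked = Counter(data).most_common()
    return ranked[n - 1][0]


def color_file(data: bytes, preferred: int, color: str) -> bytes:
    """Replace every instance of the preferred byte with the code for `color`."""
    code = COLOR_CODES[color]
    if code in data:
        # The code already occurs in the data, so it is unavailable here;
        # a caller would have to select an alternative code.
        raise ValueError("code already occurs in the data")
    return data.replace(bytes([preferred]), bytes([code]))


def detect_color(data: bytes):
    """Return the context whose code appears in data, or None if uncolored."""
    for color, code in COLOR_CODES.items():
        if code in data:
            return color
    return None


def restore_file(colored: bytes, preferred: int) -> bytes:
    """Remove the coloring by mapping the code back to the preferred byte."""
    color = detect_color(colored)
    if color is None:
        return colored
    return colored.replace(bytes([COLOR_CODES[color]]), bytes([preferred]))
```

In this sketch the coloring is lossless only because the chosen code is checked to be absent from the data before substitution; a real protocol between a backup application and a deduplication system would need its own rule for that.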

First claim

Opening claim text (preview).

What is claimed is:

1. A method for adding cooperative file coloring protocols in a data deduplication system using a processor device in a computing environment, comprising: representing, by the processor device of the data deduplication system, a preferred character for file coloring in a file of a backup application using a code selected from a plurality of codes that represent a plurality of contexts, the contexts including at least metadata, a file type, a file authorship, and a file ownership of the file; wherein an original meaning of the preferred character is retained when representing the preferred character for the file coloring by the code selected from the plurality of codes; and deduplicating the file by the data deduplication system according to the file coloring that represents a source file of the backup application, wherein the file coloring is used as a tie breaker for the deduplication operation when an input characteristic in input data matches an existing repository characteristic as a similarity search structure is searched for identifying similarity elements that most similarly matches the input data.

2. The method of claim 1, further including embedding the file coloring of data chunks of the file in at least one of a plurality of data streams, wherein the file coloring includes at least one of a plurality of shapes, a plurality of colors for one of a plurality of servers, a plurality of file owners, and a plurality of applications.

3. The method of claim 1, further including identifying similarities between the data chunks of the file using the file coloring.

4. The method of claim 1, further including setting as the preferred character one of a most used character, a second most used character, and an nth most used character.

5. The method of claim 1, further including performing one of: creating a class of related codes from the plurality of codes for file coloring, and selecting an alternative code from the plurality of codes if the selected code from the plurality of codes is unavailable.

6. The method of claim 1, further including performing one of: representing each instance of the preferred character in the file for the file coloring, wherein the cooperative file coloring protocols are established between backup applications and data deduplication systems, and restoring the preferred character to the original meaning by removing the file coloring.

7. The method of claim 1, further including identifying data chunks of the file sent across a plurality of data streams by the file coloring.

8. The method of claim 1, further including tagging the similarity elements generated from a deduplication operation and that are stored in the similarity search structure with the file coloring.

9. A system for adding cooperative file coloring protocols in a data deduplication system of a computing environment, the system comprising: the data deduplication system; a repository in the data deduplication system; a similarity search structure in association with the repository and the data deduplication system; and at least one processor device operable in the computing environment for controlling the data deduplication system, wherein the at least one processor device: represents, by the processor device of the data deduplication system, a preferred character for file coloring in a file of a backup application using a code selected from a plurality of codes that represent a plurality of contexts, the contexts including at least metadata, a file type, a file authorship, and a file ownership of the file; wherein an original meaning of the preferred character is retained when representing the preferred character for the file coloring by the code selected from the plurality of codes, and deduplicates the file by the data deduplication system according to the file coloring that represents a source file of the backup application, wherein the file coloring is used as a tie breaker for the deduplication operation when an input characteristic in input data matches an existing repository characteristic as a similarity search structure is searched for identifying similarity elements that most similarly matches the input data.

10. The system of claim 9, wherein the at least one processor device embeds the file coloring of data chunks of the file in at least one of a plurality of data streams, wherein the file coloring includes at least one of a plurality of shapes, a plurality of colors for one of a plurality of servers, a plurality of file owners, and a plurality of applications.

11. The system of claim 9, wherein the at least one processor device identifies similarities between the data chunks of the file using the file coloring.

12. The system of claim 9, wherein the at least one processor device sets as the preferred character one of a most used character, a second most used character, and an nth most used character.

13. The system of claim 9, wherein the at least one processor device performs one of: creating a class of related codes from the plurality of codes for file coloring, and selecting an alternative code from the plurality of codes if the selected code from the plurality of codes is unavailable.

14. The system of claim 9, wherein the at least one processor device performs one of: representing each instance of the preferred character in the file for the file coloring, wherein the cooperative file coloring protocols are established between backup applications and data deduplication systems, and restoring the preferred character to the original meaning by removing the file coloring.

15. The system of claim 9, wherein the at least one processor device identifies data chunks of the file sent across a plurality of data streams by the file coloring.

16. The system of claim 9, wherein the at least one processor device tags the similarity elements generated from a deduplication operation and that are stored in the similarity search structure with the file coloring.

17. A computer program product for adding cooperative file coloring protocols in a data deduplication using a processor device in a computing environment, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that represents, by the processor device of the data deduplication system, a preferred character for file coloring in a file of a backup application using a code selected from a plurality of codes that represent a plurality of contexts, the contexts including at least metadata, a file type, a file authorship, and a file ownership of the file; wherein an original meaning of the preferred character is retained when representing the preferred character for the file coloring by the code selected from the plurality of codes; and an executable portion that deduplicates the file by the data deduplication system according to the file coloring that represents a source file of the backup application, wherein the file coloring is used as a tie breaker for the deduplication operation when an input characteristic in input data matches an existing repository characteristic as a similarity search structure is searched for identifying similarity elements that most similarly matches the input data.

18. The computer program product of claim 17, further including an executable portion that embeds the file coloring of data chunks of the file in at least one of a plurality of data streams, wherein
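
The independent claims use the file coloring as a tie breaker when the similarity search finds repository characteristics that match the input, and the dependent claims tag stored similarity elements with the coloring. The following Python sketch shows one way such a tie break could work. It is an illustration under stated assumptions, not the patented implementation: the `SimilarityElement` fields, the notion of a repository `region`, and the scoring rule (count of matching characteristics, then color agreement) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SimilarityElement:
    region: str          # hypothetical: repository area the element points to
    characteristic: int  # hypothetical: similarity characteristic value
    color: str           # file coloring tag stored with the element


def best_region(input_characteristics, input_color, index):
    """Return the region matching the most input characteristics.

    When several regions match equally well, prefer the one whose
    color tag equals the color of the input data.
    """
    hits = defaultdict(int)
    colors = {}
    for element in index:
        if element.characteristic in input_characteristics:
            hits[element.region] += 1
            colors[element.region] = element.color
    if not hits:
        return None
    # Rank by match count first; color agreement only decides ties.
    return max(hits, key=lambda r: (hits[r], colors[r] == input_color))


index = [
    SimilarityElement("r1", 101, "server-a"),
    SimilarityElement("r1", 102, "server-a"),
    SimilarityElement("r2", 101, "server-b"),
    SimilarityElement("r2", 102, "server-b"),
    SimilarityElement("r3", 103, "server-b"),
]

# Regions r1 and r2 both match two characteristics; the color decides.
print(best_region({101, 102}, "server-b", index))
```

Because the color is the second component of the ranking key, it never overrides a region that matches more characteristics; it only separates candidates that are otherwise equal.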

Assignees

IBM

Inventors

Classifications

  • G06F11/1453 · using de-duplication of the data (CPC title)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11048594B2 cover?
It covers adding cooperative file coloring protocols in a data deduplication system. A preferred character in a file is represented, for file coloring, by a code selected from a multiplicity of codes that represent a variety of contexts, and the character's original meaning is retained. The file is then deduplicated according to the file coloring, which represents a source file of a backup application.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F11/1453. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 29, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).