Systems and methods for securing data based on discovered relationships
US-2020125746-A1 · Apr 23, 2020 · US
US11664998B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11664998-B2 |
| Application number | US-202016884728-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 27, 2020 |
| Priority date | May 27, 2020 |
| Publication date | May 30, 2023 |
| Grant date | May 30, 2023 |
Official abstract text for this publication.
Described are techniques for preserving data security for sensitive information. The techniques include identifying sensitive information in first audio data from a first client device. The techniques further comprise generating second audio data including hashed sensitive information, where the hashed sensitive information comprises an audio clip that replaces the sensitive information and that is based on the sensitive information. The techniques further comprise transmitting the second audio data including the hashed sensitive information to a second client device. The techniques further comprise receiving third audio data including the hashed sensitive information from the second client device. The techniques further comprise generating fourth audio data by replacing the hashed sensitive information with the sensitive information and transmitting the fourth audio data including the sensitive information to the first client device.
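The round trip described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not in the source: text tokens stand in for audio segments, the placeholder is a truncated SHA-256 digest (the abstract does not name a hash function), and the names `make_placeholder`, `mask`, and `unmask` are hypothetical. It does not attempt to produce an audio clip or to preserve linguistic characteristics.

```python
import hashlib


def make_placeholder(sensitive: str, salt: bytes) -> str:
    """Derive a stand-in token from the sensitive text (hypothetical scheme)."""
    digest = hashlib.sha256(salt + sensitive.encode("utf-8")).hexdigest()
    return f"<SI:{digest[:12]}>"


def mask(segments, sensitive_terms, salt, mapping):
    """Replace sensitive segments with placeholders and record each pair."""
    masked = []
    for seg in segments:
        if seg in sensitive_terms:
            token = make_placeholder(seg, salt)
            mapping[token] = seg  # correspondence kept on the trusted side only
            masked.append(token)
        else:
            masked.append(seg)
    return masked


def unmask(segments, mapping):
    """Swap any known placeholder back to the original sensitive segment."""
    return [mapping.get(seg, seg) for seg in segments]


# "First audio data" from the first client device (text stand-in).
first = ["my", "account", "number", "is", "12345678"]
mapping = {}

# "Second audio data": what the second client device receives.
second = mask(first, {"12345678"}, b"demo-salt", mapping)

# "Third audio data": the second device replies, echoing the placeholder.
third = ["confirming"] + second[1:]

# "Fourth audio data": placeholder restored before returning to the first device.
fourth = unmask(third, mapping)
```

In this sketch the second device only ever sees the placeholder, while the mapping that allows restoration stays with the component that performed the masking.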
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: identifying sensitive information in first audio data from a first client device; generating second audio data including hashed sensitive information, wherein the hashed sensitive information comprises an audio clip that replaces the sensitive information, that is based on the sensitive information, and that retains linguistic characteristics of the sensitive information, wherein the linguistic characteristics are selected from a group consisting of: phonetic characteristics, syntactic characteristics, and semantic characteristics; transmitting the second audio data including the hashed sensitive information to a second client device; receiving third audio data including the hashed sensitive information from the second client device; generating fourth audio data by replacing the hashed sensitive information with the sensitive information; and transmitting the fourth audio data including the sensitive information to the first client device.
2. The method of claim 1, wherein identifying the sensitive information in the first audio data further comprises: comparing extracted portions of the first audio data to a sensitive information database; and classifying respective extracted portions matching a respective entry in the sensitive information database as the sensitive information.
3. The method of claim 1, wherein identifying the sensitive information in the first audio data further comprises: determining that an extracted portion of the first audio data does not match any record in a sensitive information database; generating a sensitivity score for the extracted portion of the first audio data in response to determining that the extracted portion of the first audio data does not match any record in the sensitive information database; determining that the sensitivity score satisfies a sensitivity score threshold; and classifying the extracted portion of the first audio data as the sensitive information.
4. The method of claim 3, wherein the sensitivity score is generated by a content sensitivity model that is trained using machine learning algorithms.
5. The method of claim 1, wherein generating the second audio data including the hashed sensitive information further comprises storing a correspondence between the sensitive information and the hashed sensitive information in a mapping table; and wherein generating fourth audio data by replacing the hashed sensitive information with the sensitive information further comprises matching the hashed sensitive information with the sensitive information based on the correspondence in the mapping table.
6. The method of claim 1, wherein the hashed sensitive information includes an indicator that identifies the hashed sensitive information as data with a sensitive information classification.
7. The method of claim 6, wherein the indicator further includes an explanation of the sensitive information classification, wherein the explanation relates to a match in a sensitive information database.
8. The method of claim 7, wherein the method further comprises: receiving feedback related to an accuracy of the sensitive information classification; and updating, based on the feedback, the sensitive information database.
9. The method of claim 6, wherein the indicator further includes an explanation of the sensitive information classification, wherein the explanation relates to a sensitivity score generated by a sensitivity score model above a sensitivity score threshold.
10. The method of claim 9, wherein the method further comprises: receiving feedback related to an accuracy of the sensitive information classification; and updating, based on the feedback, the sensitivity score model.
11. The computer-implemented method of claim 1, wherein the method is performed by a data security application according to software that is downloaded to the data security application from a remote data processing system.
12. The computer-implemented method of claim 11, wherein the method further comprises: metering a usage of the software; and generating an invoice based on metering the usage.
13. The method of claim 1, wherein the linguistic characteristics comprise the phonetic characteristics.
14. The method of claim 1, wherein the linguistic characteristics comprise the syntactic characteristics.
15. The method of claim 1, wherein the linguistic characteristics comprise the semantic characteristics.
16. A system comprising: one or more processors; and one or more computer-readable storage media storing program instructions which, when executed by the one or more processors, are configured to cause the one or more processors to perform a method comprising: identifying sensitive information in first audio data from a first client device; generating second audio data including hashed sensitive information, wherein the hashed sensitive information comprises an audio clip that replaces the sensitive information, that is based on the sensitive information, and that retains linguistic characteristics of the sensitive information, wherein the linguistic characteristics are selected from a group consisting of: phonetic characteristics, syntactic characteristics, and semantic characteristics; transmitting the second audio data including the hashed sensitive information to a second client device; receiving third audio data including the hashed sensitive information from the second client device; generating fourth audio data by replacing the hashed sensitive information with the sensitive information; and transmitting the fourth audio data including the sensitive information to the first client device.
17. The system of claim 16, wherein identifying the sensitive information in the first audio data further comprises: comparing extracted portions of the first audio data to a sensitive information database; and classifying respective extracted portions matching a respective entry in the sensitive information database as the sensitive information.
18. The system of claim 16, wherein identifying the sensitive information in the first audio data further comprises: determining that an extracted portion of the first audio data does not match any record in a sensitive information database; generating, in response to determining that the extracted portion of the first audio data does not match any record in the sensitive information database, a sensitivity score for the extracted portion of the first audio data based on inputting the extracted portion of the first audio data to a content sensitivity model that is trained using machine learning algorithms; determining that the sensitivity score satisfies a sensitivity score threshold; and classifying the extracted portion of the first audio data as the sensitive information.
19. The system of claim 16, wherein generating the second audio data including the hashed sensitive information further comprises storing a correspondence between the sensitive information and the hashed sensitive information in a mapping table; and wherein generating fourth audio data by replacing the hashed sensitive information with the sensitive information further comprises matching the hashed sensitive information with the sensitive information based on the correspondence in the mapping table.
20. A computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable
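
The two-stage identification recited in claims 2 to 4 (and 17 to 18) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: extracted portions are represented as transcribed text, `digit_ratio_score` is a hypothetical stand-in for a trained content sensitivity model, and "satisfies a threshold" is read here as greater than or equal to, which the claims do not specify.

```python
from typing import Callable, Set, Tuple


def digit_ratio_score(text: str) -> float:
    """Hypothetical stand-in for a trained content sensitivity model.

    Returns the fraction of characters that are digits, in [0.0, 1.0].
    """
    if not text:
        return 0.0
    return sum(ch.isdigit() for ch in text) / len(text)


def classify(
    portion: str,
    database: Set[str],
    score_fn: Callable[[str], float] = digit_ratio_score,
    threshold: float = 0.5,
) -> Tuple[bool, str]:
    """Return (is_sensitive, explanation) for one extracted portion."""
    # Stage 1: exact match against the sensitive information database.
    if portion in database:
        return True, "database match"
    # Stage 2: no match, so fall back to a sensitivity score and threshold.
    score = score_fn(portion)
    if score >= threshold:  # assumption: "satisfies" means >=
        return True, f"score {score:.2f} >= threshold {threshold:.2f}"
    return False, f"score {score:.2f} below threshold {threshold:.2f}"


known = {"Jane Doe"}
by_match = classify("Jane Doe", known)
by_score = classify("4111111111111111", known)
not_sensitive = classify("hello", known)
```

The explanation string returned alongside the decision loosely mirrors the indicator of claims 6, 7, and 9, which ties a classification either to a database match or to a score above the threshold.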
Machine learning · CPC title
Hash functions, e.g. MD5, SHA, HMAC or f9 MAC · CPC title
using cryptographic hash functions · CPC title