Systems and methods of determining compromised identity information
US-10599872-B2 · Mar 24, 2020 · US
US11928245B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11928245-B2 |
| Application number | US-202318097117-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 13, 2023 |
| Priority date | Dec 4, 2015 |
| Publication date | Mar 12, 2024 |
| Grant date | Mar 12, 2024 |
Official abstract text for this publication.
A compromised data exchange system extracts data from websites using a crawler, detects portions within the extracted data that resemble personally identifying information (PII) data based on PII data patterns using a risk assessment module, and compares a detected portion to data within a database of disassociated compromised PII data to determine a match using the risk assessment module. A risk score may be assigned to a data item within the database in response to determining the match. In some embodiments, URL data may also be detected in the extracted data. The detected URL data represents further websites that can be automatically crawled by the system to detect further PII data.
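The detection flow in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the two regular expressions stand in for the unspecified "PII data patterns", SHA-256 fingerprints stand in for however the database of disassociated compromised PII is actually stored, and the function names and the two score values are hypothetical. It is not the patented implementation.

```python
import hashlib
import re

# Hypothetical patterns; the patent only refers to "PII data patterns".
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}
# Simplified URL pattern used to find further websites to crawl.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def fingerprint(value: str) -> str:
    """Hash a value so it can be compared without storing it in the clear (assumed design)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def detect_pii(text: str) -> list[tuple[str, str]]:
    """Return (kind, value) pairs for portions of text that resemble PII."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group()))
    return hits


def detect_urls(text: str) -> list[str]:
    """Return URL-like strings that could seed a further crawl."""
    return URL_PATTERN.findall(text)


def score_matches(text, compromised, base=0.1, matched=0.9):
    """Assign a higher score to detected items whose fingerprint is in the compromised set."""
    scores = {}
    for kind, value in detect_pii(text):
        is_match = fingerprint(value) in compromised
        scores[(kind, value)] = matched if is_match else base
    return scores
```

In this sketch, `detect_urls` plays the role of the URL detection step, and its output would be handed back to the crawler as additional sites to visit.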
Opening claim text (preview).
What is claimed is:

1. A compromised data exchange system, comprising: a network interface; one or more processors; and a memory coupled with the one or more processors, the memory storing instructions thereon that, when executed, cause the one or more processors to: identify an initial set of one of more websites that are to be accessed and scraped; access a first website of the initial set of one or more websites; extract data from each web site of the initial set of one of more websites; analyze the extracted data to identity PII data; compare the PII data to compromised PII data; identify potential URL data for an additional set of one or more websites; determine a number of sets of PII data that are identified at each website of the initial set of one or more websites; and rank each website within initial set based on the number of sets of PII data for each website.

2. The compromised data exchange system of claim 1, wherein the instructions further cause the one or more processors to: selectively assign a risk score to a data item within the extracted data using a risk scoring module based on a result of comparing the PII data to compromised PII data.

3. The compromised data exchange system of claim 1, wherein: identifying the potential URL data comprises using pattern analytics to recognize one or more domain names.

4. The compromised data exchange system of claim 1, wherein: data is extracted and analyzed from each website of the initial set of one or more websites as the respective web site is accessed.

5. The compromised data exchange system of claim 1, wherein: analyzing the extracted data comprises analyzing data from all websites of the initial set of one or more websites after the data from all websites of the initial set of one or more websites is extracted.

6. The compromised data exchange system of claim 1, wherein: extracting data from each website of the initial set of one of more websites comprises: extracting data from a first website of the initial set of one of more websites; determining that not all websites from initial set of one of more websites have been accessed; and extracting data from a next website of the initial set of one of more websites.

7. The compromised data exchange system of claim 1, wherein the instructions further cause the one or more processors to: store the rank of each website of the initial set of one of more websites.

8. A method of analyzing compromised data, comprising: identifying an initial set of one of more websites that are to be accessed and scraped; accessing a first website of the initial set of one or more websites; extracting data from each website of the initial set of one of more websites; analyzing the extracted data to identity PII data; comparing the PII data to compromised PII data; identifying potential URL data for an additional set of one or more websites; determining a number of sets of PII data that are identified at each website of the initial set of one or more websites; and ranking each website within initial set based on the number of sets of PII data for each website.

9. The method of analyzing compromised data of claim 8, further comprising: accessing each website from the additional set of one or more websites based on a priority rank of a respective website of the initial set of one or more websites from which a particular item of URL data was identified.

10. The method of analyzing compromised data of claim 9, further comprising: determining a number of sets of PII data that are identified at each website of the additional set of one or more websites; and re-ranking each website in the initial set of one or more websites and the additional set of one or more websites based on the number of sets of PII data for each website.

11. The method of analyzing compromised data of claim 8, further comprising: analysis of the extracted data websites is not performed for websites falling below a predetermined ranking.

12. The method of analyzing compromised data of claim 8, further comprising: selectively assigning a risk score to each data item within the extracted data using a risk scoring module based on a result of comparing the PII data to compromised PII data.

13. The method of analyzing compromised data of claim 12, further comprising: increasing the risk score for a particular data item in response to determining that the PII data of the particular data item matches an item of the compromised PII data.

14. The method of analyzing compromised data of claim 8, further comprising: disassociating elements of the compromised PII data from a compromised entity and from each other; unencrypting the disassociated elements of the compromised PII data; and re-encrypting the disassociated elements of the compromised PII data to produce re-encrypted PII data, wherein re-encrypting the PII data includes independently encrypting each data filed using a different encryption key.

15. A non-transitory computer-readable medium having instructions stored thereon that, when executed by one or more processors, cause the one or more processors to: identify an initial set of one of more websites that are to be accessed and scraped; access a first website of the initial set of one or more websites; extract data from each website of the initial set of one of more websites; analyze the extracted data to identity PII data; compare the PII data to compromised PII data; identify potential URL data for an additional set of one or more websites; determine a number of sets of PII data that are identified at each website of the initial set of one or more websites; and rank each website within initial set based on the number of sets of PII data for each website.

16. The non-transitory computer-readable medium of claim 15, wherein the instructions further cause the one or more processors to: selectively assign a risk score to a data item associated with the extracted data using a risk scoring module based on a result of comparing the PII data to compromised PII data, wherein the risk score reflects a probability that an element of the compromised PII data may be misused.

17. The non-transitory computer-readable medium of claim 15, wherein the instructions further cause the one or more processors to: disassociate elements within the compromised PII data from a compromised entity and from each other.

18. The non-transitory computer-readable medium of claim 15, wherein the instructions further cause the one or more processors to: select at least one script configured to interact with a selected website of the initial set of one or more websites; render the at least one script using a rendering engine to access the selected website of the initial set of one or more websites to extract data; and index and store the extracted data in a database of scraped data.

19. The non-transitory computer-readable medium of claim 15, wherein the instructions further cause the one or more processors to: store identified potential URL data found in each website of the initial set of one or more websites.

20. The non-transitory computer-readable medium of claim 15, wherein: analyzing the extracted data comprises processing the extracted data to identify portions that include patterns of numbers resembling at least one of a social security number, a phone number, a birth date, a driver's license number, and an account number; and flagging the identified portions for further processing.
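The ranking steps recited in claims 1, 9, and 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the claims do not say how ties are broken or how the counts are stored, so the alphabetical tie-break, the dictionary inputs, and the function names `rank_sites` and `order_additional_sites` are all hypothetical.

```python
def rank_sites(pii_counts: dict[str, int]) -> list[str]:
    """Order sites from most to fewest PII sets found.

    Ties are broken alphabetically (an assumption; the claims are silent on ties).
    """
    return sorted(pii_counts, key=lambda site: (-pii_counts[site], site))


def order_additional_sites(discovered: dict[str, list[str]],
                           ranking: list[str]) -> list[str]:
    """Order newly discovered URLs by the rank of the site they were found on.

    `discovered` maps a source site to the URLs identified in its extracted data.
    URLs already in the ranking, and duplicates, are skipped.
    """
    position = {site: i for i, site in enumerate(ranking)}
    ordered: list[str] = []
    # Visit source sites from highest rank to lowest; unranked sources go last.
    for source in sorted(discovered, key=lambda s: position.get(s, len(position))):
        for url in discovered[source]:
            if url not in ordered and url not in position:
                ordered.append(url)
    return ordered
```

Re-ranking as in claim 10 would, under the same assumptions, amount to merging the PII counts for the additional sites into the original dictionary and calling `rank_sites` again on the combined counts.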
during internet communication, e.g. revealing personal data from cookies · CPC title