Data detection using intelligent sampling

US12536323B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12536323-B2
Application number	US-202318467484-A
Country	US
Kind code	B2
Filing date	Sep 14, 2023
Priority date	Sep 14, 2023
Publication date	Jan 27, 2026
Grant date	Jan 27, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method includes determining that a specific type of information is to be identified in a set of data. The method further includes sampling the set of data according to various sampling criteria to identify the specified type of information. The sampling criteria include at least a recency criterion indicating that the data to be sampled has been updated within a specified timeframe and a lineage criterion indicating that the data to be sampled is within a maximum hierarchical distance from a source data structure. The method also includes identifying, from the data that was sampled according to the sampling criteria, one or more data structures that include the specified type of information. The method further includes applying security policies to the identified data structures based on the type of information that was identified in the set of data. Various other methods, systems, and computer-readable media are also disclosed.
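The abstract's pipeline (filter by recency and lineage, detect the target information, then apply a policy) can be illustrated with a minimal sketch. This is not the patented implementation: the `Dataset` record, the `meets_criteria` and `detect_and_label` helpers, the email regex used as a stand-in detector, and the "restricted" label are all hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Dataset:
    """Hypothetical record for one table or file in a data hierarchy."""
    name: str
    last_updated: datetime
    depth_from_source: int  # 0 means the source data structure itself
    rows: List[str]


# Toy detector for one kind of personal data (email addresses).
# An assumption for this sketch; the abstract does not specify a detector.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def meets_criteria(ds: Dataset, now: datetime,
                   max_age: timedelta, max_depth: int) -> bool:
    """Recency criterion: updated within max_age of now.
    Lineage criterion: at most max_depth levels from the source."""
    recent = now - ds.last_updated <= max_age
    close = ds.depth_from_source <= max_depth
    return recent and close


def detect_and_label(datasets: List[Dataset], now: datetime,
                     max_age: timedelta, max_depth: int) -> Dict[str, str]:
    """Scan only datasets that pass both criteria and label the ones
    where the detector finds a match."""
    labels: Dict[str, str] = {}
    for ds in datasets:
        if not meets_criteria(ds, now, max_age, max_depth):
            continue  # skipped entirely, never scanned
        if any(EMAIL.search(row) for row in ds.rows):
            labels[ds.name] = "restricted"  # stand-in for a security policy
    return labels
```

In this sketch a stale dataset or one far from the source is never scanned, which is the cost saving the sampling criteria are meant to provide; how the real system selects rows within an eligible dataset is not shown here.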

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: determining that a specific type of information is to be identified in a set of data comprising a hierarchical structure; sampling the set of data according to one or more sampling criteria to identify the specified type of information, the sampling criteria including at least: a recency criterion specifying a timeframe, in which data has been updated, from which the data is to be sampled; and a lineage criterion specifying a maximum hierarchical distance, from a source data structure within the hierarchical structure, that the data is to be sampled within; from the data that was sampled according to the sampling criteria, identifying one or more data structures that include the specified type of information; and applying, to the identified data structures, one or more security policies that transform the identified data structures from a less secure state to a more secure state.

2. The computer-implemented method of claim 1, wherein the data structures that include the specified type of information are further classified according to one or more data classification rules.

3. The computer-implemented method of claim 2, wherein the data classification rules further define which data structures qualify as including the specified type of information.

4. The computer-implemented method of claim 2, wherein the data classification rules filter the data structures that include the specified type of information into one or more groups that include subtypes of the specified type of information.

5. The computer-implemented method of claim 2, wherein one or more of the data classification rules are defined by a user.

6. The computer-implemented method of claim 1, wherein the data set is randomly sampled according to at least the recency criterion and the lineage criterion until a statistically significant number of samples have been taken from the set of data.

7. The computer-implemented method of claim 1, wherein the one or more security policies comprise a policy to: encrypt the identified data structures; restrict access to the identified data structures; relocate the identified data structures; apply a label to the identified data structures; or quarantine the identified data structures.

8. The computer-implemented method of claim 1, wherein the lineage criterion indicates a relative importance of sampling the set of data.

9. The computer-implemented method of claim 8, wherein data that is hierarchically closer to the source data structure has a higher relative importance, and wherein data that is hierarchically further from the source data structure has a lower relative importance.

10. The computer-implemented method of claim 1, further comprising providing a recommendation to an owner or manager of the identified data structures indicating which data structures are identified as including the specified type of information.

11. The computer-implemented method of claim 1, wherein sampling is avoided for datasets that are outside of the specified timeframe.

12. The computer-implemented method of claim 1, wherein the specified type of information comprises personally identifiable information.

13. A system comprising: at least one physical processor; and physical memory comprising computer-executable instructions that, when executed by the physical processor, cause the physical processor to: determine that a specific type of information is to be identified in a set of data comprising a hierarchical structure; sample the set of data according to one or more sampling criteria to identify the specified type of information, the sampling criteria including at least: a recency criterion specifying a timeframe, in which data has been updated, from which the data is to be sampled; and a lineage criterion specifying a maximum hierarchical distance, from a source data structure within the hierarchical structure, that the data is to be sampled within; from the data that was sampled according to the sampling criteria, identify one or more data structures that include the specified type of information; and apply, to the identified data structures, one or more security policies that transform the identified data structures from a less secure state to a more secure state.

14. The system of claim 13, wherein the lineage criterion is given higher weighting during the sampling, such that source data structures are prioritized when performing the sampling.

15. The system of claim 13, wherein identifying the one or more data structures that include the specified type of information comprises identifying at least one new subtype of the specified type of information.

16. The system of claim 15, wherein the at least one new subtype of the specified type of information is implemented as feedback when identifying other instances of the specified type of information.

17. The system of claim 16, wherein the feedback includes a mapping between the at least one newly identified subtype and the sampled data.

18. The system of claim 17, wherein one or more classification rules are automatically generated based on the mapping.

19. The system of claim 18, wherein the automatically generated classification rules are refined over time as new subtypes of the specified type of information are identified in the set of data or in other sets of data.

20. A non-transitory computer-readable medium comprising one or more computer-executable instructions that, when executed by at least one processor of a computing device, cause the computing device to: determine that a specific type of information is to be identified in a set of data comprising a hierarchical structure; sample the set of data according to one or more sampling criteria to identify the specified type of information, the sampling criteria including at least: a recency criterion specifying a timeframe, in which data has been updated, from which the data is to be sampled; and a lineage criterion specifying a maximum hierarchical distance, from a source data structure within the hierarchical structure, that the data is to be sampled within; from the data that was sampled according to the sampling criteria, identify one or more data structures that include the specified type of information; and apply, to the identified data structures, one or more security policies that transform the identified data structures from a less secure state to a more secure state.
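Claims 6, 9, and 14 describe random sampling in which data closer to the source data structure carries higher relative importance. One way to picture that idea is a weighted random draw, sketched below. The claims do not state a weighting formula or a stopping rule, so the inverse-depth weight in `lineage_weight`, the fixed sample budget `k` (standing in for "a statistically significant number"), and every name in the block are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def lineage_weight(depth: int) -> float:
    """Assumed weighting: the source (depth 0) gets weight 1.0 and the
    weight shrinks as hierarchical distance grows."""
    return 1.0 / (1 + depth)


def weighted_sample(candidates: List[Tuple[str, int]], max_depth: int,
                    k: int, seed: int = 0) -> List[str]:
    """Draw k names, with replacement, from (name, depth) candidates.

    Candidates beyond max_depth are excluded outright (the lineage
    criterion); the rest are drawn in proportion to lineage_weight.
    """
    in_range = [(name, depth) for name, depth in candidates
                if depth <= max_depth]
    if not in_range:
        return []
    rng = random.Random(seed)  # seeded so runs are repeatable
    names = [name for name, _ in in_range]
    weights = [lineage_weight(depth) for _, depth in in_range]
    return rng.choices(names, weights=weights, k=k)
```

With candidates at depths 0, 1, and 2, the source is drawn about twice as often as its direct child in expectation, so detection effort concentrates where the claims say the relative importance is highest. A real system would presumably replace the fixed `k` with a statistical stopping rule.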

Assignees

Netflix Inc

Inventors

Classifications

G06F21/6245 (Primary)
Protecting personal data, e.g. for financial or medical purposes · CPC title
G06N5/025
Extracting rules from data · CPC title
G06F16/23
Updating · CPC title
G06F21/60
Protecting data · CPC title
H04W12/02
Protecting privacy or anonymity, e.g. protecting personally identifiable information [PII] · CPC title

Patent family

Related publications grouped by family.

Patent family 92899949

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12536323B2 cover?: A computer-implemented method includes determining that a specific type of information is to be identified in a set of data. The method further includes sampling the set of data according to various sampling criteria to identify the specified type of information. The sampling criteria include at least a recency criterion indicating that the data to be sampled has been updated within a specified…
Who is the assignee on this patent?: Netflix Inc
What technology area does this patent fall under?: Primary CPC classification G06F21/6245. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 27, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Scanning for information according to scan objectives

Sensitive data classification

Computing platform security methods and apparatus
