System and method for detecting sensitivity content in time-series data

US10268836B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10268836-B2
Application number  US-201514618280-A
Country             US
Kind code           B2
Filing date         Feb 10, 2015
Priority date       Mar 14, 2014
Publication date    Apr 23, 2019
Grant date          Apr 23, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for detecting sensitivity content in time-series data is disclosed. The method comprises receiving the time-series data from a source. The data is received for one or more instances. The method further comprises detecting the sensitivity content in the time-series data. The sensitivity content indicates presence of an anomaly. The detecting comprises determining a kurtosis value corresponding to the time-series data. The detecting further comprises comparing the kurtosis value with a reference value. The detecting further comprises processing the data using a first filtering means or a second filtering means. The first filtering means is used when the data distribution of the time-series data is either of a platykurtic distribution or a mesokurtic distribution. The second filtering means is used when the data distribution of the time-series data is a leptokurtic distribution.
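The filter selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes the reference value is 3.0 (the kurtosis of a normal distribution), and the names `kurtosis`, `classify_distribution`, `choose_filter`, and the `tolerance` parameter are hypothetical helpers introduced here.

```python
from statistics import fmean

# Assumption: the reference value is 3.0, the kurtosis of a normal distribution.
# The abstract only says "a reference value" and does not state the number.
REFERENCE_KURTOSIS = 3.0


def kurtosis(values):
    """Pearson kurtosis: fourth central moment divided by squared variance."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)
    m4 = fmean((v - mean) ** 4 for v in values)
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 * m2)


def classify_distribution(values, reference=REFERENCE_KURTOSIS, tolerance=0.0):
    """Label the distribution by comparing its kurtosis with the reference.

    `tolerance` is a hypothetical knob: exact equality is rare with real
    sensor data, so a small band around the reference counts as mesokurtic.
    """
    k = kurtosis(values)
    if k < reference - tolerance:
        return "platykurtic"
    if k > reference + tolerance:
        return "leptokurtic"
    return "mesokurtic"


def choose_filter(values, reference=REFERENCE_KURTOSIS, tolerance=0.0):
    """Pick the filter named in the abstract for the detected distribution."""
    shape = classify_distribution(values, reference, tolerance)
    return "median-based Rosner" if shape == "leptokurtic" else "Hampel"


if __name__ == "__main__":
    flat = list(range(1, 11))   # roughly uniform readings, light tails
    spiky = [0] * 9 + [50]      # one extreme reading, heavy tail
    print(choose_filter(flat))
    print(choose_filter(spiky))
```

The flat series has a kurtosis below 3 and is routed to the Hampel filter, while the series with a single extreme reading has a kurtosis above 3 and is routed to the median-based Rosner filter.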

First claim

Opening claim text (preview).

We claim:

1. A method for detecting sensitivity content in time-series data, the method comprising: receiving, by a processor in a server, the time-series data from a source in real-time, wherein the time-series data is received for one or more instances, and wherein an instance of the one or more instances is associated with a value of the time-series data, and wherein the source comprises one or more sensors, wherein the one or more sensors measure the value of the time-series data and convert the value into one or more signals, wherein the one or more sensors include one or more wireless sensors to transmit the time-series data to the server in real-time; detecting, by the processor, the sensitivity content in the time-series data, wherein the sensitivity content indicates presence of an anomaly and the sensitivity content is defined as a minute statistical anomaly indicative of existence of private information in the time-series data, wherein the sensitivity content is detected for a value corresponding to each instance from the one or more instances in the time-series data, wherein the detecting comprises: determining a kurtosis value corresponding to each instance of the one or more instances associated with the value of the time-series data in a data distribution of the time-series data, wherein the value of the time-series data includes a plurality of time stamps associated with the one or more instances; comparing the kurtosis value with a reference value; determining a data distribution of the time-series data based upon the comparison, wherein the data distribution is one of a platykurtic distribution when the kurtosis value is less than the reference value, a mesokurtic distribution when the kurtosis value is equal to the reference value, and a leptokurtic distribution when the kurtosis value is greater than the reference value; processing the time-series data using a Hampel filter and a median-based Rosner filter, wherein the Hampel filter is used when the data distribution of the time-series data is either of the platykurtic distribution or the mesokurtic distribution, and wherein the median-based Rosner filter is used when the data distribution of the time-series data is the leptokurtic distribution, wherein the Hampel filter is used to minimize masking effect when detecting the sensitivity content by choosing a higher value for a breakdown point, and wherein the median-based Rosner filter is used, when a number of outliers is unknown, to provide optimal swamping breakdown point; identifying, by the processor, a density of the detected sensitivity content, wherein the density of the detected sensitivity content indicates presence of the anomaly in at least two successive instances of the one or more instances, wherein the Hampel filter and the median-based Rosner filter minimizes false positive and false negative alarm rates while detecting the sensitivity content in the time-series data.

2. The method of claim 1, further comprising alerting a user indicating the density identification when the at least two successive instances comprise the sensitivity content.

3. The method of claim 1, wherein the time-series data is processed using the Hampel filter by calculating a median and a median absolute deviation of the time-series data in order to reduce masking effect, wherein the higher breakdown point improves the accuracy of detecting the sensitivity content.

4. The method of claim 1, wherein the time-series data is processed using the median-based Rosner filter by approximating the time-series data to a student-t distribution and minimizing an error in detection of the sensitivity content while achieving optimal swamping effect.

5. A system for detecting sensitivity content in time-series data, the system implemented on a server comprising: a processor; a memory coupled to the processor, wherein the processor executes a plurality of modules stored in the memory, and wherein the plurality of modules comprising: a reception module to receive the time-series data from a source, wherein the data is received for one or more instances, and wherein an instance of the one or more instances is associated with a value of the time-series data, and wherein the source comprises one or more sensors, wherein the one or more sensors measure the value of the time-series data and convert the value into one or more signals, wherein the one or more sensors include one or more wireless sensors to transmit the time-series data to the server in real-time; a detection module to detect the sensitivity content in the time-series data, wherein the sensitivity content indicates presence of an anomaly and the sensitivity content is defined as a minute statistical anomaly indicative of existence of private information in the time-series data, wherein the sensitivity content is detected for a value corresponding to each instance from the one or more instances in the time-series data, wherein the detection further comprising: determining a kurtosis value corresponding to each instance of the one or more instances associated with the value of the time-series data in a data distribution of the time-series data, wherein the value of the time-series data includes a plurality of time stamps associated with the one or more instances; comparing the kurtosis value with a reference value; determining a data distribution of the time-series data based upon the comparison, wherein the data distribution is one of a platykurtic distribution when the kurtosis value is less than the reference value, a mesokurtic distribution when the kurtosis value is equal to the reference value, and a leptokurtic distribution when the kurtosis value is greater than the reference value; and processing the time-series data using a Hampel filter and a median-based Rosner filter, wherein the Hampel filter is used when the data distribution of the time-series data is either of the platykurtic distribution or the mesokurtic distribution, and wherein the median-based Rosner filter is used when the data distribution of the time-series data is the leptokurtic distribution, wherein the Hampel filter is used to minimize masking effect when detecting the sensitivity content by choosing a higher value for a breakdown point, and wherein the median-based Rosner filter is used, when a number of outliers is unknown, to provide optimal swamping breakdown point; and identifying a density of the detected sensitivity content, wherein the density of the detected sensitivity content indicates presence of the anomaly in at least two successive instances of the one or more instances, wherein the Hampel filter and the median-based Rosner filter minimizes false positive and false negative alarm rates while detecting the sensitivity content in the time-series data.

6. The system of claim 5, wherein the detection module further alerts a user indicating the density identification when the at least two successive instances comprise the sensitivity content.

7. The system of claim 5, wherein the detection module processes the time-series data using the Hampel filter by calculating a median and a median absolute deviation of the time-series data in order to reduce masking effect, wherein the higher breakdown point improves the accuracy of detecting the sensitivity content.

8. The system of claim 5, wherein detection module processes the time-series data using the median-based Rosner filter by approximating the time-series data to a student-t distribution and minimizing an error in detection of the sensitivity content while achieving optimal swamping effect.

9. A non-transitory computer readable medium embodying a program executable in a server for detecting sensitivity content in time-series data, the progra
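As a rough illustration of the Hampel branch (claims 3 and 7) and the density step (claims 2 and 6), the sketch below flags values that sit far from the median in units of the median absolute deviation (MAD), then reports runs of at least two successive flagged instances. It covers the Hampel branch only. The threshold of 3, the global (non-windowed) median, and the names `hampel_flags` and `dense_runs` are assumptions made for this example; the claims do not specify them.

```python
from statistics import median

# 1.4826 scales the MAD so that it estimates the standard deviation
# when the underlying data are normally distributed.
MAD_SCALE = 1.4826


def hampel_flags(values, threshold=3.0):
    """Flag values more than `threshold` scaled MADs away from the median.

    Assumption: the median and MAD are computed over the whole series
    rather than over a sliding window.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = MAD_SCALE * mad
    if scale == 0:
        # More than half the values equal the median: anything else stands out.
        return [v != med for v in values]
    return [abs(v - med) > threshold * scale for v in values]


def dense_runs(flags, min_run=2):
    """Return inclusive (start, end) index pairs for runs of successive flags.

    `min_run=2` mirrors the "at least two successive instances" wording.
    """
    runs = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(flags) - start >= min_run:
        runs.append((start, len(flags) - 1))
    return runs


if __name__ == "__main__":
    # Hypothetical meter readings, one per time instance.
    readings = [10, 11, 10, 12, 11, 40, 42, 10, 11, 12, 39, 10]
    flags = hampel_flags(readings)
    print([i for i, f in enumerate(flags) if f])  # flagged instances
    print(dense_runs(flags))                      # runs that would trigger an alert
```

In the sample readings the values at indices 5, 6, and 10 are flagged, but only indices 5 and 6 form a run of two successive instances, so only that run would count toward the density condition. Because the median and MAD are barely affected by a handful of extreme readings, a cluster of outliers cannot hide itself by inflating the spread estimate, which is the masking effect the claims refer to.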

Assignees

Tata Consultancy Services Ltd

Inventors

Classifications

  • for providing a confidential data exchange among entities communicating through data packet networks

  • Masking or blinding

  • Anonymization, e.g. involving pseudonyms

  • Protecting personal data, e.g. for financial or medical purposes

  • Remote reading of utility meters

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10268836B2 cover?
A system and method for detecting sensitivity content in time-series data is disclosed. The method comprises receiving the time-series data from a source. The data is received for one or more instances. The method further comprises detecting the sensitivity content in the time-series data. The sensitivity content indicates presence of an anomaly. The detecting comprises determining a kurtosis v…
Who is the assignee on this patent?
Tata Consultancy Services Ltd
What technology area does this patent fall under?
Primary CPC classification G06F21/6245. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 23, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).