Method and apparatus for data processing method
US-2016078027-A1 · Mar 17, 2016 · US
US10288653B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10288653-B2 |
| Application number | US-201414538999-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 12, 2014 |
| Priority date | Mar 6, 2014 |
| Publication date | May 14, 2019 |
| Grant date | May 14, 2019 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method for identifying frequently occurring waveform patterns in time series comprises segmenting each of one or more time series into a plurality of subsequences. Further, a subsequence matrix comprising each of the plurality of subsequences is generated. Further, the subsequence matrix is processed to obtain a candidate subsequence matrix comprising a plurality of non-trivial subsequences. Further, the plurality of non-trivial subsequences is clustered into a plurality of spherical clusters of a predetermined diameter. Further, a plurality of sub-clusters for each of one or more spherical clusters is obtained based on a mean of each of the plurality of non-trivial subsequences present in the spherical cluster. Further, one or more frequent waveform clusters, depicting frequently occurring waveform patterns, are ascertained from amongst the one or more spherical clusters based on a number of non-trivial subsequences present in each of the plurality of sub-clusters of the spherical cluster.
Opening claim text (preview).
We claim: 1. A computer-implemented method for identifying frequently occurring waveform patterns in time series, the method comprising: segmenting each of one or more time series into a plurality of subsequences, wherein each of the plurality of subsequences is a segment of a predetermined length pertaining to time series data corresponding to the time series; obtaining a candidate subsequence matrix, comprising a plurality of non-trivial subsequences, wherein each of the non-trivial subsequence is a subsequence having unique symbolic aggregate approximation (SAX) symbolic representation in comparison to adjacent non-trivial subsequences; processing the candidate subsequence matrix for filtering and normalizing the plurality of non-trivial subsequences included in the subsequence matrix; clustering the plurality of normalized non-trivial subsequences into a plurality of spherical clusters, wherein each of the plurality of spherical clusters is of a predetermined diameter based on a fixed radius, wherein clustering the plurality of normalized non-trivial subsequences comprises the steps of: ascertaining, for each of the plurality of non-trivial subsequences, a candidate cluster set based on a clustering technique, wherein the candidate cluster set comprises one or more spherical clusters in vicinity of the non-trivial subsequence and wherein the clustering technique is one of a binary search tree (BST) acceleration, a local-sensitive hashing (LSH) acceleration, and a balanced iterative reducing and clustering using hierarchies (BIRCH) acceleration; computing a cluster distance for each of the one or more spherical clusters pertaining to the candidate cluster set, wherein the cluster distance is a distance between the non-trivial subsequence and a centroid of the spherical cluster; identifying a vicinity spherical cluster, from among the one or more spherical clusters, wherein the vicinity spherical cluster is a cluster having lowest value of cluster distance; comparing the cluster distance of the vicinity spherical cluster with a predetermined threshold distance, wherein the predetermined threshold distance is equal to the predetermined radius; and including the non-trivial subsequence in the vicinity spherical cluster based on the comparison; obtaining a plurality of sub-clusters for each of one or more spherical clusters, from among the plurality of spherical clusters, based on a mean of each of the plurality of non-trivial subsequences present in the spherical cluster, wherein each of the plurality of sub-clusters includes one or more non-trivial subsequences from amongst the plurality of non-trivial subsequences; and ascertaining, one or more frequent waveform clusters, from each of the one or more spherical clusters based on a number of non-trivial subsequences present in each of the plurality of sub-clusters of the spherical cluster, wherein a frequent waveform cluster has a number of non-trivial subsequences greater than a predefined threshold number, and wherein the non-trivial subsequences depict frequently occurring waveform patterns. 2. The method as claimed in claim 1 , wherein the obtaining further comprises: generating a subsequence matrix based on the segmenting, wherein the subsequence matrix comprises each of the plurality of subsequences; and processing the subsequence matrix to obtain the candidate subsequence matrix. 3. The method as claimed in claim 2 , wherein the processing further comprises: filtering the plurality of subsequences using an indicator function for obtaining a plurality of filtered subsequences, wherein each of the plurality of filtered subsequences satisfies a predetermined filtering condition depicted by the indicator function; reducing the predetermined length of each of the plurality of filtered subsequences to a reduced length using piecewise averaging of the filtered subsequences; computing a local mean for each of the plurality of filtered subsequences of reduced length; and subtracting, for each of the plurality of filtered subsequences of reduced length, the local mean from the filtered subsequence to obtain a plurality of normalized subsequences, wherein a normalized subsequence is a subsequence having a zero mean; discretizing, for each of the plurality of normalized subsequences, each element pertaining to the normalized subsequence into a symbol using the SAX representation for obtaining a symbolic depiction of the normalized subsequence; and identifying, from among the discretized plurality of normalized subsequences, at least one non-trivial subsequence, wherein the non-trivial subsequence is a subsequence whose symbolic depiction does not match with a symbolic depiction of an adjacent subsequence. 4. The method as claimed in claim 1 , wherein the clustering further comprises: ascertaining, for the cluster distance of the vicinity spherical cluster being less than the predetermined threshold distance, the vicinity spherical cluster as a target spherical cluster; and including the non-trivial subsequence in the target spherical cluster. 5. The method as claimed in claim 1 , wherein the clustering further comprises: creating, for the cluster distance of the vicinity spherical cluster being more than the predetermined threshold, a new cluster; and including the non-trivial subsequence in the new cluster. 6. The method as claimed in claim 1 , wherein the candidate cluster set is ascertained based on at least one of a binary search tree (BST) acceleration technique, a locally sensitive hashing (LSH) acceleration technique, and a BIRCH acceleration technique. 7. The method as claimed in claim 1 , wherein each of the plurality of non-trivial subsequences present in the spherical cluster are clustered into sub-clusters using one-dimensional clustering techniques. 8. The method as claimed in claim 1 , wherein the method further comprises: identifying, from amongst the plurality of spherical clusters, one or more high support spherical clusters, wherein each of the one or more high support spherical clusters is a spherical cluster having a number of subsequences greater than the predefined threshold number; identifying one or more pairs of shifted spherical clusters from among the identified high support clusters, wherein a predetermined percentage of non-trivial subsequences of a first high support spherical cluster present in a pair of shifted spherical clusters is similar to the non-trivial subsequences of a second high support spherical cluster present in the pair of shifted spherical cluster, and wherein the number of non-trivial subsequence in the first high support cluster is less than the number of non-trivial subsequences in the second high support cluster; and obtaining the second high support clusters from the one or more pairs of shifted clusters; and processing each of the high support clusters to remove trivial match subsequences present in the high support cluster for obtaining the one or more spherical clusters pertaining to the plurality of spherical clusters. 9. The method as claimed in claim 1 , wherein the method further comprises: computing a rank for each of the one or more frequent waveform clusters using a ranking function, wherein the ranking function is based on at least one of a deviation of the frequent waveform cluster, a level of the frequent waveform cluster, and a non-trivial match count pertaining to the frequent waveform cluster, wherein the deviation is a an average deviation of the non-trivial subsequences pertaining to the frequent waveform cluster, the level of the frequent waveform cluster is an average of the means of each of the non-trivial subsequence pertaining to the frequent waveform cluster, and the non-trivial match coun
Circuits for comparing several input signals and for indicating the result of this comparison, e.g. equal, different, greater, smaller (comparing phase or frequency of 2 mutually independent oscillations in demodulators) · CPC title
Classification; Matching · CPC title
Clustering techniques · CPC title
Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods · CPC title
Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.