Who is the assignee on this patent?

Fan Wei, Perng Chang-Shing, Wang Haixun, and 2 more

What technology area does this patent fall under?

Primary CPC classification G06F16/2246. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Apr 14 2015 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for indexing weighted-sequences in large databases

US9009176B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9009176-B2
Application number	US-19871708-A
Country	US
Kind code	B2
Filing date	Aug 26, 2008
Priority date	Nov 26, 2003
Publication date	Apr 14, 2015
Grant date	Apr 14, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occurrence patterns is a first step towards understanding the temporal causal relationships among the events. The index structure proposed herein enables the efficient retrieval from the database of all subsequences (contiguous and non-contiguous) that match a given query sequence both by events and by weights. The index structure also takes into consideration the nonuniform frequency distribution of events in the sequence data.

First claim

Opening claim text (preview).

What is claimed is: 1. A method of matching a query sequence in a weighted-sequences index, comprising: transforming a query sequence comprising symbols into a weighted query sequence comprising a sequence of pairs, wherein each pair includes a corresponding one of the symbols and a corresponding weight, and wherein the corresponding weight is represented by a real number; reordering the weighted query sequence such that the pairs having symbols that occur less frequently in the weighted query sequence are ordered earlier than the pairs having symbols that occur more frequently; encoding the re-ordered pairs of the weighted query sequence into a one-dimensional sequence comprising encoded symbols, wherein each encoded symbol includes a symbol of a corresponding first one of the re-ordered pairs and a weight parameter that is a difference between the weight of the symbol and a weight of a second one of the re-ordered pairs that is adjacent the first re-ordered pair; and searching an iso-depth link of the weighted-sequences index using the one-dimensional sequence to return an offset. 2. The method of claim 1 , wherein the reordering comprises: assigning a numerical rank to each of the symbols based on how frequently they occur; multiplying each numerical rank by a window size to generate values for each of the symbols; adding the generated values to the corresponding weights to generate modified weights; and reordering the symbols in the weighted query sequence according to the modified weights. 3. The method of claim 2 , wherein the generated values are further multiplied by 2 before they are added to the corresponding weights. 4. The method of claim 1 , wherein the weights are time stamps. 5. The method of claim 1 , wherein the weighted-sequences index is an iso-depth index. 6. The method of claim 1 , wherein the weighted-sequences index comprises a tree where each pair is a trie node of the tree. 7. The method of claim 6 , wherein each trie node includes a sequential ID and the current sequential ID of a descendant trie node is between the sequential id of a parent trie node and a predefined maximum sequential ID. 8. The method of claim 7 , wherein the trie nodes in the iso-depth link are sorted by their IDs in ascending order. 9. The method of claim 6 , wherein the iso-depth link is a horizontal link in the tree comprising trie nodes that are a same distance from a root of the tree. 10. A non-transitory machine-readable medium having instructions stored thereon for execution by a processor to perform a method of matching a query sequence in an iso-depth index, comprising: reordering a weighted query sequence comprising pairs having symbols and corresponding weights, represented by a real number; such that the pairs having symbols that occur less frequently in the weighted query sequence are ordered earlier than the pairs having symbols that occur more frequently; encoding the re-ordered pairs of the weighted query sequence into a one-dimensional sequence comprising encoded symbols, wherein each encoded symbol includes a symbol of a corresponding first one of the re-ordered pairs and a weight parameter that is a difference between a weight of a second one of the re-ordered pairs that is adjacent the first re-ordered pair; and searching an iso-depth link of the iso-depth index using the one-dimensional sequence to return an offset. 11. The machine-readable medium of claim 10 , wherein the reordering comprises: assigning a numerical rank to each of the symbols based on how frequently they occur; multiplying each numerical rank by a 2 times a window size to generate values for each of the symbols; adding the generated values to the corresponding weights to generate modified weights; and reordering the symbols in the weighted query sequence according to the modified weights. 12. The machine-readable medium of claim 10 , wherein the weights are time stamps. 13. The machine-readable medium of claim 10 , wherein the iso-depth index comprises a plurality of pairs of symbols and weights ordered by how frequently each of the symbols occur. 14. The machine-readable medium of claim 13 , wherein the iso-depth index comprises a tree where each pair is a trie node of the tree. 15. The machine-readable medium of claim 14 , wherein each trie node includes a sequential ID and the current sequential ID of a descendant tie node is between the sequential id of a parent trie node and a predefined maximum sequential ID. 16. The machine-readable medium of claim 14 , wherein the trie nodes in the iso-depth link are sorted by their IDs in ascending order. 17. The machine-readable medium of claim 14 , wherein the iso-depth link is a horizontal link in the tree comprising trie nodes that are a same distance from a root of the tree. 18. A method of matching a query sequence in a weighted-sequences index, comprising: transforming a query sequence comprising symbols into a weighted query sequence comprising a sequence of pairs, wherein each pair includes a corresponding one of the symbols and a corresponding weight, and wherein the corresponding weight is represented by a real number; reordering the weighted query sequence such that the pairs having symbols that occur less frequently are ordered earlier than the pairs having symbols that occur more frequently; encoding the re-ordered pairs of the weighted query sequence into a one-dimensional sequence comprising encoded symbols, wherein each encoded symbol includes a symbol of a corresponding one of the re-ordered pairs and a weight parameter that is a difference between a weight of the symbol and a weight of an adjacent symbol, wherein each weight parameter is a subscript of the symbol of the corresponding encoded symbol; and searching an iso-depth link of the weighted-sequences index using the one-dimensional sequence to return an offset. 19. The method of claim 1 , wherein the weight parameter is a subscript of the symbol of the corresponding encoded symbol.

Assignees

Inventors

Classifications

Y10S707/99943
Generating database or data structure, e.g. via user interface · CPC title
G06F16/2474
Sequence data queries, e.g. querying versioned data · CPC title
G06F16/2246Primary
Trees, e.g. B+trees · CPC title
G06F17/30327Primary
Physics · mapped topic
G06F17/30548
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 34592205

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9009176B2 cover?: The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occu…
Who is the assignee on this patent?: Fan Wei, Perng Chang-Shing, Wang Haixun, and 2 more
What technology area does this patent fall under?: Primary CPC classification G06F16/2246. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Apr 14 2015 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).