What technology area does this patent fall under?

Primary CPC classification G06F16/2272. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 9, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Index sharding

US12493601B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12493601-B2
Application number	US-202217722754-A
Country	US
Kind code	B2
Filing date	Apr 18, 2022
Priority date	Jan 31, 2019
Publication date	Dec 9, 2025
Grant date	Dec 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Indexing in a low-latency data access and analysis system includes accessing, by an indexing unit of a low-latency data access and analysis system, constituent data from a data source of the low-latency data access and analysis system and indexing the constituent data in an index of the low-latency data access and analysis system by an indexing unit of the low-latency data access and analysis system. Indexing includes partitioning the constituent data based on a characteristic of the constituent data into at least a first partition and a second partition, segmenting the first partition into a first segment of the first partition, sharding the first segment into a first shard of the first segment of the first partition, segmenting, using hash-partitioning, the second partition into one or more segments of the second partition, and for respective segments of the second partition, sharding the respective segment into one or more respective shards.
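As a rough illustration of the hash-partitioning step named in the abstract, the Python sketch below assigns records to segments by hashing a key. The abstract does not say what is hashed or which hash function is used; the string key, the SHA-256 choice, and the names `segment_for` and `hash_segment` are assumptions made for this example only, not the patented implementation.

```python
import hashlib


def segment_for(key: str, segment_count: int) -> int:
    """Map a key to a segment index with a stable hash (assumed: SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the
    # number of segments so the same key always lands in the same segment.
    return int.from_bytes(digest[:8], "big") % segment_count


def hash_segment(keys, segment_count: int):
    """Distribute keys across `segment_count` segments (hypothetical helper)."""
    segments = [[] for _ in range(segment_count)]
    for key in keys:
        segments[segment_for(key, segment_count)].append(key)
    return segments
```

A stable hash is used here rather than Python's built-in `hash()`, whose string hashing is randomized per process, so that segment assignment stays reproducible across runs.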

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: accessing, by an indexing unit of a low-latency data access and analysis system, constituent data from a data source of the low-latency data access and analysis system; and indexing the constituent data in an index of the low-latency data access and analysis system by the indexing unit of the low-latency data access and analysis system, wherein indexing includes: partitioning the constituent data based on a characteristic of the constituent data into at least a first partition and a second partition; segmenting the first partition into a first segment of the first partition; sharding the first segment into a first shard of the first segment of the first partition; segmenting, using hash partitioning, the second partition into one or more segments of the second partition; and for respective segments of the second partition, sharding the respective segments into one or more respective shards.
2. The method of claim 1, wherein indexing includes: sending, from the indexing unit to the data source, a request to pin a portion of a database of the data source corresponding to the constituent data; in response to receiving, by the indexing unit, an indication that the portion of the database is pinned, sending, from the indexing unit to the data source, a sampling data request indicating a sampling data-query for the portion of the database; and accessing, by the indexing unit, sampling results responsive to the sampling data-query.
3. The method of claim 2, wherein the constituent data includes a plurality of logical tables.
4. The method of claim 3, wherein partitioning includes: identifying a smallest unpartitioned table from the plurality of logical tables; determining that a current size of the first partition is less than a defined maximum size for the first partition, and in response to determining that the current size of the first partition is less than the defined maximum size for the first partition: identifying a sum of the current size of the first partition and a size of the smallest unpartitioned table as the current size of the first partition; and assigning the smallest unpartitioned table to the first partition; determining that the current size of the first partition is at least the defined maximum size for the first partition, and in response to determining that the current size of the first partition is at least the defined maximum size for the first partition, assigning the smallest unpartitioned table to the second partition; and identifying the smallest unpartitioned table as a partitioned table.
5. The method of claim 3, wherein segmenting, using hash partitioning, the second partition includes: identifying, as a cardinality of the one or more segments of the second partition, a lesser of a defined maximum cardinality of segments of the second partition or a quotient of dividing a sum of sizes of tables from the plurality of logical tables assigned to the second partition by a defined maximum segment size.
6. The method of claim 2, wherein, for a respective segment, sharding includes: identifying, by a segment manager of the indexing unit, an indexing mode for indexing an object from the respective segment based on the sampling results; generating, by the segment manager, a shard specification for generating a shard of the respective segment based on the sampling results and the indexing mode; sending, from the indexing unit to the data source, a constituent data request indicating a constituent data-query for the respective segment; generating a shard assignment indicating the shard specification and an indexing operation unit; and generating, by the indexing operation unit, the shard based on the shard assignment, wherein generating the shard includes accessing the constituent data responsive to the constituent data request.
7. The method of claim 1, further comprising: receiving data expressing usage intent with respect to the constituent data; in response to receiving the data expressing usage intent, generating response data responsive to the data expressing usage intent, wherein generating the response data includes resolving at least a portion of the data expressing usage intent by traversing the index, wherein traversing the index includes traversing a shard from the index to identify a token corresponding to a portion of the data expressing usage intent; and outputting the response data.
8. The method of claim 1, wherein the characteristic is table size.
9. The method of claim 1, wherein indexing includes: obtaining, by the low-latency data access and analysis system, index configuration data for indexing the constituent data, wherein the index configuration data includes at least one of token type information, data source information, or index distribution coordination information.
10. A low-latency data access and analysis system comprising: a non-transitory computer-readable storage medium that stores instructions for operating the low-latency data access and analysis system; and a processor that executes the instructions to operate an indexing unit to index constituent data in an index of the low-latency data access and analysis system, wherein, to index the constituent data, the processor executes the instructions to: access, by the indexing unit, the constituent data from a data source; partition the constituent data based on a characteristic of the constituent data into at least a first partition and a second partition; segment the first partition into a first segment of the first partition; shard the first segment into a first shard of the first segment of the first partition; segment, using hash partitioning, the second partition into one or more segments of the second partition; and for respective segments of the second partition, shard the respective segments into one or more respective shards.
11. The low-latency data access and analysis system of claim 10, wherein, to index the constituent data, the processor executes the instructions to: send, from the indexing unit to the data source, a request to pin a portion of a database of the data source corresponding to the constituent data; receive, by the indexing unit, an indication that the portion of the database is pinned; in response to the indication that the portion of the database is pinned, send, from the indexing unit to the data source, a sampling data request indicating a sampling data-query for the portion of the database; and access, by the indexing unit, sampling results responsive to the sampling data-query.
12. The low-latency data access and analysis system of claim 11, wherein the constituent data includes a plurality of logical tables.
13. The low-latency data access and analysis system of claim 12, wherein, to partition the constituent data, the processor executes the instructions to: identify a smallest unpartitioned table from the plurality of logical tables; in response to a determination that a current size of the first partition is less than a defined maximum size for the first partition: identify a sum of the current size of the first partition and a size of the smallest unpartitioned table as the current size of the first partition; and assign the smallest unpartitioned table to the first partition; in response to a determination that the current size of the first partition is at least the defined maximum size for the first partition, assign the smallest unpartitioned table to the second partition; and identify the smallest unpartitioned table as a partitioned table.
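The size-based partitioning in claims 4 and 13 and the segment-count rule in claim 5 can be read as a short greedy procedure. The Python sketch below is one possible reading, not the patented implementation: the function names, the dictionary of table sizes, and the parameter names are hypothetical. The claims also do not say how the quotient in claim 5 is rounded, so rounding up and using at least one segment are assumptions.

```python
import math


def partition_tables(table_sizes: dict, max_first_size: int):
    """Assign tables, smallest first, to a first or second partition.

    Following the claim wording, the size check happens before a table is
    added, so the first partition may end up larger than `max_first_size`.
    """
    first, second = [], []
    current_first_size = 0
    # Visiting tables in ascending size order is equivalent to repeatedly
    # picking the smallest unpartitioned table.
    for name, size in sorted(table_sizes.items(), key=lambda item: item[1]):
        if current_first_size < max_first_size:
            current_first_size += size
            first.append(name)
        else:
            second.append(name)
    return first, second


def segment_count(second_sizes, max_segment_size: int, max_segments: int) -> int:
    """Lesser of the maximum cardinality and total size / maximum segment size.

    Assumption: the quotient is rounded up and at least one segment is used.
    """
    quotient = math.ceil(sum(second_sizes) / max_segment_size)
    return min(max_segments, max(1, quotient))
```

For example, with table sizes `{"a": 10, "b": 5, "c": 40, "d": 100}` and a maximum first-partition size of 20, tables `b`, `a`, and `c` go to the first partition (reaching size 55, because the check precedes each addition) and `d` goes to the second.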

Assignees

Thoughtspot Inc

Inventors

Classifications

G06F16/278
Data partitioning, e.g. horizontal or vertical partitioning · CPC title
G06F16/2255
Hash tables · CPC title
G06F16/2272 (Primary)
Management thereof · CPC title

Patent family

Related publications grouped by family.

View patent family 71836443

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12493601B2 cover?: Indexing in a low-latency data access and analysis system includes accessing, by an indexing unit of a low-latency data access and analysis system, constituent data from a data source of the low-latency data access and analysis system and indexing the constituent data in an index of the low-latency data access and analysis system by an indexing unit of the low-latency data access and analysis s…
Who is the assignee on this patent?: Thoughtspot Inc