Metadata search via N-gram index

US12007997B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12007997-B2
Application number: US-202318183483-A
Country: US
Kind code: B2
Filing date: Mar 14, 2023
Priority date: Oct 29, 2021
Publication date: Jun 11, 2024
Grant date: Jun 11, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

As described herein, a N-Gram index may be created and the search may be conducted using the index, which will lead to faster search results. The N-Gram index may also include partial N-Gram components to capture more relevant data. Moreover, as described herein, the search may also take into account recent log data that has not yet been indexed. Techniques for building an index store using log data and efficiently searching the index store and log data to process search requests are described herein.
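The abstract's idea of an N-gram index with partial N-grams can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the gram length `N = 3`, the function names `ngrams`, `build_index`, and `search`, the in-memory dictionary layout, and the reading of "partial N-grams" as the shorter suffixes of a name's last N characters. Under that reading, a search string shorter than N can still be answered by scanning index keys that start with it.

```python
from collections import defaultdict

N = 3  # assumed gram length; the abstract does not fix a value


def ngrams(name, n=N):
    """Return the full n-grams of a name plus partial grams from its tail."""
    text = name.lower()
    grams = {text[i:i + n] for i in range(len(text) - n + 1)}
    # Partial grams (assumed form): every suffix of the last n characters.
    tail = text[-n:]
    grams.update(tail[i:] for i in range(len(tail)))
    return grams


def build_index(names):
    """Map each gram to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for gram in ngrams(name):
            index[gram].add(name)
    return index


def search(index, query, n=N):
    """Return the names containing the query as a substring (case-insensitive)."""
    q = query.lower()
    if not q:
        return set()
    if len(q) < n:
        # Short query: any key starting with it (full or partial) is a hit.
        hits = (names for gram, names in index.items() if gram.startswith(q))
        candidates = set().union(*hits)
    else:
        grams = [q[i:i + n] for i in range(len(q) - n + 1)]
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Post-filter: sharing grams does not guarantee a contiguous match.
    return {name for name in candidates if q in name.lower()}


index = build_index(["orders", "order_items", "customers"])
print(sorted(search(index, "der")))  # ['order_items', 'orders']
print(sorted(search(index, "rs")))   # ['customers', 'orders']
```

The two-character query "rs" only finds its matches because the tail suffixes were indexed; without them, a lookup restricted to three-character keys would miss names that end in "rs".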

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: storing a log associated with an operation performed on a source table; creating or updating an index record for a N-Gram index associated with the source table to generate a new version of the N-Gram index based on the log using a background process; receiving, from a user, a search request including a search string; and in response to the search request, processing the search request by performing steps including: loading, in a log cache, log data associated with new logs not yet reflected in the N-Gram index; retrieving index records matching the search string; storing the matched index records in an index cache; merging the log data in the log cache with the matched index records in the index cache to generate merged data; and generating results of the search request based on the merged data, wherein processing the search request is a separate process than the background process used for updating the new logs in the N-gram index.

2. The method of claim 1, further comprising: updating metadata associated with the N-Gram index to facilitate searching of the N-Gram index based on the index record; in response to the search request, loading index statistics of the N-Gram index; retrieving metadata associated with the N-Gram index; and performing post filtering of the merged log data and matched index records.

3. The method of claim 1, further comprising: performing a permission check on the merged data to determine that the user has access rights to the merged data.

4. The method of claim 1, wherein the operation includes a delete operation, the method further comprising: generating an index deletion log record associated with the delete operation; marking an index record in the N-Gram index associated with the delete operation; and based on at least one search associated with an older version of the N-Gram index being completed, deleting the marked index record based on the index deletion log record.

5. The method of claim 4, wherein the marked index record is not deleted in a same transaction as the delete operation.

6. The method of claim 1, wherein N-Gram index includes partial N-grams for at least last N characters of a name.

7. The method of claim 1, wherein index records for the N-Gram index include a prefix portion indicating whether a corresponding index record is for a prefix substring.

8. The method of claim 1, wherein the log includes an account ID, a chunk ID, a timestamp, a domain ID, and an entity ID.

9. The method of claim 1, wherein each index record of the N-Gram index includes an account ID, a prefix portion, an index version, a domain ID, and a parent entity ID.

10. A system comprising: at least one hardware processor; and at least one memory storing instructions that, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising: storing a log associated with an operation performed on a source table; creating or updating an index record for a N-Gram index associated with the source table to generate a new version of the N-Gram index based on the log using a background process; receiving, from a user, a search request including a search string; and in response to the search request, processing the search request by performing steps including: loading, in a log cache, log data associated with new logs not yet reflected in the N-Gram index; retrieving index records matching the search string; storing the matched index records in an index cache; merging the log data in the log cache with the matched index records in the index cache to generate merged data; and generating results of the search request based on the merged data, wherein processing the search request is a separate process than the background process used for updating the new logs in the N-gram index.

11. The system of claim 10, the operations further comprising: updating metadata associated with the N-Gram index to facilitate searching of the N-Gram index based on the index record; in response to the search request, loading index statistics of the N-Gram index; retrieving metadata associated with the N-Gram index; and performing post filtering of the merged log data and matched index records.

12. The system of claim 10, the operations further comprising: performing a permission check on the merged data to determine that the user has access rights to the merged data.

13. The system of claim 10, wherein the operation includes a delete operation, the operations further comprising: generating an index deletion log record associated with the delete operation; marking an index record in the N-Gram index associated with the delete operation; and based on at least one search associated with an older version of the N-Gram index being completed, deleting the marked index record based on the index deletion log record.

14. The system of claim 13, wherein the marked index record is not deleted in a same transaction as the delete operation.

15. The system of claim 10, wherein N-Gram index includes partial N-grams for at least last N characters of a name.

16. The system of claim 10, wherein index records for the N-Gram index include a prefix portion indicating whether a corresponding index record is for a prefix substring.

17. The system of claim 10, wherein the log includes an account ID, a chunk ID, a timestamp, a domain ID, and an entity ID.

18. The system of claim 10, wherein each index record of the N-Gram index includes an account ID, a prefix portion, an index version, a domain ID, and a parent entity ID.

19. A machine-storage medium embodying instructions that, when executed by a machine, cause the machine to perform operations comprising: storing a log associated with an operation performed on a source table; creating or updating an index record for a N-Gram index associated with the source table to generate a new version of the N-Gram index based on the log using a background process; receiving, from a user, a search request including a search string; and in response to the search request, processing the search request by performing steps including: loading, in a log cache, log data associated with new logs not yet reflected in the N-Gram index; retrieving index records matching the search string; storing the matched index records in an index cache; merging the log data in the log cache with the matched index records in the index cache to generate merged data; and generating results of the search request based on the merged data, wherein processing the search request is a separate process than the background process used for updating the new logs in the N-gram index.

20. The machine-storage medium of claim 19, the operations further comprising: updating metadata associated with the N-Gram index to facilitate searching of the N-Gram index based on the index record; in response to the search request, loading index statistics of the N-Gram index; retrieving metadata associated with the N-Gram index; and performing post filtering of the merged log data and matched index records.

21. The machine-storage medium of claim 19, the operations further comprising: performing a permission check on the merged data to determine that the user has access rights to the merged data.

22. The machine-storage medium of claim 19, wherein the operation includes a delete operation, further comprising: generating a
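The search-time steps recited in claim 1 (load un-indexed log data into a log cache, hold matched index records in an index cache, merge the two, and generate results) might look roughly like the sketch below. It is an illustration under stated assumptions, not the claimed implementation: the `LogEntry` fields, the `seq` log position, the `"upsert"`/`"delete"` operation names, the `indexed_up_to` watermark, and the function `merge_for_search` are all hypothetical, and plain substring matching stands in for the N-gram lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    seq: int        # hypothetical monotonically increasing log position
    op: str         # "upsert" or "delete" (assumed operation names)
    entity_id: int
    name: str


def merge_for_search(index_matches, log_entries, indexed_up_to, query):
    """Merge matched index records with log data not yet in the index.

    index_matches: dict of entity_id -> name, standing in for the index cache.
    log_entries:   all log entries; those with seq > indexed_up_to are
                   treated as not yet reflected in the index.
    """
    q = query.lower()
    # Log cache: only the entries the background indexer has not processed.
    log_cache = [e for e in log_entries if e.seq > indexed_up_to]

    merged = dict(index_matches)
    # Replay new log entries in order so that later operations win.
    for entry in sorted(log_cache, key=lambda e: e.seq):
        if entry.op == "delete" or q not in entry.name.lower():
            # Deleted, or renamed so that it no longer matches the query.
            merged.pop(entry.entity_id, None)
        else:
            merged[entry.entity_id] = entry.name
    return merged


index_cache = {1: "orders", 2: "order_items"}
log = [
    LogEntry(2, "upsert", 1, "orders"),       # already reflected in the index
    LogEntry(5, "upsert", 3, "old_orders"),   # new, matches the query
    LogEntry(6, "delete", 2, "order_items"),  # new, removes an indexed match
    LogEntry(7, "upsert", 4, "customers"),    # new, does not match
]
print(merge_for_search(index_cache, log, indexed_up_to=4, query="order"))
# {1: 'orders', 3: 'old_orders'}
```

Because the merge runs per search request and only reads the log, it stays separate from whatever background process advances the index version, which is the separation the final clause of claim 1 describes.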

Assignees

Snowflake Inc

Classifications

  • to a system of files or objects, e.g. local or distributed file system or database

  • Merging, i.e. combining at least two sets of record carriers each arranged in the same ordered sequence to produce a single set having the same ordered sequence

  • where protection concerns the structure of data, e.g. records, types, queries

  • Updates performed during online database operations; commit processing

  • Indexing structures

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12007997B2 cover?
As described herein, a N-Gram index may be created and the search may be conducted using the index, which will lead to faster search results. The N-Gram index may also include partial N-Gram components to capture more relevant data. Moreover, as described herein, the search may also take into account recent log data that has not yet been indexed. Techniques for building an index store using log…
Who is the assignee on this patent?
Snowflake Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2228. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 11, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).