Updating inverted indices

US10073874B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-10073874-B1
Application numberUS-201314086536-A
CountryUS
Kind codeB1
Filing dateNov 21, 2013
Priority dateMay 28, 2013
Publication dateSep 11, 2018
Grant dateSep 11, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Implementations provide an indexing system with an instant failover that uses a moving snapshot window. For example, a method may include receiving, by a processor, a query and determining that a main query processing engine is not responding. The method may further include generating a search result for the query using a secondary query processing engine that applies at least one snapshot record to a portion of a posting list, the snapshot record including the portion of the posting list as it appeared before a modification, and the modification occurring within a predetermined time before receiving the query. The portion is a fixed size smaller than the posting list. Applying the snapshot record can include overlaying the portion of the posting list with the snapshot record beginning at an offset specified by the snapshot record. The main query processing engine generates a search result without applying snapshot records.

First claim

Opening claim text (preview).

What is claimed is: 1. A system comprising: memory storing an index of documents, the index comprising at least one posting list that maps a term to documents; at least one processor; and memory storing instructions that, when executed by the at least one processor cause the system to: receive an update for the posting list, responsive to receiving the update: generate a snapshot record, the snapshot record including a portion of the posting list that is to be changed by the update, the snapshot record being smaller than the entire posting list, and modify the portion of the posting list using the update, use a main query processing engine to generate a search result for a subsequent query based on the updated posting list without applying the snapshot record, and responsive to determining that the main query processing engine is unavailable, use a secondary query processing engine to generate a search result for a subsequent query by applying the snapshot record to the updated posting list. 2. The system of claim 1 , wherein the system further includes memory storing instructions that, when executed by the at least one processor, cause the system to: delay updating the posting list for a predetermined amount of time. 3. The system of claim 2 , wherein delaying includes: placing the update in an update pending queue; and modifying the portion of the posting list when the predetermined amount of time has elapsed. 4. The system of claim 1 , wherein the portion fits within a cache line of the processor. 5. The system of claim 1 , wherein the snapshot record includes an associated timestamp. 6. The system of claim 1 , wherein the secondary query processing engine overlays the portion of the posting list with the snapshot record. 7. The system of claim 5 , wherein the secondary query processing engine overlays the portion of the posting list with the snapshot record when the timestamp of the snapshot record indicates the snapshot is an oldest snapshot record for the portion within a predetermined amount of time. 8. The system of claim 1 , wherein the system further includes memory storing instructions that, when executed by the at least one processor, further cause the system to restore the main query processing engine by overlaying the portion of the posting list with the snapshot record. 9. The system of claim 5 , wherein the update is a first update and wherein the system further includes memory storing instructions that, when executed by the at least one processor, cause the system to: receive a second update for the posting list, the second update including a second update to the portion of the posting list; generate a second snapshot record, the second snapshot record including the portion that was modified by the first update; and update the posting list with the second update. 10. The system of claim 1 , wherein the update is a first update and the portion is a first portion and wherein the system further has memory storing instructions that, when executed by the at least one processor, cause the system to: receive a second update for the posting list, the second update including a change to a second portion of the posting list and a change to the first portion of the posting list; generate a second snapshot record, the second snapshot record including the portion that was modified by the first update; generate a third snapshot record, the third snapshot record including the second portion of the posting list that is to be changed by the second update, wherein the second snapshot record and the third snapshot record have a same associated timestamp; and modify the second portion of the posting list with the second update. 11. The system of claim 1 , wherein the snapshot record uses a lock-free memory sharing structure. 12. The system of claim 11 , wherein the lock-free memory sharing structure includes a state field that indicates whether the snapshot record is invalid, valid and pinned, or valid and not pinned. 13. A computer-implemented method comprising: receiving, by a processor, a query; determining that a main query processing engine is not responding; and generating a search result for the query using a secondary query processing engine that applies at least one snapshot record to a portion of a posting list to rebuild an older version of the posting list, the snapshot record including the portion of the posting list as it appeared before a modification to the portion, the modification occurring within a predetermined time before receiving the query, the portion being a fixed size smaller than the posting list. 14. The method of claim 13 , wherein the predetermined time represents a time elapsed in processing at least a predetermined number of queries. 15. The method of claim 13 , wherein the at least one snapshot record is an oldest snapshot record selected from a plurality of snapshot records for the portion of the posting list, the plurality of snapshot records having timestamps within the predetermined time. 16. The method of claim 13 , wherein applying the snapshot record includes overlaying the portion of the posting list with the snapshot record beginning at an offset specified by the snapshot record. 17. The method of claim 13 , further comprising: replacing the portion of the posting list with the snapshot record beginning at an offset specified by the snapshot record to recover the posting list; and restarting the main query processing engine. 18. The method of claim 13 , further comprising: receiving an update for the posting list; determining a second portion of the posting list affected by the update; generating a new snapshot record for the second portion of the posting list, the new snapshot record including information in the second portion of the posting list; and modifying the second portion of the posting list with the update. 19. The method of claim 18 , further comprising: placing the update in an update pending queue; and modifying the second portion of the posting list when a second predetermined amount of time has elapsed. 20. The method of claim 13 , wherein the main query processing engine generates search results without applying snapshot records. 21. The method of claim 13 , wherein the snapshot record includes a lock-free memory sharing structure. 22. A system comprising: memory storing an index of documents, the index including at least one posting list that maps a term to documents; memory storing snapshot records, the snapshot records having at least one record that includes: a map portion that includes a state field, and a data portion that includes a portion of the at least one posting list as it existed prior to an update; at least one processor; and memory storing instructions that, when executed by the at least one processor cause the system to: receive a query that includes the term; determine that a main query processing engine is unavailable; and generate a search result for the query using a secondary query processing engine that overlays a portion of the at least one posting list with the data portion of the at least one snapshot record so that the secondary query processing engine rebuilds an older version of the posting list. 23. The system of claim 22 , wherein the system further includes memory storing instructions that, when executed by the at least one processor, cause the system to: receive an update for the at least one posting list;

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • Physics · mapped topic

  • G06F16/31Primary

    Indexing; Data structures therefor; Storage structures · CPC title

  • Saving, restoring, recovering or retrying · CPC title

  • Indexing structures · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10073874B1 cover?
Implementations provide an indexing system with an instant failover that uses a moving snapshot window. For example, a method may include receiving, by a processor, a query and determining that a main query processing engine is not responding. The method may further include generating a search result for the query using a secondary query processing engine that applies at least one snapshot reco…
Who is the assignee on this patent?
Google Inc, Google Llc
What technology area does this patent fall under?
Primary CPC classification G06F17/30321. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Sep 11 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).