Recyclable private memory heaps for dynamic search indexes

US11030262B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11030262-B2
Application numberUS-201514835522-A
CountryUS
Kind codeB2
Filing dateAug 25, 2015
Priority dateAug 25, 2015
Publication dateJun 8, 2021
Grant dateJun 8, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a search engine may generate and store a plurality of search index segments such that each of the search index segments is stored in a corresponding one of a plurality of heaps of memory. The plurality of search index segments may include inverted index segments mapping content to documents containing the content. A garbage collection module may release one or more heaps of the memory.

First claim

Opening claim text (preview).

What is claimed is: 1. A method, comprising: generating and storing in memory, by a search engine, a first search index segment of a search index, the first search index segment being stored in a first heap of the memory, the first search index segment including one or more first inverted index segments mapping content to a first set of documents, the search index being an inverted index; generating and storing in the memory, by the search engine, a second search index segment of the search index, the second search index segment being stored in a second heap of the memory concurrent with the first heap of the memory, the second search index segment including one or more second inverted index segments mapping content to a second set of documents, the second set of documents being more recently processed by the search engine than the first set of documents; updating the second search index segment until a threshold corresponding to a number of documents or an amount of memory has been reached; responsive to determining that the threshold corresponding to the number of documents or the amount of memory has been reached: cutting off a current generation of documents associated with the second heap of the memory by changing a status of the second search index segment being stored in the second heap of the memory from a read/write status to a read-only status; and initiating a next generation of documents by generating a third search index segment of the search index, wherein the third search index segment is assigned the read/write status and is stored in a new heap of the memory associated with the next generation of documents; and releasing, by a garbage collection module, a third heap of the memory, the third heap of the memory storing a fourth search index segment of the search index, the fourth search index segment generated by the search engine and including one or more third inverted index segments, the third heap of the memory being released based on at least one of a determination that a threshold of amount of memory has been consumed or a determination that a threshold number of search index segments are stored. 2. The method of claim 1 , comprising: storing in the memory, by the search engine, per-query objects in a fourth heap of the memory that is separate from the first heap, the second heap, and the third heap. 3. The method of claim 1 , wherein the first set of documents corresponds to a first generation of documents and the second set of documents corresponds to a second generation of documents. 4. The method of claim 1 , wherein the third heap comprises the new heap and the fourth search index segment comprises the third search index segment. 5. The method of claim 1 , wherein the second search index segment is generated and stored in real-time as documents are obtained, received, or processed. 6. The method of claim 1 , wherein the first heap of the memory is associated with a first core of a central processing unit (CPU), and wherein the second heap of the memory is associated with a second core of the CPU. 7. The method of claim 1 , comprising: storing, by a central processing unit (CPU) core, an object in one pool of two or more pools associated with the CPU core according to a size of the object. 8. The method of claim 7 , wherein each pool of the two or more pools is associated with at least one range of two or more ranges of object sizes. 9. A system, comprising: one or more processors; and at least some memory, at least one of the one or more processors or the at least some memory being configured to: generate and store in memory, by a search engine, a first search index segment of a search index, the first search index segment being stored in a first heap of the memory, the first search index segment including one or more first inverted index segments mapping content to a first set of documents, the search index being an inverted index; generate and store in the memory, by the search engine, a second search index segment of the search index, the second search index segment being stored in a second heap of the memory concurrent with the first heap of the memory, the second search index segment including one or more second inverted index segments mapping content to a second set of documents, the second set of documents being more recently processed by the search engine than the first set of documents; update the second search index segment until a threshold corresponding to a number of documents or an amount of memory has been reached; responsive to determining that the threshold corresponding to the number of documents or the amount of memory has been reached: cut off a current generation of documents associated with the second heap of the memory by changing a status of the second search index segment being stored in the second heap of the memory from a read/write status to a read-only status; and initiate a next generation of documents by generating a third search index segment of the search index, wherein the third search index segment is assigned the read/write status and is stored in a new heap of the memory associated with the next generation of documents; and release, by a garbage collection module, a third heap of the memory, the third heap of the memory storing a fourth search index segment of the search index, the fourth search index segment generated by the search engine and including one or more third inverted index segments, the third heap of the memory being released based on a determination that a threshold of amount of memory has been consumed. 10. The system of claim 9 , at least one of the one or more processors or the memory being configured to: store, by the search engine, per-query objects in a fourth heap of the memory that is separate from the first heap, the second heap, and the third heap. 11. The system of claim 9 , wherein the first set of documents corresponds to a first generation of documents and the second set of documents corresponds to a second generation of documents. 12. The system of claim 9 , wherein the second search index segment is generated and stored in real-time as documents are obtained, received, or processed. 13. The system of claim 9 , wherein the first heap comprises two or more sub-heaps. 14. The system of claim 9 , wherein the first heap comprises two or more sub-heaps, wherein each one of the two or more sub-heaps is associated with at least one of two or more ranges of object sizes. 15. The method of claim 13 , each of the two or more sub-heaps being associated with at least one central processing unit (CPU) core. 16. A non-transitory computer-readable storage medium storing thereon computer-readable storage medium, comprising: instructions configured to generate and store in memory, by a search engine, a first search index segment of a search index, the first search index segment being stored in a first heap of the memory, the first search index segment including one or more first inverted index segments mapping content to a first set of documents, the search index being an inverted index; instructions configured to generate and store in the memory, by the search engine, a second search index segment of the search index, the second search index segment being stored in a second heap of the memory concurrent with the first heap of the memory, the second search index segment including one or more second inverted index segments mapping content to a second set of documents, the second set of documents being more recently processed by the search engine than the first set of documents; instructions configured to upd

Assignees

Inventors

Classifications

  • Garbage collection, i.e. reclamation of unreferenced memory · CPC title

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

  • Latency reduction · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • Indexing structures · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11030262B2 cover?
In one embodiment, a search engine may generate and store a plurality of search index segments such that each of the search index segments is stored in a corresponding one of a plurality of heaps of memory. The plurality of search index segments may include inverted index segments mapping content to documents containing the content. A garbage collection module may release one or more heaps of t…
Who is the assignee on this patent?
Oath Inc, Verizon Media Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/0253. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jun 08 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).