Elastic resource scaling
US-9225724-B2 · Dec 29, 2015 · US
US9514172B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9514172-B2 |
| Application number | US-201213595270-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 27, 2012 |
| Priority date | Jun 10, 2009 |
| Publication date | Dec 6, 2016 |
| Grant date | Dec 6, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy.
Opening claim text (preview).
We claim: 1. A method for limiting updates of inverted indexes in a database, the method comprising: calculating a first inverse document frequency for each n-gram in a plurality of strings; calculating a first length for each string in the plurality of strings; for each n-gram, creating an inverted list comprising strings containing the n-gram; sorting each inverted list by lengths of the strings in the inverted list; receiving an update to the plurality of strings; calculating a second inverse document frequency of n-grams affected by the update; for each respective n-gram affected by the update: calculating an error for the respective n-gram based on the second inverse document frequency; and in response to determining that the error for the respective n-gram satisfies a predefined error threshold for the respective n-gram, wherein the predefined error threshold for the respective n-gram is determined based on a frequency of the respective n-gram in the plurality of strings: calculating a second length for each string affected by the update, and re-sorting each inverted list affected by the update based on the second length. 2. The method of claim 1 , further comprising: updating each inverted list affected by the update to generate a respective updated affected inverted list. 3. The method of claim 2 , further comprising: answering a query based on the respective updated affected inverted list. 4. The method of claim 1 , wherein the update is an addition of a new string. 5. The method of claim 1 , wherein the update is a deletion of a portion of an existing string. 6. The method of claim 1 , wherein the update is a modification of a portion of a string. 7. The method of claim 1 , wherein each update is buffered and propagated in a batch at regular time intervals. 8. An apparatus comprising: a processor; and a memory to store computer program instructions, the computer program instructions when executed on the processor, cause the processor to perform operations comprising: calculating a first inverse document frequency for each n-gram in a plurality of strings; calculating a first length for each string in the plurality of strings; for each n-gram, creating an inverted list comprising strings containing the n-gram; sorting each inverted list by lengths of the strings in the inverted list; receiving an update to the plurality of strings; calculating a second inverse document frequency of n-grams affected by the update; for each respective n-gram affected by the update: calculating an error for the respective n-gram based on the second inverse document frequency; and in response to determining that the error for the respective n-gram satisfies a predefined error threshold for the respective n-gram, wherein the predefined error threshold for the respective n-gram is determined based on a frequency of the respective n-gram in the plurality of strings: calculating a second length for each string affected by the update, and re-sorting each inverted list affected by the update based on the second length. 9. The apparatus of claim 8 , wherein the update is an addition of a new string. 10. The apparatus of claim 8 , wherein the update is a deletion of a portion of a string. 11. The apparatus of claim 8 , wherein the update is a modification of a portion of a string. 12. The apparatus of claim 8 , wherein each update is buffered and propagated in a batch at regular time intervals. 13. A non-transitory computer readable storage device storing computer program instructions, which, when executed on a processor, cause the processor to perform operations comprising: calculating a first inverse document frequency for each n-gram in a plurality of strings; calculating a first length for each string in the plurality of strings; for each n-gram, creating an inverted list comprising strings containing the n-gram; sorting each inverted list by lengths of the strings in the inverted list; receiving an update to the plurality of strings; calculating a second inverse document frequency of n-grams affected by the update; for each respective n-gram affected by the update: calculating an error for the respective n-gram based on the second inverse document frequency; and in response to determining that the error for the respective n-gram satisfies a predefined error threshold for the respective n-gram, wherein the predefined error threshold for the respective n-gram is determined based on a frequency of the respective n-gram in the plurality of strings: calculating a second length for each string affected by the update, and re-sorting each inverted list affected by the update based on the second length. 14. The non-transitory computer readable storage device of claim 13 , wherein the update is an addition of a new string. 15. The non-transitory computer readable storage device of claim 13 , wherein the update is a deletion of a portion of a string. 16. The non-transitory computer readable storage device of claim 13 , wherein the update is a modification of a portion of a string. 17. The non-transitory computer readable storage device of claim 13 , wherein each update is buffered and propagated in a batch at regular time intervals.
Physics · mapped topic
Management thereof · CPC title
Inverted lists · CPC title
Management therefor · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.