Next generation near real-time indexing

US9805078B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9805078-B2
Application number  US-201314145414-A
Country             US
Kind code           B2
Filing date         Dec 31, 2013
Priority date       Dec 31, 2012
Publication date    Oct 31, 2017
Grant date          Oct 31, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems to build and utilize a search infrastructure are described. The system generates index information, including document updates and indexes. The system receives event notifications as the document updates are received and accumulates the index information until published. A query engine receives a search query from a client machine and identifies search results based on the query and the index information. The system communicates the search results, over the network, to the client machine.
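The accumulate-until-published behavior in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class `AccumulatingIndex`, its method names, and the whitespace tokenizer are all hypothetical and chosen only for illustration. Updates arriving through event notifications are buffered and become searchable only after an explicit publish step.

```python
from collections import defaultdict


class AccumulatingIndex:
    """Toy inverted index (hypothetical) that buffers updates until published."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of published documents
        self._docs = {}                    # doc id -> published text
        self._pending = {}                 # doc id -> text awaiting publication

    def on_event(self, doc_id, text):
        """Handle an event notification carrying a document update."""
        self._pending[doc_id] = text

    def publish(self):
        """Make accumulated updates visible to queries; return how many."""
        for doc_id, text in self._pending.items():
            old = self._docs.get(doc_id)
            if old is not None:
                # Drop postings of the superseded document version.
                for term in old.lower().split():
                    self._postings[term].discard(doc_id)
            self._docs[doc_id] = text
            for term in text.lower().split():
                self._postings[term].add(doc_id)
        count = len(self._pending)
        self._pending.clear()
        return count

    def search(self, query):
        """Return sorted ids of published documents containing every term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(
            *(set(self._postings.get(term, ())) for term in terms)
        )
        return sorted(hits)
```

In this sketch a query issued between `on_event` and `publish` still sees the previous state of the index, which mirrors the abstract's statement that index information is accumulated until published.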

First claim

Opening claim text (preview).

We claim:

1. A computer system to rebuild indexing information in near real time, the computer system comprising: processors; a plurality of query node servers; a non-transitory first indexing subsystem when executing on at least one processor among the processors cause the first indexing subsystem to generate full indexes to publish document updates by the plurality of query node servers via a distribution system, the first indexing system representing a map-reduced indexing system; a non-transitory second indexing subsystem when executing on at least one processor among the processors cause the second indexing subsystem to generate mini-indexes associated with the full indexes to publish document updates by the plurality of query node servers indirectly via the distribution system or directly to the plurality of query node servers, the second indexing subsystem representing a daemon-based indexing system that monitors for and receives in near real-time event notifications and generates the mini-indexes in near real-time based on receiving the event notifications, the event notifications based on a priority ordering of events by application servers; a non-transitory query engine when executing on at least one processor among the processors cause the computer system to: update the indexing information at the plurality of query node servers based on the mini-indexes and the full indexes; and publish the updated indexing information to one or more publishing channels in the distribution system based on a normal priority document updates, and publish the updated indexing information to the query nodes servers based on a higher priority document update.

2. The system of claim 1, wherein the second indexing subsystem further comprises a message queuing subsystem having a message queue for the event notifications of the document updates published by application servers, the message queuing subsystem subscribed to by the second indexing subsystem.

3. The system of claim 2, wherein the second indexing subsystem further comprises a daemon coordinator to coordinate the processing of the event notifications of the document updates in the message queue by indexing daemons.

4. The system of claim 1, wherein the second indexing subsystem further comprises a process document module for retrieving at least a portion of the index information from a database and an accumulator module for generating accumulated index information to be consumed and served by the plurality of query node servers.

5. The system of claim 4, wherein the database is a Hadoop database.

6. A computer implemented method for rebuilding indexing information in near real time, the method comprising: generating, by a first indexing subsystem, full indexes to sending full indexes by the first indexing system by a plurality of query node servers via a distribution system, the first indexing system representing a map-reduced indexing system; generating, by a second indexing subsystem, mini-indexes associated with the full indexes; sending the mini-indexes, by the second indexing system, indirectly via the distribution system or directly to the plurality of query node servers to publish the document updates by the plurality of query node servers, the second indexing subsystem representing a daemon-based indexing system that monitors for and receives in near real-time event notifications, the event notifications based on a priority ordering of events by application servers; updating the indexing information at the plurality of query node servers based on the mini-indexes and the full indexes; and publishing the updated indexing information to one or more publishing channels in the distribution system based on a normal priority document update, and publish the updated indexing information to the query nodes servers based on a higher priority document update.

7. The method of claim 6, further comprising: updating index information associated with normal priority document updates associated with the full-indexes and high priority document updates associated with the mini-indexes at the plurality of query node servers; receiving a search query, over a network, from a client machine and identifying search results based on the query and the updated index information; and communicating the search results, over the network, to the client machine.

8. The method of claim 6, further comprising processing the event notifications of the document updates in a message queue by indexing daemons and a daemon coordinator.

9. The method of claim 8, further comprising receiving the event notifications of the document updates as re-ordered based on the priority ordering of events by the application servers.

10. The method of claim 6, wherein retrieving the index information further comprises retrieving the index information from a Hadoop database.

11. The system of claim 1, wherein the mini-indexes includes information identifying which of the mini-indexes is paired with which of the full indexes.

12. The system of claim 1, wherein the application servers prioritizes business event streams based on event types.

13. The system of claim 1, wherein the application servers generates a dual stream of output data, the dual stream including event notifications received by the second indexing system and actual data included in the events received by the first indexing system.

14. The system of claim 1, wherein the application servers processes events that has been re-ordered based on a priority scheme.

15. The system of claim 1, wherein the second indexing system delivers the mini-indexes to one or more channels.

16. The system of claim 1, wherein the first indexing system generates the full indexes at a fixed schedule for the full indexes and the second indexing system generates the mini-indexes on-demand or at a fixed schedule for the mini-indexes.
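The two delivery paths recited in claims 1 and 6, together with the pairing of mini-indexes to full indexes in claim 11, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the claimed system: every name here (`FullIndex`, `MiniIndex`, `QueryNode`, `DistributionSystem`, `route`) is hypothetical, a single in-memory list stands in for the publishing channels, and ordering between the direct path and the channel path is not handled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FullIndex:
    index_id: str
    docs: dict  # doc id -> text, as built by a batch indexing job


@dataclass
class MiniIndex:
    full_index_id: str  # identifies the paired full index (cf. claim 11)
    docs: dict          # doc id -> updated text
    high_priority: bool = False


@dataclass
class QueryNode:
    full: Optional[FullIndex] = None
    minis: list = field(default_factory=list)

    def load_full(self, full):
        """Install a new full index and drop mini-indexes paired with the old one."""
        self.full = full
        self.minis = []

    def apply_mini(self, mini):
        """Accept a mini-index only if it is paired with the loaded full index."""
        if self.full is None or mini.full_index_id != self.full.index_id:
            return False
        self.minis.append(mini)
        return True

    def lookup(self, doc_id):
        """The most recently applied mini-index wins over the full index."""
        for mini in reversed(self.minis):
            if doc_id in mini.docs:
                return mini.docs[doc_id]
        return self.full.docs.get(doc_id) if self.full else None


class DistributionSystem:
    """Hypothetical stand-in for the publishing channels."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.channel = []  # mini-indexes waiting for the next distribution cycle

    def enqueue(self, mini):
        self.channel.append(mini)

    def flush(self):
        """Deliver every queued mini-index to every node; return how many."""
        delivered = len(self.channel)
        for mini in self.channel:
            for node in self.nodes:
                node.apply_mini(mini)
        self.channel.clear()
        return delivered


def route(mini, distribution):
    """Send higher-priority updates directly to nodes, others via the channel."""
    if mini.high_priority:
        for node in distribution.nodes:
            node.apply_mini(mini)
        return "direct"
    distribution.enqueue(mini)
    return "channel"
```

In this sketch a higher-priority update is visible on the query nodes immediately, while a normal-priority update becomes visible only after `flush()` runs, which is one way to read the distinction the claims draw between publishing to the query node servers and publishing to the publishing channels.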

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9805078B2 cover?
Methods and systems to build and utilize a search infrastructure are described. The system generates index information, including document updates and indexes. The system receives event notifications as the document updates are received and accumulates the index information until published. A query engine receives a search query from a client machine and identifies search results based on the q…
Who is the assignee on this patent?
Ebay Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30321. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 31, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).