κ-selection using parallel processing

US10649770B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10649770-B2
Application numberUS-201715608593-A
CountryUS
Kind codeB2
Filing dateMay 30, 2017
Priority dateJan 31, 2017
Publication dateMay 12, 2020
Grant dateMay 12, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a method includes accessing a query vector; accessing object vectors; determining input distances corresponding to a distance between the query vector and the object vectors; accessing thread queues; accessing a warp queue; for each of the input distance values: selecting one of the thread queues, when the input distance value is less than a greatest one of the distance values stored in the selected thread queue, inserting the input distance value into the thread queues and ejecting the greatest distance values stored in the thread queue, and when a greatest distance value stored in any of the thread queues is less than a greatest distance value stored in the warp queue, merging the thread queue with the warp queue; identifying the objects represented by an object vector corresponding to the distance values stored in the warp queue; and providing the search results for presentation.

First claim

Opening claim text (preview).

The invention claimed is: 1. A method comprising: by one or more computing devices, accessing a query vector representing a search query entered by a user; by one or more computing devices, accessing a plurality of object vectors that each represent one of a plurality of objects; by one or more computing devices, determining plurality of input distance values that each correspond to a distance between the query vector and one of the object vectors; by one or more computing devices, accessing a plurality of thread queues that initially store thread-queue distance values; by one or more computing devices, accessing a plurality of warp queues in a block, wherein: the warp queues initially store warp-queue distance values; and the block further comprises shared memory for the warp queues; by one or more computing devices, for each of the input distance values: selecting one of the warp queues; selecting one of the thread queues corresponding to the selected one of the warp queues; when the input distance value is less than a greatest one of the distance values stored in the selected one of the thread queues, inserting the input distance value into the selected one of the thread queues and ejecting the greatest one of the distance values stored in the selected one of the thread queues; and when a greatest one of the distance values stored in any of the thread queues is less than a greatest one of the distance values stored in the selected one of the warp queues, merging, within the shared memory in the block, the selected one of the thread queues with the selected one of the warp queues using an odd-size merging network in parallel with one or more other thread queues corresponding to the selected one of the warp queues; by one or more computing devices, identifying one or more objects that are each represented by an object vector corresponding to one of the distance values stored in the warp queues; and by one or more computing devices, providing for presentation to the user one or more search results corresponding to one or more of the identified objects. 2. The method of claim 1 , wherein each of one or more of the warp queues is stored as a lane-stride register array. 3. The method of claim 1 , wherein each of one or more of the thread queues is stored in register memory of a graphics processing unit. 4. The method of claim 1 , wherein each of one or more of the warp queues is stored in shared memory of a graphics processing unit. 5. The method of claim 1 , wherein each of one or more of the warp queues is stored in register memory of a graphics processing unit. 6. The method of claim 1 , wherein accessing the query vector representing the search query entered by the user comprises determining the query vector based on the search query. 7. The method of claim 1 , wherein each object vector is a quantized vector. 8. The method of claim 1 , wherein the each warp queue is a wavefront queue. 9. The method of claim 1 , wherein each thread queue is operated on by a single thread of execution of a processor of the computing device. 10. The method of claim 1 , wherein: each object of the plurality of objects corresponds to at least one node in a graph of a social-networking system; the graph comprises a plurality of nodes and edges connecting the nodes; and the identified objects correspond to nodes in the graph that are responsive to the search query entered by the user. 11. The method of claim 10 , wherein at least one of the plurality of object vectors represents at least one edge connecting the node corresponding to the object vector to another node in the graph. 12. The method of claim 11 , wherein the at least one edge represents one or more of: a friendship; a family relationship; a business or employment relationship; a fan relationship; a follower relationship; a visitor relationship; or a subscriber relationship. 13. The method of claim 1 , wherein: each initially-stored thread-queue distance value is a maximum sentinel distance value; and each initially-stored warp-queue distance value is a maximum sentinel distance value. 14. One or more computer-readable non-transitory storage media embodying software that is operable when executed to: access a query vector representing a search query entered by a user; access a plurality of object vectors that each represent one of a plurality of objects; determine a plurality of input distance values that each correspond to a distance between the query vector and one of the object vectors; access a plurality of thread queues that initially store thread-queue distance values; access a plurality of warp queues in a block, wherein: the warp queues initially store warp-queue distance values; and the block further comprises shared memory for the warp queues; for each of the input distance values: select one of the warp queues; select one of the thread queues corresponding to the selected one of the warp queues; when the input distance value is less than a greatest one of the distance values stored in the selected one of the thread queues, insert the input distance value into the selected one of the thread queues and eject the greatest one of the distance values stored in the selected one of the thread queues; and when a greatest one of the distance values stored in any of the thread queues is less than a greatest one of the distance values stored in the selected one of the warp queues, merge, within the shared memory in the block, the selected one of the thread queues with the selected one of the warp queues using an odd-size merging network in parallel with one or more other thread queues corresponding to the selected one of the warp queues; identify one or more objects that are each represented by an object vector corresponding to one of the distance values stored in the warp queues; and provide for presentation to the user one or more search results corresponding to one or more of the identified objects. 15. The media of claim 14 , wherein each of one or more of the warp queues is stored as a lane-stride register array. 16. The media of claim 14 , wherein each of one or more of the thread queues is stored in register memory of a graphics processing unit. 17. The media of claim 14 , wherein each of one or more of the warp queues is stored in shared memory of a graphics processing unit. 18. The media of claim 14 , wherein each of one or more of the warp queues is stored in register memory of a graphics processing unit. 19. The media of claim 14 , wherein accessing the query vector representing the search query entered by the user comprises determining the query vector based on the search query. 20. The media of claim 14 , wherein each object vector is a quantized vector. 21. The media of claim 14 , wherein each warp queue is a wavefront queue. 22. The media of claim 14 , wherein each thread queue is operated on by a single thread of execution of a processor of the computing device. 23. The media of claim 14 , wherein: each initially-stored thread-queue distance value is a maximum sentinel distance value; and each initially-stored warp-queue distance value is a maximum sentinel distance value. 24. A system comprising: one or more processors at a first client computing device; and a memory at the first client computing device coupled to the processors and comprising instructions operable when executed by the processors to cause the pr

Assignees

Inventors

Classifications

  • G06F9/28Primary

    Enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel · CPC title

  • Updating · CPC title

  • G06F16/437Primary

    Administration of user profiles, e.g. generation, initialisation, adaptation, distribution · CPC title

  • having a sequence of storage locations each being individually accessible for both enqueue and dequeue operations, e.g. using a RAM · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10649770B2 cover?
In one embodiment, a method includes accessing a query vector; accessing object vectors; determining input distances corresponding to a distance between the query vector and the object vectors; accessing thread queues; accessing a warp queue; for each of the input distance values: selecting one of the thread queues, when the input distance value is less than a greatest one of the distance value…
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/28. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue May 12 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).