Method and system for ranking search content

US2016335263A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016335263-A1
Application number  US-201514959122-A
Country             US
Kind code           A1
Filing date         Dec 4, 2015
Priority date       May 15, 2015
Publication date    Nov 17, 2016
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present teaching relates to ranking search content. In one example, a plurality of documents is received to be ranked with respect to a query. Features are extracted from the query and the plurality of documents. The plurality of documents is ranked based on a ranking model and the extracted features. The ranking model is derived to remove one or more documents from the plurality of documents that are less relevant to the query and order remaining documents based on their relevance to the query. The ordered remaining documents are provided as a search result with respect to the query.
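
The flow in the abstract (extract features, score each document, drop the less relevant ones, order the rest) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Document` fields, the two toy features, the linear-plus-logistic scoring function, and the weight and threshold values.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Document:
    url: str
    text: str
    popularity: float  # hypothetical popularity signal in [0, 1]


def extract_features(query: str, doc: Document) -> List[float]:
    """Toy features: fraction of query terms found in the document, plus popularity."""
    query_terms = set(query.lower().split())
    doc_terms = set(doc.text.lower().split())
    overlap = len(query_terms & doc_terms) / len(query_terms) if query_terms else 0.0
    return [overlap, doc.popularity]


def score(features: List[float], weights: List[float], bias: float) -> float:
    """Stand-in ranking model: logistic function of a weighted feature sum."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def rank(query: str, docs: List[Document], weights: List[float],
         bias: float, threshold: float) -> List[Tuple[float, Document]]:
    """Score every document, drop those below the threshold, sort the rest."""
    scored = [(score(extract_features(query, d), weights, bias), d) for d in docs]
    kept = [(s, d) for s, d in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


docs = [
    Document("a", "python ranking tutorial", 0.9),
    Document("b", "cooking recipes", 0.1),
    Document("c", "ranking search results", 0.5),
]
result = rank("python ranking", docs, weights=[4.0, 1.0], bias=-2.0, threshold=0.5)
print([d.url for _, d in result])  # prints ['a', 'c']
```

In this sketch a single score drives both steps: documents scoring below the threshold are removed, and the survivors are sorted by the same score.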

First claim

Opening claim text (preview).

We claim:

1. A method, implemented on a machine having at least one processor, storage, and a communication platform connected to a network for ranking search content, comprising: receiving a plurality of documents to be ranked with respect to a query; extracting features from the query and the plurality of documents; ranking the plurality of documents based on a ranking model and the extracted features, wherein the ranking model is derived to remove one or more documents from the plurality of documents that are less relevant to the query and order remaining documents based on their relevance to the query; and providing the ordered remaining documents as a search result with respect to the query.

2. The method of claim 1, further comprising: ranking the plurality of documents based on a score representing a degree of relevance between the query and each of the plurality of documents, wherein the score is calculated based on the ranking model and the extracted features; and filtering out the one or more documents from the plurality of documents that have scores less than a predetermined threshold.

3. The method of claim 1, wherein the features extracted from the query and the plurality of documents represent at least one of the following: a user profile associated with a user submitting the query; a popularity online for each of the plurality of documents, a textual relevance between each document and the query; and the user's typical click behavior regarding a document and the query.

4. The method of claim 1, further comprising training the ranking model with a logistic loss function based on train data related to a plurality of query/URL pairs stored in a database.

5. The system of claim 4, wherein training the ranking model comprises: obtaining assessment data associated with the plurality of query/URL pairs; determining a target score for each of the plurality of query/URL pairs based on the assessment data to classify the plurality of query/URL pairs; retrieving features of each of the plurality of query/URL pairs from the database; and training the ranking model based on the target score and the feature of each of the plurality of query/URL pairs.

6. The method of claim 5, wherein the plurality of query/URL pairs is classified into two groups: a first group including query/URL pairs each of which has a first target score representing a higher relevance between query and URL in the pair, and a second group including query/URL pairs each of which has a second target score representing a lower relevance between query and URL in the pair.

7. The method of claim 1, wherein the ranking model is trained based on a gradient boosting algorithm.

8. A system having at least one processor, storage, and a communication platform connected to a network for ranking search content, comprising: a query and document analyzer configured for receiving a plurality of documents to be ranked with respect to a query; a feature extractor configured for extracting features from the query and the plurality of documents; a search result ranking unit configured for ranking the plurality of documents based on a ranking model and the extracted features, wherein the ranking model is derived to remove one or more documents from the plurality of documents that are less relevant to the query and order remaining documents based on their relevance to the query; and a search result filter configured for providing the ordered remaining documents as a search result with respect to the query.

9. The system of claim 8, wherein: the search result ranking unit is configured for ranking the plurality of documents based on a score representing a degree of relevance between the query and each of the plurality of documents, wherein the score is calculated based on the ranking model and the extracted features; and the search result filter is configured for filtering out the one or more documents from the plurality of documents that have scores less than a predetermined threshold.

10. The system of claim 8, wherein the features extracted from the query and the plurality of documents represent at least one of the following: a user profile associated with a user submitting the query; a popularity online for each of the plurality of documents, a textual relevance between each document and the query; and the user's typical click behavior regarding a document and the query.

11. The system of claim 8, further comprising a ranking model training engine configured for training the ranking model with a logistic loss function based on train data related to a plurality of query/URL pairs stored in a database.

12. The system of claim 11, wherein the ranking model training engine comprises: an assessment obtaining unit configured for obtaining assessment data associated with the plurality of query/URL pairs; a target score determiner configured for determining a target score for each of the plurality of query/URL pairs based on the assessment data to classify the plurality of query/URL pairs; a feature retriever configured for retrieving features of each of the plurality of query/URL pairs from the database; and a ranking model training unit configured for training the ranking model based on the target score and the feature of each of the plurality of query/URL pairs.

13. The system of claim 12, wherein the plurality of query/URL pairs is classified into two groups: a first group including query/URL pairs each of which has a first target score representing a higher relevance between query and URL in the pair, and a second group including query/URL pairs each of which has a second target score representing a lower relevance between query and URL in the pair.

14. The system of claim 8, wherein the ranking model is trained based on a gradient boosting algorithm.

15. A machine-readable, non-transitory and tangible medium having information recorded thereon for ranking search content, the information, when read by the machine, causes the machine to perform the following: receiving a plurality of documents to be ranked with respect to a query; extracting features from the query and the plurality of documents; ranking the plurality of documents based on a ranking model and the extracted features, wherein the ranking model is derived to remove one or more documents from the plurality of documents that are less relevant to the query and order remaining documents based on their relevance to the query; and providing the ordered remaining documents as a search result with respect to the query.

16. The medium of claim 15, wherein the information, when read by the machine, further causes the machine to perform the following: ranking the plurality of documents based on a score representing a degree of relevance between the query and each of the plurality of documents, wherein the score is calculated based on the ranking model and the extracted features; and filtering out the one or more documents from the plurality of documents that have scores less than a predetermined threshold.

17. The medium of claim 15, wherein the features extracted from the query and the plurality of documents represent at least one of the following: a user profile associated with a user submitting the query; a popularity online for each of the plurality of documents, a textual relevance between each document and the query; and the user's typical click behavior regarding a document and the query.

18. The medium of claim 15, wherein
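
The training steps in claims 4 to 6 (map assessment data to one of two target scores, then fit a model with a logistic loss) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation. The grade labels, the +1/-1 target values, and the feature values are hypothetical. A linear model fitted by plain gradient descent stands in for the gradient boosting algorithm named in claim 7, which would instead fit an ensemble of weak learners to the negative gradient of the same loss.

```python
import math
from typing import List, Tuple

# Hypothetical grade labels. The claims only say that assessment data is mapped
# to a higher-relevance target score and a lower-relevance target score.
HIGH_RELEVANCE_GRADES = {"perfect", "excellent", "good"}


def target_score(grade: str) -> int:
    """Classify a query/URL pair into one of two groups: +1 (higher) or -1 (lower)."""
    return 1 if grade.lower() in HIGH_RELEVANCE_GRADES else -1


def logistic_loss(y: int, f: float) -> float:
    """Logistic loss log(1 + exp(-y * f)) for target y in {-1, +1} and model output f."""
    return math.log(1.0 + math.exp(-y * f))


def train(examples: List[Tuple[List[float], int]],
          steps: int = 2000, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit a linear scorer by gradient descent on the mean logistic loss.

    Stand-in for gradient boosting: same loss, much simpler model.
    """
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(examples)
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in examples:
            f = bias + sum(w * xi for w, xi in zip(weights, x))
            # d/df log(1 + exp(-y f)) = -y / (1 + exp(y f))
            g = -y / (1.0 + math.exp(y * f))
            for i, xi in enumerate(x):
                grad_w[i] += g * xi
            grad_b += g
        weights = [w - lr * gw / m for w, gw in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Hypothetical features per query/URL pair: [textual relevance, popularity].
assessed = [
    ([0.9, 0.8], "perfect"),
    ([0.8, 0.3], "good"),
    ([0.2, 0.6], "fair"),
    ([0.1, 0.1], "bad"),
]
examples = [(x, target_score(grade)) for x, grade in assessed]
weights, bias = train(examples)
outputs = [bias + sum(w * xi for w, xi in zip(weights, x)) for x, _ in examples]
```

With binary targets, the sign of the model output separates the two groups, which is what allows a fixed score threshold to act as a relevance filter at query time.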

Assignees

Yahoo Inc

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • using ranking · CPC title

  • Ensemble learning · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016335263A1 cover?
It covers ranking search content. A plurality of documents is received to be ranked with respect to a query, and features are extracted from the query and the documents. A ranking model uses those features to remove documents that are less relevant to the query and to order the remaining documents by relevance. The ordered remaining documents are provided as the search result.
Who is the assignee on this patent?
Yahoo Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24578. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 17, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).