Automatically linking pages in a website

US10990643B2 · US · B2

Patent metadata
Publication number: US-10990643-B2
Application number: US-201815941261-A
Country: US
Kind code: B2
Filing date: Mar 30, 2018
Priority date: Mar 30, 2018
Publication date: Apr 27, 2021
Grant date: Apr 27, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for automatically linking pages in a web site are provided. In one technique, training data for a machine-learned scoring model is generated that comprises a plurality of features related to content items. The training data comprises multiple entries, each corresponding to a different content item in a first set of content items. For each entry, a corresponding label is based on a ranking of the corresponding content item in one or more search engine results. The machine-learned scoring model is trained based on the training data. For each content item in a second set of content items, multiple attribute values associated with that content item are input into the machine-learned scoring model, which generates a result. Based on multiple results, determining, for a particular web page, a strict subset of the second set of content items to which the particular web page will include one or more links.
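
The flow in the abstract (label content items by search ranking, train a scoring model, score a second set of items, link to only some of them) can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the linear model fitted by gradient descent, the rank-to-label mapping, the two feature names (normalized search volume and freshness), and every function and item name below are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias); hypothetical model shape


def label_from_rank(rank: int, max_rank: int = 100) -> float:
    """Map a search-result rank (1 = best) to a label in [0, 1] (assumed scheme)."""
    clipped = min(max(rank, 1), max_rank)
    return 1.0 - (clipped - 1) / (max_rank - 1)


def train_linear_model(rows: Sequence[Sequence[float]], labels: Sequence[float],
                       epochs: int = 2000, lr: float = 0.1) -> Model:
    """Fit weights and a bias by batch gradient descent on squared error."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = sum(w * xi for w, xi in zip(weights, x)) + bias - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def score_item(model: Model, features: Sequence[float]) -> float:
    """Score one content item from its attribute values."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, features)) + bias


def select_links(model: Model, candidates: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the k best-scoring items; k < len(candidates) keeps the subset strict."""
    if not 0 < k < len(candidates):
        raise ValueError("k must select a non-empty strict subset")
    ranked = sorted(candidates, key=lambda cid: score_item(model, candidates[cid]),
                    reverse=True)
    return ranked[:k]


# First set of content items: [search volume, freshness] plus an observed rank.
training = [([1.0, 1.0], 1), ([1.0, 0.0], 34), ([0.0, 1.0], 67), ([0.0, 0.0], 100)]
rows = [features for features, _ in training]
labels = [label_from_rank(rank) for _, rank in training]
model = train_linear_model(rows, labels)

# Second set of content items: candidates a particular page might link to.
candidates = {
    "item-a": [0.9, 0.8],
    "item-b": [0.2, 0.9],
    "item-c": [0.7, 0.1],
    "item-d": [0.1, 0.1],
}
links = select_links(model, candidates, k=2)
print(links)
```

The check on `k` is what makes the result a strict subset: the page never links to every candidate. A production system would likely use a richer model and many more features than this toy fit.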

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: generating training data for a machine-learned scoring model that comprises a plurality of features related to content items, wherein the training data comprises a plurality of entries, each corresponding to a different content item of a first plurality of content items; wherein, for each entry of the plurality of entries, a label in said each entry that corresponds to a content item is based on a ranking of said content item in one or more search engine results; training the machine-learned scoring model based on the training data; for each content item of a second plurality of content items, inputting, into the machine-learned scoring model, multiple attribute values associated with said each content item, wherein the machine-learned scoring model generates a result for said each content item based on the multiple attribute values, wherein the result indicates a score of said each content item; based on a plurality of results generated by the machine-learned scoring model for the second plurality of content items, determining, for a particular web page, a strict subset of the second plurality of content items to which the particular web page will include one or more links; including the one or more links in the particular web page; wherein the method is performed by one or more computing devices.

2. The method of claim 1, wherein the plurality of features is based on two or more of: a bounce rate of a content item, search volume of the content item, average staying time of the content item, number of unique visitors of the content item, freshness of the content item, number of URLs in the content item, number of interactions that visitors have with the content item, number of links in the content item, number of named entities in the content item, current search ranking of the content item, current search ranking page of the content item, or number of internal backlinks of the content item.

3. The method of claim 1, wherein the plurality of features is based on one or more of: a content relevance between a content item and a source page that contains a link to the content item, selection rate between the source page and the content item, a pair-wise bounce rate of the content item and the source page, number of distinct users that requested the content item through the source page.

4. The method of claim 1, wherein the strict subset is a first strict subset, further comprising: determining, for a second web page that is different than the particular web page, a second strict subset of the plurality of content items to which the second web page will include one or more second links; including the one or more seconds links in the second web page; wherein the second strict subset is different than the first strict subset.

5. The method of claim 1, wherein one of the multiple attribute values associated with said each content item of the plurality of content items corresponds to an attribute based on a combination of said each content item and the particular web page.

6. The method of claim 1, further comprising: determining, in a first search result, a first ranking of a particular content item in the first plurality of content items; determining, in a second search result that was generated after the first search result, a second ranking of the particular content item; based on a difference between the first ranking and the second ranking, generating a particular label for the particular content item; including the particular label in the training data prior to training the machine-learned scoring model based on the training data.

7. The method of claim 1, further comprising: for each content item of a third plurality of content items, inputting, into a second machine-learned scoring model, a plurality of attribute values associated with said each content item, wherein the second machine-learned scoring model generates a result for said each content item based on the plurality of attribute values; based on a second plurality of results generated by the second machine-learned scoring model for the third plurality of content items, determining, for the particular web page, a strict subset of the third plurality of content items to which the particular web page will include one or more second links; including the one or more second links in the particular web page.

8. The method of claim 7, wherein the second machine-learned scoring model is different than the machine-learned scoring model.

9. The method of claim 7, wherein the second plurality of content items contain data of a first type and the third plurality of content items contain data of a second type that is different than the first type and do not contain data of the first type.

10. The method of claim 1, further comprising: generating second training data for a second machine-learned scoring model that comprises a second plurality of features related to content items, wherein the second training data comprises a second plurality of entries, each corresponding to a different content item of a third plurality of content items; wherein, for each entry of the second plurality of entries, a label in said each entry that corresponds to a content item, in the third plurality of content items, is based on a ranking of said content item in one or more second search engine results; training the second machine-learned scoring model based on the second training data; for each content item of a fourth plurality of content items, inputting, into the second machine-learned scoring model, a plurality of attribute values associated with said each content item, wherein the second machine-learned scoring model generates a result for said each content item based on the plurality of attribute values; based on a second plurality of results generated by the second machine-learned scoring model for the fourth plurality of content items, determining, for a second web page, a strict subset of the fourth plurality of content items to which the second web page will include one or more second links; including the one or more second links in the second web page.

11. The method of claim 1, further comprising making available, on a website, the particular web page and content items in the strict subset.

12. A method comprising: for each content item of a first plurality of content items, inputting, into a first scoring model, a first plurality of attribute values associated with said each content item, wherein the first scoring model generates a result for said each content item based on the first plurality of attribute values, wherein the result indicates a score of said each content item; based on a first plurality of results generated by the first scoring model for the first plurality of content items, determining, for a particular web page, a strict subset of the first plurality of content items to which the particular web page will include one or more first links; including the one or more first links in the particular web page; for each content item of a second plurality of content items, inputting, into a second scoring model that is different than the first scoring model, a second plurality of attribute values associated with said each content item, wherein the second scoring model generates a result for said each content item based on the second plurality of attribute values; based on a second plurality of results generated by the second scoring model for the second plurality of content items, determining, for the particular web page, a strict subset of the second plurality of content items to which the particular web page will include one or more second links;
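
Two ideas in the claim text lend themselves to a short sketch: deriving a label from the change between two ranking snapshots (claim 6), and using a different scoring model per content type while placing both sets of links on the same page (claims 7 to 9 and 12). The code below is an illustration under stated assumptions, not the claimed method itself: the sign convention and scaling of the label, the content types "article" and "video", the hand-set weights standing in for trained models, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Scorer = Callable[[Sequence[float]], float]  # stand-in for a trained scoring model


def label_from_rank_change(first_rank: int, second_rank: int,
                           max_rank: int = 100) -> float:
    """Label in [-1, 1]; positive when the item moved toward rank 1 (assumed convention)."""
    change = (first_rank - second_rank) / (max_rank - 1)
    return max(-1.0, min(1.0, change))


def strict_top_k(scorer: Scorer, candidates: Dict[str, Sequence[float]],
                 k: int) -> List[str]:
    """Pick the k best-scoring candidates, never all of them."""
    if not 0 < k < len(candidates):
        raise ValueError("k must select a non-empty strict subset")
    ranked = sorted(candidates, key=lambda cid: scorer(candidates[cid]), reverse=True)
    return ranked[:k]


def links_for_page(first_scorer: Scorer, first_items: Dict[str, Sequence[float]],
                   second_scorer: Scorer, second_items: Dict[str, Sequence[float]],
                   k_first: int, k_second: int) -> List[str]:
    """Combine the first links and second links chosen by two different models."""
    return (strict_top_k(first_scorer, first_items, k_first)
            + strict_top_k(second_scorer, second_items, k_second))


# Hand-set weights stand in for two separately trained models (assumption).
def article_scorer(f: Sequence[float]) -> float:
    return 0.7 * f[0] + 0.3 * f[1]


def video_scorer(f: Sequence[float]) -> float:
    return 0.2 * f[0] + 0.8 * f[1]


articles = {"article-1": [0.9, 0.2], "article-2": [0.4, 0.4], "article-3": [0.1, 0.9]}
videos = {"video-1": [0.9, 0.1], "video-2": [0.2, 0.8], "video-3": [0.5, 0.5]}

links = links_for_page(article_scorer, articles, video_scorer, videos,
                       k_first=1, k_second=2)
print(links)
print(label_from_rank_change(first_rank=20, second_rank=5))
```

Keeping the models behind a plain callable means each content type can be trained on its own features without changing the selection step.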

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • Forward inferencing; Production systems · CPC title

  • G06N20/20 · Primary

    Ensemble learning · CPC title

  • G06F16/958 · Primary

    Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10990643B2 cover?
Techniques for automatically linking pages in a web site are provided. In one technique, training data for a machine-learned scoring model is generated that comprises a plurality of features related to content items. The training data comprises multiple entries, each corresponding to a different content item in a first set of content items. For each entry, a corresponding label is based on a ra…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 27, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).