Modeling interestingness with deep neural networks

US9846836B2 · US · B2

Patent metadata
Publication number: US-9846836-B2
Application number: US-201414304863-A
Country: US
Kind code: B2
Filing date: Jun 13, 2014
Priority date: Jun 13, 2014
Publication date: Dec 19, 2017
Grant date: Dec 19, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An “Interestingness Modeler” uses deep neural networks to learn deep semantic models (DSM) of “interestingness.” The DSM, consisting of two branches of deep neural networks or their convolutional versions, identifies and predicts target documents that would interest users reading source documents. The learned model observes, identifies, and detects naturally occurring signals of interestingness in click transitions between source and target documents derived from web browser logs. Interestingness is modeled with deep neural networks that map source-target document pairs to feature vectors in a latent space, trained on document transitions in view of a “context” and optional “focus” of source and target documents. Network parameters are learned to minimize distances between source documents and their corresponding “interesting” targets in that space. The resulting interestingness model has applicable uses, including, but not limited to, contextual entity searches, automatic text highlighting, prefetching documents of likely interest, automated content recommendation, automated advertisement placement, etc.
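
To make the abstract's idea of scoring targets by their closeness to a source in a latent space more concrete, here is a minimal Python sketch. The three-dimensional vectors are made-up stand-ins for the outputs of the two network branches. The choice of cosine similarity, the softmax over candidates, the smoothing factor `gamma`, and the function names are all assumptions made for illustration; they are not taken from the text above.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def target_likelihoods(source_vec, target_vecs, gamma=10.0):
    """Softmax over scaled similarities: one probability per candidate target.

    gamma is a hypothetical smoothing factor, not a value from the patent.
    """
    scores = [gamma * cosine(source_vec, t) for t in target_vecs]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pair_loss(source_vec, target_vecs, clicked_index, gamma=10.0):
    """Negative log-likelihood of the target the user actually clicked."""
    probs = target_likelihoods(source_vec, target_vecs, gamma)
    return -math.log(probs[clicked_index])

# Hypothetical latent vectors for one source and three candidate targets.
source = [0.9, 0.1, 0.3]
targets = [[0.8, 0.2, 0.4], [-0.5, 0.9, 0.0], [0.1, -0.7, 0.6]]

probs = target_likelihoods(source, targets)
best = max(range(len(targets)), key=lambda i: probs[i])
print("most likely target index:", best)
```

In this sketch, training would adjust the network weights so that `pair_loss` shrinks for observed click transitions, which pulls a source and its clicked target closer together in the latent space.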

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented process, comprising: applying a computer to perform process actions for: receiving a collection of source and target document pairs; identifying a separate context for each source document, the context for each source document comprising a selection within the source document and a window of multiple words in the source document around that selection; identifying a separate context for each target document, the context for each target document comprising a first fixed number of the first words in that target document; mapping each context to a separate vector; mapping each of the vectors to a convolutional layer of a neural network; mapping the convolutional layer to a plurality of hidden layers of the neural network; generating a learned interestingness model by learning weights for each of a plurality of transitions between the layers of the neural network, such that the learned weights minimize a distance between the vectors of the contexts of the source and target documents; the interestingness model configured to determine a conditional likelihood of a user interest in transitioning to an arbitrary target document when that user is consuming an arbitrary source document in view of a context extracted from that arbitrary source document and a context extracted from that arbitrary target document; and applying the interestingness model to recommend one or more arbitrary target documents to the user relative to an arbitrary source document being consumed by the user.

2. The computer-implemented process of claim 1 further comprising: identifying a focus for each source document and each target document; and wherein the separate vectors are constructed by mapping the focus and context of each source document and each target document to the separate vectors.

3. The computer-implemented process of claim 2 wherein the focus of a source document is one or more selected words in the source document.

4. The computer-implemented process of claim 2 wherein the focus of one or more of the target documents is a fixed number of words at the beginning of the target document.

5. The computer-implemented process of claim 1 further comprising applying the learned interestingness model to one or more arbitrary source documents to extract semantic features from those arbitrary source documents.

6. The computer-implemented process of claim 1 further comprising applying the learned interestingness model to one or more arbitrary target documents to extract semantic features from those arbitrary target documents.

7. The computer-implemented process of claim 1 further comprising generating feature vectors from an output layer of the learned interestingness model, and applying those feature vectors as input to train a discriminative model.

8. The computer-implemented process of claim 7 wherein the discriminative model is a boosted tree ranker trained by performing a plurality of iterations of boosting rounds, with each round constructing a regression tree.

9. The computer-implemented process of claim 7 wherein the discriminative model is used to automatically highlight interesting content in an arbitrary document being consumed by the user.

10. The computer-implemented process of claim 7 wherein the discriminative model is used to automatically perform contextual entity searches, for one or more entities automatically identified in an arbitrary document being consumed by the user, for entities likely to be of interest to the user.

11. The computer-implemented process of claim 7 wherein the discriminative model is used to automatically prefetch one or more documents likely to be of interest to a user consuming an arbitrary document.

12. The computer-implemented process of claim 7 wherein the discriminative model is used to automatically recommend one or more items that are likely to be of interest to a user consuming an arbitrary document.

13. The computer-implemented process of claim 1 wherein the neural network is constructed from layers comprising: an input layer comprising vectors derived from the context; the convolutional layer connected to the input layer via a first linear projection matrix, the convolutional layer extracting semantic features from the vectors of the input layer; a max pooling layer connected to the convolutional layer via a max pooling operation; the plurality of hidden layers connected to the max pooling layer via a second linear projection matrix; and an output layer connected to the plurality of hidden layers via a third linear projection matrix.

14. The computer-implemented process of claim 1 wherein the context of one or more of the source documents is one or more anchors in combination with a window of words around the anchor.

15. The computer-implemented process of claim 1 wherein the context of one or more of the source documents is a predefined size window of words around each of a plurality of entities identified in those source documents.

16. A system comprising: a general purpose computing device; and a computer program comprising program modules executable by the computing device, wherein the computing device is directed by the program modules of the computer program to: receive a collection of source and target document pairs; identify a separate focus and a separate context for each source document and each target document; the context of each source document comprising a selection of one or more words within the source document and a window of multiple words in the source document around that selection; the focus of each source document comprising a selected anchor within the source document; the context of each target document comprising a first fixed number of the first words in that target document; the focus of each target document comprising a second fixed number of the first words in that target document, the second fixed number being smaller than the first fixed number; map the words of each focus to a separate vector and the words of each context to a separate vector; for each document, concatenate the corresponding focus and context vectors into a combined vector; map each of the combined vectors to a convolutional layer of a neural network; map the convolutional layer to a hidden layer of the neural network; generate a learned interestingness model by learning weights for each of a plurality of transitions between the layers of the neural network, such that the learned weights minimize a distance between the combined vectors of the source and target documents; the interestingness model configured to determine a conditional likelihood of a user interest in transitioning to an arbitrary target document when that user is consuming an arbitrary source document in view of a context extracted from that arbitrary source document and a context extracted from that arbitrary target document; and applying the interestingness model to recommend one or more arbitrary target documents to the user relative to an arbitrary source document being consumed by the user.

17. The system of claim 16 further comprising generating feature vectors from an output layer of the learned interestingness model, and applying those feature vectors as input to train a discriminative model.

18. The system of claim 16 wherein mapping the words of each focus to a separate vector further comprises forming a one-hot vector and a tri-letter vector for each word in each focus.

19. A computer-readable storage device having computer executable instructio
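
The context, focus, and word-vector steps recited in claims 1, 16, and 18 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the window size, the word counts for the target context and focus, the `#` boundary padding for letter trigrams, the sample sentence, and all function names are hypothetical choices, not values given in the claims.

```python
from collections import Counter

def source_context(words, sel_start, sel_end, window=3):
    """Selected words plus up to `window` words on each side (window size assumed)."""
    lo = max(0, sel_start - window)
    hi = min(len(words), sel_end + window)
    return words[lo:hi]

def target_context(words, first_n=10):
    """A first fixed number of the first words of the target document."""
    return words[:first_n]

def target_focus(words, first_m=3):
    """A second, smaller fixed number of the first words of the target document."""
    return words[:first_m]

def one_hot(word, vocab):
    """One-hot vector for a word over a small, hypothetical vocabulary list."""
    return [1 if word == v else 0 for v in vocab]

def tri_letters(word):
    """Letter trigrams of a word, padded with '#' boundary marks (an assumption)."""
    padded = f"#{word.lower()}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def tri_letter_vector(words):
    """Sparse tri-letter count vector for a span of words."""
    return Counter(t for w in words for t in tri_letters(w))

# Made-up example text; the selection is "brown fox" (word indices 2 and 3).
words = "the quick brown fox jumps over the lazy dog near the river".split()
ctx = source_context(words, 2, 4, window=2)
print(ctx)
print(tri_letters("cat"))
print(tri_letter_vector(target_focus(words)))
```

A sparse counter is used here for the tri-letter vector because most trigrams never occur in a short span; a dense vector indexed by a fixed trigram vocabulary would serve equally well for this illustration.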

Assignees

Microsoft Technology Licensing LLC

Classifications

  • Knowledge-based neural networks; Logical representations of neural networks

  • Query formulation

  • Combinations of networks

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

  • Modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Frequently asked questions

What does patent US9846836B2 cover?
An “Interestingness Modeler” uses deep neural networks to learn deep semantic models (DSM) of “interestingness.” The DSM, consisting of two branches of deep neural networks or their convolutional versions, identifies and predicts target documents that would interest users reading source documents. The learned model observes, identifies, and detects naturally occurring signals of interestingness…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/9032. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 19, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).