Who is the assignee on this patent?

Atlassian Pty Ltd, Atlassian Inc

What technology area does this patent fall under?

Primary CPC classification H04L41/5074. Mapped technology areas include Electricity.

When was this patent published?

Publication date Apr 6, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Content discovery systems and methods

US10970314B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10970314-B2
Application number	US-201916370776-A
Country	US
Kind code	B2
Filing date	Mar 29, 2019
Priority date	Dec 21, 2018
Publication date	Apr 6, 2021
Grant date	Apr 6, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described herein is a computer implemented method comprising accessing a document, generating a document vector in respect of the document, and generating a sentence vector for each sentence in the document. The method further comprises calculating a sentence similarity score for each sentence in the document which, for a given sentence, is calculated based on a similarity between the sentence vector for the given sentence and the document vector, and identifying one or more representative document sentences for inclusion in a document summary.
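The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy word-vector table, the regex-based sentence splitter, the use of cosine similarity as the similarity measure, and all function names are assumptions made for illustration only.

```python
import math
import re

# Hypothetical toy "language model": maps a word to a small vector.
# A real system would load trained word embeddings instead.
TOY_WORD_VECTORS = {
    "cats": [1.0, 0.0],
    "purr": [0.9, 0.1],
    "dogs": [0.0, 1.0],
    "bark": [0.1, 0.9],
    "sleep": [0.5, 0.5],
}


def split_sentences(text):
    """Naive sentence tokenizer: split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def text_vector(text):
    """Sum the vectors of every known word in the text (unknown words are skipped)."""
    total = [0.0, 0.0]
    for word in re.findall(r"[a-z]+", text.lower()):
        vec = TOY_WORD_VECTORS.get(word)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def representative_sentences(text, k=1):
    """Return the k sentences whose vectors are most similar to the document vector."""
    doc_vec = text_vector(text)
    sentences = split_sentences(text)
    scored = [(cosine(text_vector(s), doc_vec), i, s) for i, s in enumerate(sentences)]
    # Highest score first; ties broken by position in the document.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [s for _, _, s in scored[:k]]


if __name__ == "__main__":
    document = "Cats purr. Cats sleep. Dogs bark."
    print(representative_sentences(document, k=1))
```

In this toy example the sentence that mixes both "topics" scores closest to the document vector, which is the intuition behind picking representative sentences by similarity to the whole document.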

First claim

Opening claim text (preview).

What is claimed is:
1. A computer implemented method comprising: accessing a document; generating a document vector in respect of the document; generating a sentence vector for each sentence in the document; calculating a sentence similarity score for each sentence in the document, the sentence similarity for a given sentence being calculated based on a similarity between the sentence vector for the given sentence and the document vector; identifying one or more representative document sentences for inclusion in a document summary, the one or more representative document sentences being identified based on their sentence similarity scores.
2. The computer implemented method of claim 1, further comprising: generating a summary order in which the representative document sentences identified for inclusion in the summary should be presented, wherein the summary order is based on the order in which the identified sentences appear in the document.
3. The computer implemented method of claim 1, wherein generating a sentence vector in respect of a given sentence comprises: identifying relevant words in the given sentence; retrieving, from a language model, word vectors in respect of each relevant word identified in the given sentence; and summing the retrieved word vectors to generate the sentence vector.
4. The computer implemented method of claim 1, wherein generating a sentence vector in respect of a given sentence comprises: identifying relevant words in the given sentence; calculating a weighted word vector for each word identified in the given sentence; and summing the weighted word vectors to generate the sentence vector.
5. The computer implemented method of claim 4, wherein calculating a weighted word vector for a given word comprises: retrieving, from a language model, a word vector in respect of the given word; applying a term frequency-inverse document frequency weighting to the retrieved word vector.
6. The computer implemented method of claim 1, wherein prior to generating a sentence vector for each sentence in the document, the document is tokenized to identify the sentences in the document.
7. The computer implemented method of claim 1, wherein generating a document vector in respect of the document comprises: accessing the document; tokenizing the document to identify document sentences and document words; processing the tokenized document to generate an initial document vector; and normalizing the initial vector to generate the document vector.
8. The computer implemented method of claim 7, wherein processing the tokenized document to generate the initial document vector comprises: retrieving, from a language model, word vectors in respect of each relevant word identified in the document; and summing the retrieved word vectors to generate the initial document vector.
9. The computer implemented method of claim 7, wherein processing the tokenized document to generate the initial document vector comprises: calculating a weighted word vector for each relevant word identified in the document; and summing the weighted word vectors to generate the initial document vector.
10. The computer implemented method of claim 9, wherein calculating a weighted word vector for a given word comprises: retrieving, from a language model, a word vector in respect of the given word; retrieving, from a frequency model, a training set frequency in respect of the given word, the training set frequency in respect of the given word being the frequency of the given word in a training set of data; calculating the frequency of the given word in the document; and applying a term frequency-inverse document frequency weighting to the retrieved word vector, the term frequency being the training set frequency in respect of the given word and the document frequency being the frequency of the given word in the document.
11. A computer system comprising: a processor; a communication interface; and a non-transitory computer-readable storage medium storing sequences of instructions, which when executed by the processor, cause the processor to implement a method comprising: accessing a document; generating a document vector in respect of the document; generating a sentence vector for each sentence in the document; calculating a sentence similarity score for each sentence in the document, the sentence similarity for a given sentence being calculated based on a similarity between the sentence vector for the given sentence and the document vector; identifying one or more representative document sentences for inclusion in the summary, the one or more representative document sentences being identified based on their sentence similarity scores.
12. The computer system of claim 11, wherein the sequences of instructions further cause the processor to implement the method further comprises: generating a summary order in which the representative document sentences identified for inclusion in the summary should be presented, wherein the summary order is based on the order in which the identified sentences appear in the document.
13. The computer system of claim 11, wherein generating a sentence vector in respect of a given sentence comprises: identifying relevant words in the given sentence; retrieving, from a language model, word vectors in respect of each relevant word identified in the given sentence; and summing the retrieved word vectors to generate the sentence vector.
14. The computer system of claim 11, wherein generating a sentence vector in respect of a given sentence comprises: identifying relevant words in the given sentence; calculating a weighted word vector for each word identified in the given sentence; and summing the weighted word vectors to generate the sentence vector.
15. The computer system of claim 14, wherein calculating a weighted word vector for a given word comprises: retrieving, from a language model, a word vector in respect of the given word; applying a term frequency-inverse document frequency weighting to the retrieved word vector.
16. The computer system of claim 11, wherein prior to generating a sentence vector for each sentence in the document, the document is tokenized to identify the sentences in the document.
17. The computer system of claim 11, wherein generating a document vector in respect of the document comprises: accessing the document; tokenizing the document to identify document sentences and document words; processing the tokenized document to generate an initial document vector; and normalizing the initial vector to generate the document vector.
18. The computer system of claim 17, wherein processing the tokenized document to generate the initial document vector comprises: retrieving, from a language model, word vectors in respect of each relevant word identified in the document; and summing the retrieved word vectors to generate the initial document vector.
19. The computer system of claim 17, wherein processing the tokenized document to generate the initial document vector comprises: calculating a weighted word vector for each relevant word identified in the document; and summing the weighted word vectors to generate the initial document vector.
20. The computer system of claim 19, wherein calculating a weighted word vector for a given word comprises: retrieving, from a language model, a word vector in respect of the given word; retrieving, from a frequency model, a training set frequency in respect
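Several dependent claims in the preview above mention weighting word vectors before summing them, normalizing the resulting document vector, and presenting the selected sentences in document order. The Python sketch below illustrates those three steps under stated assumptions: the word vectors and per-word weights are hypothetical placeholder tables (the actual weighting scheme is defined by the claims, not by this code), L2 normalization is assumed because the claim text does not name a specific normalization, and all function names are invented for illustration.

```python
import math

# Hypothetical stand-ins for a language model and a precomputed weighting table.
WORD_VECTORS = {"deploy": [1.0, 0.0], "server": [0.0, 1.0]}
WORD_WEIGHTS = {"deploy": 2.0, "server": 0.5}


def weighted_sum(words):
    """Scale each known word's vector by its weight, then sum the results."""
    total = [0.0, 0.0]
    for word in words:
        if word in WORD_VECTORS:
            weight = WORD_WEIGHTS.get(word, 1.0)  # default weight is an assumption
            total = [t + weight * v for t, v in zip(total, WORD_VECTORS[word])]
    return total


def normalize(vec):
    """L2-normalize a vector (assumed normalization); zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return vec
    return [x / norm for x in vec]


def summary_order(scores, k):
    """Pick the indices of the k highest-scoring sentences, returned in document order."""
    top = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    return sorted(top)


if __name__ == "__main__":
    print(weighted_sum(["deploy", "server", "server"]))
    print(normalize([3.0, 4.0]))
    print(summary_order([0.2, 0.9, 0.5, 0.7], k=2))
```

Sorting the selected indices at the end keeps the summary readable: sentences appear in the same sequence as in the source document rather than in score order.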

Assignees

Inventors

Classifications

G06F40/284
Lexical analysis, e.g. tokenisation or collocates · CPC title
H04L41/5074 · Primary
Handling of user complaints or trouble tickets · CPC title
G06F16/285 · Primary
Clustering or classification · CPC title
G06F40/30
Semantic analysis · CPC title
G06F16/93
Document management systems · CPC title

Patent family

Related publications grouped by family.

View patent family 69528277
