US2019066663A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2019066663-A1 |
| Application number | US-201715684042-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 23, 2017 |
| Priority date | Aug 23, 2017 |
| Publication date | Feb 28, 2019 |
| Grant date | — |
Official abstract text for this publication.
A recurrent neural network (RNN) is trained to identify split positions in long content, wherein each split position is a position at which the theme of the long content changes. Each sentence in the long content is converted to a vector that corresponds to the meaning of the sentence. The sentence vectors are used as inputs to the RNN. The high-probability split points determined by the RNN may be combined with contextual cues to determine the actual split point to use. The split points are used to generate thematic segments of the long content. The multiple thematic segments may be presented to a user along with a topic label for each thematic segment. Each topic label may be generated based on the words contained in the corresponding thematic segment.
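The flow described in the abstract (sentence vectors in, per-position split scores out, scores compared against a threshold) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the trained RNN is replaced by a hypothetical stand-in scorer (`cosine_gap_scores`, one minus the cosine similarity of neighbouring sentence vectors), and the names `sentence_vector`, `sliding_windows`, `split_positions`, the toy word vectors, the window size, and the threshold are all assumptions made for the example.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def sentence_vector(sentence: str, word_vectors: Dict[str, Vector], dim: int) -> Vector:
    """Average the vectors of the known words in a sentence (zero vector if none)."""
    known = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def sliding_windows(n_items: int, size: int) -> List[range]:
    """Same-size index windows shifted by one, so neighbouring windows overlap."""
    size = min(size, n_items)
    if size < 2:
        return []
    return [range(start, start + size) for start in range(n_items - size + 1)]


def cosine_gap_scores(window: Sequence[Vector]) -> List[float]:
    """Stand-in for the trained RNN (assumption): one score per gap in the window."""
    def cosine(a: Vector, b: Vector) -> float:
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

    return [1.0 - cosine(a, b) for a, b in zip(window, window[1:])]


def split_positions(
    vectors: Sequence[Vector],
    score_window: Callable[[Sequence[Vector]], List[float]],
    size: int,
    threshold: float,
) -> List[int]:
    """Return indices i where a split is placed immediately before sentence i."""
    best: Dict[int, float] = {}
    for window in sliding_windows(len(vectors), size):
        scores = score_window([vectors[i] for i in window])
        for offset, score in enumerate(scores):
            position = window.start + offset + 1
            # Keep the highest score seen for a position across overlapping windows.
            best[position] = max(best.get(position, 0.0), score)
    return sorted(p for p, score in best.items() if score > threshold)


# Toy two-dimensional word vectors (hypothetical data for illustration only).
toy_word_vectors = {
    "cat": [1.0, 0.0], "dog": [1.0, 0.0],
    "stock": [0.0, 1.0], "bond": [0.0, 1.0],
}
toy_sentences = ["cat dog", "dog cat", "stock bond", "bond stock"]
toy_vectors = [sentence_vector(s, toy_word_vectors, dim=2) for s in toy_sentences]
toy_splits = split_positions(toy_vectors, cosine_gap_scores, size=3, threshold=0.5)
```

In a real system the `score_window` argument would wrap a trained recurrent model that emits a split probability per position; the surrounding windowing and thresholding logic would stay the same shape.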
Opening claim text (preview).
What is claimed is:
1. A system comprising: a memory that stores instructions; one or more processors configured by the instructions to perform operations comprising: accessing a plurality of sentences; generating a plurality of sentence vectors, each sentence vector of the plurality of sentence vectors corresponding to a respective sentence of the plurality of sentences; providing a subset of the plurality of sentence vectors as an input to a recurrent neural network (RNN); based on an output of the RNN responsive to the input, determining that a subset of the plurality of sentences relate to a first topic; and providing an output comprising the subset of the plurality of sentences related to the first topic.
2. The system of claim 1, wherein the generating of each sentence vector of the plurality of sentence vectors comprises: accessing a plurality of word vectors, each word vector of the plurality of word vectors corresponding to a respective word of the sentence corresponding to the sentence vector; and averaging the plurality of word vectors to generate the sentence vector.
3. The system of claim 1, wherein the operations further comprise: accessing an audio file; and generating the plurality of sentences from the audio file using speech-to-text conversion.
4. The system of claim 1, wherein: the operations further comprise: providing a second subset of the plurality of sentence vectors as a second input to the RNN; and the determining that the subset of the plurality of sentences relate to the first topic is further based on a second output of the RNN responsive to the second input.
5. The system of claim 4, wherein: the second subset of the plurality of sentence vectors has a same number of sentence vectors as the first subset of the plurality of sentence vectors; the second subset of the plurality of sentence vectors has at least one vector in common with the first subset of the plurality of sentence vectors; and the second subset of the plurality of sentence vectors has at least one vector different from each vector in the first subset of the plurality of sentence vectors.
6. The system of claim 1, wherein: the determining that the subset of the plurality of sentences relate to the first topic comprises: comparing each value of a plurality of output values from the RNN to a predetermined threshold, each output value corresponding to a possible split position indicating a split between the first topic and a second topic; and the determining that the subset of the plurality of sentences relate to the first topic is based on results of the comparisons.
7. The system of claim 1, wherein: the plurality of sentences are embedded in a file that includes a paragraph change indicator at a position within the plurality of sentences; and the determining that the subset of the plurality of sentences relate to the first topic is based on the position of the paragraph change indicator.
8. The system of claim 1, wherein: the plurality of sentences are embedded in a file that includes a header indicator at a position within the plurality of sentences; and the determining that the subset of the plurality of sentences relate to the first topic is based on the position of the header indicator.
9. The system of claim 1, wherein the operations further comprise: determining a set of words comprised by the subset of the plurality of sentences; and generating a name of the first topic based on the set of words.
10. The system of claim 1, wherein: the operations further comprise: accessing a uniform resource locator (URL); accessing a media file using the URL; generating the plurality of sentences by using speech-to-text conversion on the media file; and identifying a second subset of the plurality of sentences related to a second topic, using the RNN; the output comprising the subset of the plurality of sentences related to the first topic is a first media file; and the operations further comprise: generating a first name for the first topic; generating a second media file comprising the second subset of the plurality of sentences related to the second topic; generating a second name for the second topic; and providing a user interface that includes the first name, a link to the first media file, the second name, and a link to the second media file.
11. A method comprising: accessing, by one or more processors, a plurality of sentences; generating, by the one or more processors, a plurality of sentence vectors, each sentence vector of the plurality of sentence vectors corresponding to a respective sentence of the plurality of sentences; providing, by the one or more processors, a subset of the plurality of sentence vectors as an input to a recurrent neural network (RNN); based on an output of the RNN responsive to the input, determining, by the one or more processors, that a subset of the plurality of sentences relate to a first topic; and providing, by the one or more processors an output comprising the subset of the plurality of sentences related to the first topic.
12. The method of claim 11, wherein the generating of each sentence vector of the plurality of sentence vectors comprises: accessing a plurality of word vectors, each word vector of the plurality of word vectors corresponding to a respective word of the sentence corresponding to the sentence vector; and averaging the plurality of word vectors to generate the sentence vector.
13. The method of claim 11, further comprising: accessing an audio file; and generating the plurality of sentences from the audio file using speech-to-text conversion.
14. The method of claim 11, wherein: the method further comprises: providing a second subset of the plurality of sentence vectors as a second input to the RNN; and the determining that the subset of the plurality of sentences relate to the first topic is further based on a second output of the RNN responsive to the second input.
15. The method of claim 14, wherein: the second subset of the plurality of sentence vectors has a same number of sentence vectors as the first subset of the plurality of sentence vectors; the second subset of the plurality of sentence vectors has at least one vector in common with the first subset of the plurality of sentence vectors; and the second subset of the plurality of sentence vectors has at least one vector different from each vector in the first subset of the plurality of sentence vectors.
16. The method of claim 11, wherein: the determining that the subset of the plurality of sentences relate to the first topic comprises: comparing each value of a plurality of output values from the RNN to a predetermined threshold, each output value corresponding to a possible split position indicating a split between the first topic and a second topic; and the determining that the subset of the plurality of sentences relate to the first topic is based on results of the comparisons.
17. The method of claim 11, wherein: the plurality of sentences are embedded in a file that includes a paragraph change indicator at a position within the plurality of sentences; and the determining that the subset of the plurality of sentences relate to the first topic is based on the position of the paragraph change indicator.
18. The method of claim 11, wherein: the plurality of sentences are embedded in a file that includes a header indicator at a position within the plurality of sentences; and
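The claims that tie split decisions to paragraph or header indicators, and that name a topic from the words of its segment, do not say how either step is carried out. One possible reading, sketched under stated assumptions: candidate split positions are moved to a nearby contextual cue when one lies within a small tolerance, and a segment label is built from its most frequent non-stopword words. The functions `snap_to_cues`, `segments`, and `topic_label`, the `max_shift` tolerance, the stopword list, and the toy sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Sequence

# Tiny illustrative stopword list (assumption, not from the source).
STOPWORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "are", "on"})


def snap_to_cues(candidates: Sequence[int], cue_positions: Sequence[int],
                 max_shift: int = 1) -> List[int]:
    """Move each candidate split to the nearest cue within max_shift sentences."""
    snapped = set()
    for position in candidates:
        nearby = [c for c in cue_positions if abs(c - position) <= max_shift]
        if nearby:
            # Prefer the closest cue; break ties toward the earlier position.
            snapped.add(min(nearby, key=lambda c: (abs(c - position), c)))
        else:
            snapped.add(position)
    return sorted(snapped)


def segments(sentences: Sequence[str], splits: Sequence[int]) -> List[List[str]]:
    """Cut the sentence list immediately before each split index."""
    bounds = [0, *splits, len(sentences)]
    return [list(sentences[a:b]) for a, b in zip(bounds, bounds[1:])]


def topic_label(segment: Sequence[str], top_k: int = 2) -> str:
    """Label a segment with its most frequent non-stopword words."""
    counts = Counter(
        word
        for sentence in segment
        for word in sentence.lower().split()
        if word not in STOPWORDS
    )
    # Sort by descending count, then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return " ".join(word for word, _ in ranked[:top_k])


# Toy input: the model proposes a split before sentence 3, but a paragraph
# change indicator sits before sentence 2 (hypothetical data).
toy_text = ["the cat sat", "a cat purred", "stocks fell", "stocks and bonds fell"]
final_splits = snap_to_cues(candidates=[3], cue_positions=[2])
toy_segments = segments(toy_text, final_splits)
toy_labels = [topic_label(segment) for segment in toy_segments]
```

The tokenization here is whitespace-only and ignores punctuation; a production labeler would likely need proper tokenization and a weighting scheme, neither of which the claims specify.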
Recurrent networks, e.g. Hopfield networks · CPC title
Semantic analysis · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Heading extraction; Automatic titling; Numbering · CPC title
Learning methods · CPC title