Document search system, document search method, program, and non-transitory computer readable storage medium

US11789953B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11789953-B2
Application number: US-201916979197-A
Country: US
Kind code: B2
Filing date: Mar 13, 2019
Priority date: Mar 23, 2018
Publication date: Oct 17, 2023
Grant date: Oct 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A highly accurate document search, particularly a search for a document relating to intellectual property, is achieved with an easy input method. A document search system includes a processing portion. The processing portion has a function of extracting a keyword included in text data, a function of extracting a related term of the keyword from words included in a plurality of pieces of first reference text analysis data, a function of giving a weight to each of the keyword and the related term, a function of giving a score to each of a plurality of pieces of second reference text analysis data on the basis of the weight, a function of ranking the plurality of pieces of second reference text analysis data on the basis of the score to generate ranking data, and a function of outputting the ranking data.
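
The processing steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The weighting follows the wording of claim 1 of this patent (a keyword weight based on inverse document frequency, and a related-term weight equal to the keyword weight multiplied by a similarity value), but the specific choices are assumptions made for illustration: the log(N / df) form of IDF, cosine similarity as the similarity degree, summing weights of matching terms as the score, and the hypothetical names `idf`, `cosine`, `search`, `n_keywords`, and `n_related`. The toy vectors stand in for distributed representation vectors and are not learned. The "compiled weight" of claim 1 is not modeled.

```python
import math

def idf(term, docs):
    """Inverse document frequency, assuming the common log(N / df) form."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def search(query_tokens, first_docs, second_docs, vectors, n_keywords=1, n_related=1):
    # Keyword extraction: keep the query tokens with the highest IDF.
    candidates = sorted(set(query_tokens), key=lambda t: (-idf(t, first_docs), t))
    keywords = candidates[:n_keywords]

    # Keyword weight: IDF of the keyword in the first reference data.
    weights = {kw: idf(kw, first_docs) for kw in keywords}

    # Related terms: words from the first reference data whose vectors are
    # closest to the keyword vector; weight = keyword weight * similarity.
    vocabulary = {w for doc in first_docs for w in doc}
    for kw in keywords:
        if kw not in vectors:
            continue
        sims = sorted(
            ((cosine(vectors[kw], vectors[w]), w)
             for w in vocabulary if w != kw and w in vectors),
            key=lambda pair: (-pair[0], pair[1]),
        )
        for sim, word in sims[:n_related]:
            weights.setdefault(word, weights[kw] * sim)

    # Scoring (assumed): sum the weights of the terms each document contains.
    scores = [sum(w for term, w in weights.items() if term in doc)
              for doc in second_docs]

    # Ranking: highest score first, ties broken by document index.
    ranking = sorted(enumerate(scores), key=lambda pair: (-pair[1], pair[0]))
    return ranking, weights

# Toy, already-tokenized reference data (used as both first and second data).
docs = [
    ["transistor", "oxide", "semiconductor"],
    ["battery", "electrode"],
    ["display", "semiconductor", "panel"],
    ["battery", "display"],
]
# Hand-made 2-D vectors standing in for learned distributed representations.
vectors = {
    "oxide": [1.0, 0.0],
    "semiconductor": [0.8, 0.6],
    "display": [0.6, 0.8],
    "battery": [0.0, 1.0],
}

ranking, weights = search(["oxide", "display"], docs, docs, vectors)
for index, score in ranking:
    print(index, round(score, 3))
```

In this toy run "oxide" is selected as the keyword because it is the rarer query token, "semiconductor" becomes its related term, and the document containing both terms ranks first. A real system would tokenize text by morphological analysis and use vectors produced by a trained model, as the claims describe.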

First claim

Opening claim text (preview).

The invention claimed is:

1. A document search system comprising: a processing portion, wherein the processing portion is configured to extract a keyword included in text data, to extract a related term of the keyword from words included in a plurality of pieces of first reference text analysis data, to give a weight to each of the keyword and the related term, to give a score to each of a plurality of pieces of second reference text analysis data on the basis of the weight, to rank the plurality of pieces of second reference text analysis data on the basis of the score to generate a ranking data, and to output the ranking data, wherein the related term is extracted from the words included in the plurality of pieces of first reference text analysis data, on the basis of a similarity degree or a proximity of distance between distributed representation vectors of the words and a distributed representation vector of the keyword, wherein the weight of the keyword is a value based on an inverse document frequency of the keyword in the plurality of pieces of first reference text analysis data or the plurality of pieces of second reference text analysis data, wherein the weight of the related term is a product of the weight of the keyword by a value based on a similarity degree or a distance between a distributed representation vector of the related term and a distributed representation vector of the keyword, and wherein the processing portion is configured to give a compiled weight to each of the keyword and the related term, to give a score to each of the plurality of pieces of second reference text analysis data on the basis of the compiled weight.

2. The document search system according to claim 1, wherein the score is given to the second reference text analysis data including a word matching the keyword or the related term.

3. The document search system according to claim 1, wherein the plurality of pieces of first reference text analysis data is the same as the plurality of pieces of second reference text analysis data.

4. The document search system according to claim 1, wherein the related term is extracted using a distributed representation vector obtained by machine learning performed on distributed representation of the words included in the plurality of pieces of first reference text analysis data.

5. The document search system according to claim 1, wherein each of the distributed representation vectors of the words is a vector generated with use of a neural network.

6. The document search system according to claim 1, wherein extraction of the keyword included in text data comprises a morphological analysis on the text data to generate analysis data and extraction of the keyword from the analysis data, and wherein the keyword is extracted from words included in the analysis data on the basis of an inverse document frequency in the plurality of pieces of first reference text analysis data or the plurality of pieces of second reference text analysis data.

7. The document search system according to claim 1, wherein the weight is changeable by a user.

8. The document search system according to claim 1, wherein the first reference text analysis data is data generated by performing morphological analysis on first reference text data, and wherein the second reference text analysis data is data generated by performing morphological analysis on second reference text data.

9. The document search system according to claim 1, further comprising: an electronic device; and a server, wherein the electronic device comprises a first communication portion, wherein the server comprises the processing portion and a second communication portion, wherein the first communication portion is configured to supply the text data to the server through one or both of wire communication and wireless communication, wherein the processing portion is configured to supply the ranking data to the second communication portion, and wherein the second communication portion is configured to supply the ranking data to the electronic device through one or both of wire communication and wireless communication.

10. The document search system according to claim 1, wherein the processing portion comprises a transistor, and wherein the transistor comprises a metal oxide in a channel formation region.

11. The document search system according to claim 1, wherein the processing portion comprises a transistor, and wherein the transistor comprises silicon in a channel formation region.

12. The document search system according to claim 1, wherein the related term is configured to be included by synonyms of the keyword.

13. The document search system according to claim 1, wherein the processing portion is configured to output the weight of the keyword and the weight of the related term.

14. A document search method comprising the steps of: extracting a keyword included in text data; extracting a related term of the keyword from words included in a plurality of pieces of first reference text analysis data; giving a weight to each of the keyword and the related term; giving a score to each of a plurality of pieces of second reference text analysis data on the basis of the weight; ranking the plurality of pieces of second reference text analysis data on the basis of the score to generate a ranking data; outputting the ranking data; giving a compiled weight to each of the keyword and the related term; and giving a score to each of the plurality of pieces of second reference text analysis data on the basis of the compiled weight, wherein the related term is extracted from the words included in the plurality of pieces of first reference text analysis data, on the basis of a similarity degree or a proximity of distance between distributed representation vectors of the words and a distributed representation vector of the keyword, wherein the weight of the keyword is a value based on an inverse document frequency of the keyword in the plurality of pieces of first reference text analysis data or the plurality of pieces of second reference text analysis data, and wherein the weight of the related term is a product of the weight of the keyword by a value based on a similarity degree or a distance between a distributed representation vector of the related term and a distributed representation vector of the keyword.

15. The document search method according to claim 14, wherein the score is given to the second reference text analysis data including a word matching the keyword or the related term.

16. The document search method according to claim 14, wherein the plurality of pieces of first reference text analysis data is the same as the plurality of pieces of second reference text analysis data.

17. The document search method according to claim 14, wherein the related term is extracted using a distributed representation vector obtained by machine learning performed on distributed representation of the words included in the plurality of pieces of first reference text analysis data.

18. The document search method according to claim 14, wherein each of the distributed representation vectors of the words is a vector generated with use of a neural network.

19. The document search method according to claim 14, wherein the step of extracting the keyword included in the text data comprises the steps of: performing morphological analysis on the text data to generate analysis data; and extracting the keyword from wo

Assignees

Semiconductor Energy Lab

Inventors

Classifications

  • Supervised learning · CPC title

  • Feedforward networks · CPC title

  • Learning methods · CPC title

  • using electronic means · CPC title

  • G06F40/279 (Primary)

    Recognition of textual entities · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11789953B2 cover?
A highly accurate document search, particularly a search for a document relating to intellectual property, is achieved with an easy input method. A document search system includes a processing portion. The processing portion has a function of extracting a keyword included in text data, a function of extracting a related term of the keyword from words included in a plurality of pieces of first r…
Who is the assignee on this patent?
Semiconductor Energy Lab
What technology area does this patent fall under?
Primary CPC classification G06F40/279. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 17, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).