US2025068669A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025068669-A1 |
| Application number | US-202418945924-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 13, 2024 |
| Priority date | May 24, 2019 |
| Publication date | Feb 27, 2025 |
| Grant date | — |
Official abstract text for this publication.
The disclosure provides a document search system that enables efficient document search regardless of the user's skill level. The system stores database document data. After first document data and second document data are input, the system extracts a plurality of terms from the first document data, for example by morphological analysis. The extracted terms are then weighted on the basis of the second document data. For example, texts included in the document represented by the second document data are classified into first texts and second texts, and among the terms extracted from the first document data, a term that appears in a first text is given a larger weight than the other terms. The classification can be rule-based or can use machine learning. Finally, the similarity of the database document data to the first document data is calculated on the basis of the weighted terms.
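The extract, weight, and compare flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the method disclosed in the publication: a regex tokenizer stands in for morphological analysis, the `boost` factor of 2.0 is arbitrary, weighted cosine similarity is one possible similarity measure, and all function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def extract_terms(text):
    """Assumed stand-in for morphological analysis: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weight_terms(query_terms, first_text, boost=2.0):
    """Give query terms that also occur in the 'first text' a larger weight."""
    emphasized = set(extract_terms(first_text))
    return {t: (boost if t in emphasized else 1.0) for t in set(query_terms)}


def similarity(weights, query_terms, doc_text):
    """Weighted cosine similarity between the query terms and one document."""
    q = Counter(query_terms)
    d = Counter(extract_terms(doc_text))
    q_vec = {t: count * weights.get(t, 1.0) for t, count in q.items()}
    dot = sum(value * d[t] for t, value in q_vec.items())
    q_norm = math.sqrt(sum(v * v for v in q_vec.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    if q_norm == 0 or d_norm == 0:
        return 0.0
    return dot / (q_norm * d_norm)


# Hypothetical inputs: claims (first document data) and a passage already
# classified as "first text" in the written opinion (second document data).
claims = "a transistor comprising an oxide semiconductor layer and a gate electrode"
first_text = "the oxide semiconductor layer distinguishes the claimed invention"
doc_a = "an oxide semiconductor layer formed over a substrate"
doc_b = "a gate electrode formed over a substrate"

terms = extract_terms(claims)
boosted = weight_terms(terms, first_text)
plain = weight_terms(terms, "")  # no emphasized text, so every weight is 1.0

boosted_a = similarity(boosted, terms, doc_a)
boosted_b = similarity(boosted, terms, doc_b)
plain_a = similarity(plain, terms, doc_a)
plain_b = similarity(plain, terms, doc_b)
print(boosted_a, boosted_b, plain_a, plain_b)
```

In this toy data, boosting the terms that the written opinion stresses widens the gap between the document that mentions the oxide semiconductor layer and the one that does not, which is the effect the weighting step is meant to produce.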
Opening claim text (preview).
1. A semiconductor device having instructions stored thereon which, when executed by one or more processors, cause the one or more processors to perform operations for searching documents, the operations comprising: receiving first document data and second document data; extracting a plurality of terms from the first document data; weighting at least one of the plurality of terms on a basis of the second document data; calculating a similarity of the database document data to the first document data on a basis of the at least one weighted term; and outputting the calculated similarity, wherein, after the plurality of terms are extracted, texts included in a document represented by the second document data are classified using machine learning, wherein the first document data represents a scope of claims of a patent application, wherein the second document data represents a written opinion against a reason for refusal of the patent application, and wherein the second document data represents a document which includes content described in a document represented by the first document data.

2. The semiconductor device according to claim 1, further the operations comprising: classifying texts included in the document represented by the second document data into a first text and a second text, and setting a weight of the term included in the first text larger than a weight of the term not included in the first text among the terms extracted from the first document data.

3. The semiconductor device according to claim 2, further the operations comprising: performing machine learning, and performing the classification of texts on the basis of a learning result of the machine learning.

4. The semiconductor device according to claim 3, further the operations comprising: inputting first learning document data; and performing the machine learning so that output data becomes closer to second learning document data, wherein the first learning document data is the same kind of document data as the second document data, and wherein the second learning document data is document data obtained by labeling the first learning document data.

5. The semiconductor device according to claim 1, further the operations comprising: extracting the plurality of terms using morphological analysis.
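Claims 2 to 4 describe classifying the texts of the second document into a first text and a second text with a model trained on labeled examples. The claims do not name a model, so the sketch below substitutes a small multinomial naive Bayes classifier with add-one smoothing purely for illustration. The labels `"first"` and `"second"`, the training sentences, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Assumed stand-in for morphological analysis: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(labeled_sentences):
    """Fit word and label counts from (sentence, label) pairs."""
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of training sentences
    for sentence, label in labeled_sentences:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(sentence))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(model, sentence):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(label_counts[label] / total)
        for word in tokenize(sentence):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled data: "first" marks text that argues a distinguishing
# feature, "second" marks everything else in a written opinion.
training = [
    ("the claimed invention differs in that the layer contains indium", "first"),
    ("the feature of the invention is the oxide semiconductor layer", "first"),
    ("the applicant respectfully requests reconsideration", "second"),
    ("the examiner cited document 1 in the notice", "second"),
]
model = train(training)
label_1 = classify(model, "the invention differs in the semiconductor layer")
label_2 = classify(model, "the applicant requests reconsideration")
print(label_1, label_2)
```

Sentences labeled `"first"` would then supply the terms that receive the larger weight in the similarity calculation.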
Morphological analysis · CPC title
Document management systems · CPC title
Machine learning · CPC title
Selection or weighting of terms from queries, including natural language queries · CPC title
Parsing · CPC title