Enhanced document input parsing

US9418066B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9418066-B2
Application number: US-201313928642-A
Country: US
Kind code: B2
Filing date: Jun 27, 2013
Priority date: Jun 27, 2013
Publication date: Aug 16, 2016
Grant date: Aug 16, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An approach is provided for an information handling system that includes a processor and a memory to analyze documents. In the approach, an electronic document is received with the document including content, such as text, and revision metadata that is associated with the content. The revision metadata is analyzed and the approach identifies a confidence level based on the analysis. The confidence level is associated with the electronic document content. The confidence level can then be utilized by a Question and Answer (QA) system.
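
The data flow in the abstract can be sketched roughly as follows: a document carries content plus revision metadata, the metadata is analyzed, and the resulting confidence level is attached to the content. This is only an illustrative sketch. The class names, the revision labels, and the `score_revisions` heuristic are assumptions made for the example; the abstract does not specify a scoring formula.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    """One entry of revision metadata (hypothetical shape)."""
    author: str
    kind: str  # assumed labels, e.g. "comment", "edit", "approval"


@dataclass
class Document:
    """An electronic document: content plus its revision metadata."""
    content: str
    revisions: List[Revision] = field(default_factory=list)
    confidence: Optional[float] = None  # filled in after analysis


def score_revisions(revisions: List[Revision]) -> float:
    """Toy heuristic (an assumption, not the patented method):
    start at 0.5 and add 0.1 per distinct reviewing author, capped at 1.0."""
    reviewers = {r.author for r in revisions}
    return min(1.0, 0.5 + 0.1 * len(reviewers))


def analyze(doc: Document) -> Document:
    """Analyze the revision metadata and associate the result with the content."""
    doc.confidence = score_revisions(doc.revisions)
    return doc
```

A downstream QA system could then read `doc.confidence` when weighing passages drawn from `doc.content`.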

First claim

Opening claim text (preview).

What is claimed is:

1. An information handling system comprising: one or more processors; a memory coupled to at least one of the processors; a network adapter that connects the information handling system to a computer network; and a set of instructions stored in the memory and executed by at least one of the processors to analyze documents, wherein the set of instructions perform actions of: receiving an electronic document that includes a content and a revision metadata associated with the content, wherein the revision metadata has been added to the electronic document in response to one or more revision authors reviewing the content; selecting at least one of a plurality of revisions in the revision metadata; identifying at least one of the one or more revision authors associated with the selected revision; collecting expertise-related data pertaining to the identified revision author, wherein the collecting is obtained from a plurality of network sources; identifying a topic area associated with the electronic document; determining a revision author expertise level associated with the identified revision author based on the collected expertise-related data and an expertise of the identified revision author in the identified topic area; identifying at least one endorser of the identified revision author; collecting endorsement data pertaining to the endorser; determining an endorser expertise based on the collected endorsement data and the identified topic area; adjusting the revision author expertise level based on the endorser expertise; identifying a confidence level of the electronic document content based on the adjusted revision author expertise level; and associating the confidence level with the electronic document content.

2. The information handling system of claim 1 wherein the actions further comprise: parsing the electronic document into a plurality of sections; and analyzing each of the plurality of sections based on the revision metadata associated with the section of the electronic document.

3. The information handling system of claim 1 wherein the actions further comprise: identifying a revision type associated with the selected revision, wherein the confidence level is based on the revision type.

4. The information handling system of claim 1 wherein the actions further comprise: identifying at least one document author associated with the electronic document; collecting different expertise-related data pertaining to the identified document author from a plurality of different network sources; and determining a document author expertise level associated with the identified document author based on the collected different expertise-related data, wherein the confidence level is based on the document author expertise level.

5. A computer program product stored in a non-transitory computer readable storage medium, comprising computer instructions that, when executed by an information handling system, causes the information handling system to analyze documents by performing actions comprising: receiving an electronic document that includes a content and a revision metadata associated with the content, wherein the revision metadata has been added to the electronic document in response to one or more revision authors reviewing the content; selecting at least one of a plurality of revisions in the revision metadata; identifying at least one of the one or more revision authors associated with the selected revision; collecting expertise-related data pertaining to the identified revision author, wherein the collecting is obtained from a plurality of network sources; identifying a topic area associated with the electronic document; determining a revision author expertise level associated with the identified revision author based on the collected expertise-related data and an expertise of the identified revision author in the identified topic area; identifying at least one endorser of the identified revision author; collecting endorsement data pertaining to the endorser; determining an endorser expertise based on the collected endorsement data and the identified topic area; adjusting the revision author expertise level based on the endorser expertise; identifying a confidence level of the electronic document content based on the adjusted revision author expertise level; and associating the confidence level with the electronic document content.

6. The computer program product of claim 5 wherein the actions further comprise: parsing the electronic document into a plurality of sections; and analyzing each of the plurality of sections based on the revision metadata associated with the section of the electronic document.

7. The computer program product of claim 5 wherein the actions further comprise: identifying a revision type associated with the selected revision, wherein the confidence level is based on the revision type.

8. The computer program product of claim 5 wherein the actions further comprise: identifying at least one document author associated with the electronic document; collecting different expertise-related data pertaining to the identified document author from a plurality of different network sources; and determining a document author expertise level associated with the identified document author based on the collected different expertise-related data, wherein the confidence level is based on the document author expertise level.
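
The expertise steps recited in claims 1 and 5 (author expertise in a topic area, adjusted by endorser expertise, then rolled up into a confidence level) might look something like the sketch below. Everything concrete here is an assumption for illustration: the in-memory `EXPERTISE` and `ENDORSERS` tables stand in for data that the claims say is collected from network sources, and the blending formula and the `weight` parameter are invented, since the claims do not state how the adjustment is computed.

```python
from typing import Dict, List

# Hypothetical expertise data: person -> topic area -> score in [0, 1].
# In the claims this would be collected from a plurality of network sources.
EXPERTISE: Dict[str, Dict[str, float]] = {
    "alice": {"cardiology": 0.6},
    "bob": {"cardiology": 0.9},
    "carol": {"tax law": 0.8},
}

# Hypothetical endorsement data: revision author -> people who endorse them.
ENDORSERS: Dict[str, List[str]] = {"alice": ["bob", "carol"]}


def expertise_in(person: str, topic: str) -> float:
    """Expertise of a person in the identified topic area (0.0 if unknown)."""
    return EXPERTISE.get(person, {}).get(topic, 0.0)


def adjusted_author_expertise(author: str, topic: str, weight: float = 0.5) -> float:
    """Adjust a revision author's expertise level by endorser expertise."""
    base = expertise_in(author, topic)
    endorsers = ENDORSERS.get(author, [])
    if not endorsers:
        return base
    # Endorsers count only to the extent they are expert in the same topic.
    endorser_score = sum(expertise_in(e, topic) for e in endorsers) / len(endorsers)
    # Assumed formula: close part of the gap to 1.0, scaled by endorser expertise.
    return min(1.0, base + weight * endorser_score * (1.0 - base))


def document_confidence(revision_authors: List[str], topic: str) -> float:
    """Confidence level for the content: mean adjusted expertise of its revisers."""
    if not revision_authors:
        return 0.0
    scores = [adjusted_author_expertise(a, topic) for a in revision_authors]
    return sum(scores) / len(scores)
```

In this toy setup an endorsement from `carol`, whose expertise is in an unrelated topic, adds nothing to a cardiology reviser's score, which mirrors the claim language tying endorser expertise to the identified topic area. The per-section analysis of claims 2 and 6 could reuse `document_confidence` once per section, with each section's own list of revision authors.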

Assignees

IBM

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Editing, e.g. inserting or deleting · CPC title

  • Parsing · CPC title

  • G06F16/93 · Primary

    Document management systems · CPC title

  • Natural language query formulation · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9418066B2 cover?
An approach is provided for an information handling system that includes a processor and a memory to analyze documents. In the approach, an electronic document is received with the document including content, such as text, and revision metadata that is associated with the content. The revision metadata is analyzed and the approach identifies a confidence level based on the analysis. The confide…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/93. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 16, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).