What technology area does this patent fall under?

Primary CPC classification G06F16/90332. Mapped technology areas include Physics.

When was this patent published?

Publication date: Dec 27, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for analysis of structured and unstructured data

US11537662B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11537662-B2
Application number	US-202017100019-A
Country	US
Kind code	B2
Filing date	Nov 20, 2020
Priority date	Oct 13, 2017
Publication date	Dec 27, 2022
Grant date	Dec 27, 2022
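
The publication number packs the country and kind code fields into a single string. The following is a minimal sketch of splitting it apart, assuming the two spellings that appear on this page (hyphenated and compact). The names `PublicationNumber` and `parse_publication_number` are hypothetical and are not part of any patentsdb API.

```python
import re
from typing import NamedTuple


class PublicationNumber(NamedTuple):
    country: str  # two-letter country code, e.g. "US"
    number: str   # serial number, kept as a string
    kind: str     # kind code, e.g. "B2"


# Accepts both "US-11537662-B2" and the compact "US11537662B2" form.
_PATTERN = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(text: str) -> PublicationNumber:
    """Split a publication number into country, serial number, and kind code."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    return PublicationNumber(*match.groups())


print(parse_publication_number("US-11537662-B2"))
```

Keeping the serial number as a string avoids losing leading zeros in numbering schemes that use them.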

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The invention relates to computer-implemented systems and methods for analyzing and standardizing various types of input data such as structured data, semi-structured data, unstructured data, and images and voice. Embodiments of the systems and the methods further provide for generating responses to specific questions based on the standardized input data.

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for analyzing at least one of structured and unstructured data, the method comprising: identifying at least one question and at least one input file to be analyzed, wherein the at least one input file comprises at least one of: text, an image, an audio file, a video file, a table, and a database; and applying an artificial intelligence process to the at least one input file, the artificial intelligence process comprising the steps of: generating, for the at least one input file, a converted file in a data format that is standardized for a plurality of input file types and that includes at least one element; wherein the at least one element is associated with an element identifier and an element type, and is stored in a non-hierarchical relationship format; applying a specific ontology to the converted file to perform semantic annotation to the converted file; generating, based on the semantic annotation, at least one expression, the at least one expression comprising one or more of specific words, relationships between specific words, and word patterns that identify specific content in a converted file, wherein the expression comprises an expression string in a domain-specific language; reading, via a machine review portion of the artificial intelligence process, the at least one expression; and applying, via the machine review portion of the artificial intelligence process, the at least one expression to the converted file to automatically generate a response to the question; and applying the answer to the at least one question as feedback to the artificial intelligence process to improve the accuracy of the artificial intelligence process.
2. The method of claim 1, wherein the data format represents extracted data from the at least one input file and corresponding metadata.
3. The method of claim 1, wherein the at least one element is stored in an annotation format where the at least one element is stored separately from the at least one input file.
4. The method of claim 1, wherein the at least one expression specifies one or more words, a relationship between the one or more words and at least one pattern that identifies document features.
5. The method of claim 1, wherein the at least one expression represents one or more features to be utilized and one or more patterns of the features to be identified.
6. The method of claim 1, wherein the at least one expression is an input to an intelligent domain engine (IDE) that leverages natural language processing to systematically classify and analyze a corpus of documents.
7. The method of claim 6, wherein the intelligent domain engine further comprises a user interface to enable a user to modify the at least one expression.
8. The method of claim 1, wherein the response to the question is communicated via a user interface.
9. The method of claim 8, wherein the user interface displays support and justification associated with the response.
10. A system for analyzing at least one of structured and unstructured data, the system comprising: a scanner configured to receive at least one input file to be analyzed, wherein the at least one input file comprises at least one of: text, an image, an audio file, a video file, a table, and a database; and a server, wherein the server is configured to: identify at least one question and the scanned at least one input file; apply an artificial intelligence process to the at least one input file; generate, for the at least one input file, a converted file in a data format that is standardized for a plurality of input file types and that includes at least one element; wherein the at least one element is associated with an element identifier and an element type and is stored in a non-hierarchical relationship format; apply a specific ontology to the converted file to resolve entities and perform semantic annotation, the entity resolution comprising one or more determinations of whether entities detected in the converted file refer to one or more real-world entities, and the semantic annotation comprising relating one or more phrases in the converted file to one or more concepts formally defined in the specific ontology; generate at least one expression, the at least one expression comprising one or more of specific words, relationships between specific words, and word patterns that identify specific content in a converted file, wherein the expression comprises an expression string in a domain-specific language; read, via a machine review portion of the artificial intelligence process, the at least one expression; and apply, via the machine review portion of the artificial intelligence process, the at least one expression to the converted file to automatically generate a response to the question; and apply the answer to the at least one question as feedback to the artificial intelligence process to improve the accuracy of the artificial intelligence process.
11. The system of claim 10, wherein the data format represents extracted data from the at least one input file and corresponding metadata.
12. The system of claim 10, wherein the at least one element is stored in an annotation format where the at least one element is stored separately from the at least one input file.
13. The system of claim 10, wherein the at least one expression specifies one or more words, a relationship between the one or more words and at least one pattern that identifies document features.
14. The system of claim 10, wherein the at least one expression represents one or more features to be utilized and one or more patterns of the features to be identified.
15. The system of claim 10, wherein the at least one expression is an input to an intelligent domain engine (IDE) that leverages natural language processing to systematically classify and analyze a corpus of documents.
16. The system of claim 15, wherein the intelligent domain engine further comprises a user interface to enable a user to modify the at least one expression.
17. The system of claim 10, wherein the response to the question is communicated via a user interface.
18. The system of claim 17, wherein the user interface displays support and justification associated with the response.
19. A system for analyzing at least one of structured and unstructured data, the system comprising: a server, wherein the server is configured to: identify at least one question and at least one input file to be analyzed, wherein the at least one input file comprises at least one of: text, an image, an audio file, a video file, a table, and a database; apply an artificial intelligence process to the at least one input file; generate, for the at least one input file, a converted file in a data format that is standardized for a plurality of input file types and that includes at least one element; wherein the at least one element is associated with an element identifier and an element type and is stored in a non-hierarchical relationship format; apply a specific ontology to the converted file to resolve entities and perform semantic annotation, the entity resolution comprising one or more determinations of whether entities detected in the converted file refer to one or more real-world entities, and the semantic annotation comprising relating one or more phrases in the converted file to one or more concepts formally defined in the specific ontology; generate, by an artificial intelligence operator, at least one expression, the at least one expression compris
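
The data flow recited in claim 1 (flat elements with an identifier and a type, annotations kept apart from the elements, and an expression applied to produce a response) can be pictured with a toy sketch. Everything here is an illustrative assumption and not the patented implementation: the class names, the dictionary-based `ontology`, and the sample text are hypothetical, and a plain regular expression stands in for the domain-specific expression language, whose syntax this page does not show. Entity resolution and the feedback step are not modeled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    element_id: str    # the "element identifier" of claim 1
    element_type: str  # the "element type" of claim 1
    text: str


@dataclass(frozen=True)
class Annotation:
    element_id: str  # refers to an element by id instead of nesting inside it
    concept: str     # hypothetical ontology concept label


def annotate(elements, ontology):
    """Relate phrases to concepts; results live in a separate flat list."""
    found = []
    for element in elements:
        for phrase, concept in ontology.items():
            if phrase.lower() in element.text.lower():
                found.append(Annotation(element.element_id, concept))
    return found


def apply_expression(expression, elements, annotations, concept):
    """Search only elements annotated with `concept`; return (id, match) pairs."""
    wanted = {a.element_id for a in annotations if a.concept == concept}
    pattern = re.compile(expression, re.IGNORECASE)
    answers = []
    for element in elements:
        if element.element_id in wanted:
            match = pattern.search(element.text)
            if match:
                answers.append((element.element_id, match.group(1)))
    return answers


# A "converted file": one flat list of typed elements, whatever the input type.
elements = [
    Element("e1", "paragraph", "This Lease Agreement is made between the parties."),
    Element("e2", "paragraph", "The lease term is 36 months from the start date."),
    Element("e3", "table_cell", "Monthly rent: 1,200"),
]
ontology = {"lease term": "LeaseTerm", "rent": "Payment"}  # assumed toy ontology

annotations = annotate(elements, ontology)
# Question: "How long is the lease?" answered by a regex stand-in expression.
response = apply_expression(r"(\d+)\s+months", elements, annotations, "LeaseTerm")
print(response)
```

Returning the element identifier alongside each match mirrors the idea in claims 9 and 18 of showing support for a response, since the answer can be traced back to the element it came from.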

Assignees

KPMG LLP

Inventors

Classifications

Code	CPC title
G06V30/413	Classification of content, e.g. text, photographs or tables
G06F40/14	Tree-structured documents (parsing G06F40/205; validation G06F40/226)
G06F16/116	Details of conversion of file system types or formats
G06V30/416	Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors
G06F40/169	Annotation, e.g. comment data or footnotes
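
A CPC symbol is hierarchical: a section letter, a two-digit class, a subclass letter, a main group, and a subgroup after the slash. Section G is titled "Physics", which is why the primary classification G06F16/90332 maps to that technology area. The sketch below splits a symbol into those levels; the function name `split_cpc` and the returned dictionary keys are hypothetical, and the regular expression assumes the compact spelling used on this page.

```python
import re

# Top-level CPC section titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# section, class digits, subclass letter, main group, "/", subgroup
_CPC = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def split_cpc(symbol: str) -> dict:
    """Break a CPC symbol such as 'G06F16/90332' into its hierarchy levels."""
    match = _CPC.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


print(split_cpc("G06F16/90332"))
```

Grouping by the `subclass` key is one simple way to see that the symbols listed above fall into just two subclasses, G06F and G06V.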

Patent family

Related publications grouped by family.

View patent family 66097474

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11537662B2 cover?: The invention relates to computer-implemented systems and methods for analyzing and standardizing various types of input data such as structured data, semi-structured data, unstructured data, and images and voice. Embodiments of the systems and the methods further provide for generating responses to specific questions based on the standardized input data.
Who is the assignee on this patent?: KPMG LLP

Related patents

Domain-specific stopword removal from unstructured computer text using a neural network

Systems and methods for structuring data from unstructured electronic data files

Bootstrapping Knowledge Acquisition from a Limited Knowledge Domain

Evaluating Temporal Relevance in Question Answering

Product recommendation with ontology-linked product review

Question answering from structured and unstructured data sources
