System and method for analysis of structured and unstructured data

US10846341B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10846341-B2
Application number	US-201816159088-A
Country	US
Kind code	B2
Filing date	Oct 12, 2018
Priority date	Oct 13, 2017
Publication date	Nov 24, 2020
Grant date	Nov 24, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The invention relates to computer-implemented systems and methods for analyzing and standardizing various types of input data such as structured data, semi-structured data, unstructured data, and images and voice. Embodiments of the systems and the methods further provide for generating responses to specific questions based on the standardized input data.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for analyzing at least one of structured and unstructured data, the method comprising: receiving at least one specific question and at least one input file to be analyzed, wherein the at least one input file comprises at least one of: text, an image, an audio file, a video file, a table, and a database; and applying an artificial intelligence process to the at least one input file, the artificial intelligence process comprising the steps of: generating, for the at least one input file, a converted file in a data format that is standardized for a plurality of input file types and that includes at least one element; generating the at least one element, wherein the at least one element includes an element identifier and an element type, and is stored in a non-hierarchical relationship format to other elements; generating at least one expression, wherein the expression comprises an expression string that is in a domain-specific language; reading, via a machine review portion of the artificial intelligence process, the at least one expression; applying, via the machine review portion of the artificial intelligence process, the at least one expression to the converted file to generate an output file having an answer to the specific question; and applying the answer to the specific question as feedback to the artificial intelligence process to improve the accuracy of the artificial intelligence process.

2. The method of claim 1, wherein the converted file is configured to interface with a dynamic, interpreted language.

3. The method of claim 1, wherein the converted file is configured to interface with Python.

4. The method of claim 1, wherein the converted file is (i) implemented in Python and includes computer-object representations as Python objects and (ii) serialized as JSON for inter-process communication.

5. The method of claim 1, wherein the converted file is configured for use with JSON, Swagger (YAML), and RESTful.

6. The method of claim 1, wherein the converted file further includes a name of the document, a file path for the document, a file type of the document, and a binary representation of the document.

7. The method of claim 1, wherein the at least one element further includes at least one attribute, wherein the at least one attribute comprises a key-value pair.

8. The method of claim 1, wherein the expression is configured to interface with the format of the converted file.

9. The method of claim 1, wherein the at least one element is generated and stored in a stand-off annotation format in an annotated file, wherein the at least one expression is applied to the annotated file to generate the output file.

10. The method of claim 9, wherein the expression string (i) specifies at least one of a programmatic logical operation and a pattern to search for in the annotated file and (ii) incorporates subject matter expertise for a particular question.

11. The method of claim 1, wherein the domain-specific language is stored in a comma-separated-value file.

12. A system for analyzing at least one of structured and unstructured data, the system comprising: a scanner, wherein the scanner is configured to receive at least one input file to be analyzed, wherein the at least one input file comprises at least one of: text, an image, an audio file, a video file, a table, and a database; and a server, wherein the server is configured to: receive at least one specific question and the scanned at least one input file; apply an artificial intelligence process to the at least one input file; generate, for the at least one input file, a converted file in a data format that is standardized for a plurality of input file types and that includes at least one element; generate the at least one element, wherein the at least one element includes an element identifier and an element type and is stored in a non-hierarchical relationship format to other elements; generate at least one expression, wherein the expression comprises an expression string that is in a domain-specific language; read, via a machine review portion of the artificial intelligence process, the at least one expression; apply, via the machine review portion of the artificial intelligence process, the at least one expression to the converted file to generate an output file having an answer to the specific question; and apply the answer to the specific question as feedback to the artificial intelligence process to improve the accuracy of the artificial intelligence process.

13. The system of claim 12, wherein the server includes an intelligent domain engine (IDE), wherein the IDE is configured to: receive the at least one expression, apply the at least one expression to the at least one input file, and output the answer to the specific question based on the applying.

14. The system of claim 13, wherein the IDE incorporates at least one of natural language processing, machine learning, annotation components, and manually-encoded expressions to classify and analyze the at least one input file.

15. The system of claim 12, wherein the server is further configured to: extract original source data and metadata from the at least one input file, store the extracted original source data in the converted file, generate the at least one element based on a conversion of the extracted metadata, store the generated at least one element in the converted file.

16. The system of claim 15, wherein the metadata is at least one of author information, page information, paragraph information, and font information.

17. The system of claim 15, wherein the extracted metadata is converted with a format-specific parser.

18. The system of claim 12, wherein the server is further configured to perform at least one of entity resolution and semantic annotation on the at least one input file.

19. The system of claim 18, wherein (i) the entity resolution determines a match between data associated with the at least one input file and data associated with at least one ontology and (ii) the semantic annotation connects the data associated with the at least one input file with the data associated with at least one ontology.

20. A system for analyzing at least one of structured and unstructured data, the system comprising: a server, wherein the server is configured to: receive at least one specific question and at least one input file to be analyzed, wherein the at least one input file comprises at least one of: text, an image, an audio file, a video file, a table, and a database; apply an artificial intelligence process to the at least one input file; generate, for the at least one input file, a converted file in a data format that is standardized for a plurality of input file types and that includes at least one element; generate the at least one element, wherein the at least one element includes an element identifier and an element type and is stored in a non-hierarchical relationship format to other elements; generate, by a artificial intelligence operator, at least one expression, wherein the expression comprises an expression string that is in a domain-specific language; read, via a machine review portion of the artificial intelligence process, the at least one expression; apply, via the machine review portion of the artificial intelligence process, the at least one expression to the converted file to generate an output file having an answer to the specific question; and apply the answer to
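
Illustrative sketch (not taken from the patent). The claims describe a converted file holding flat, stand-off elements (claims 1, 7, 9), JSON serialization of Python objects (claim 4), and a domain-specific language kept in a comma-separated-value file (claim 11), but they do not define a schema or the language's syntax. The Python below is one way such a structure might look. Every name in it (ConvertedFile, Element, annotate, answer, the CSV columns, the sample lease text) is a hypothetical chosen for this example, and the binary representation mentioned in claim 6 is left out to keep the JSON simple.

```python
import csv
import io
import json
import re
from dataclasses import dataclass, field, asdict


@dataclass
class Element:
    """One stand-off annotation: it points into the text by character
    offsets instead of being nested inside other elements."""
    element_id: str
    element_type: str
    start: int
    end: int
    attributes: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class ConvertedFile:
    """Hypothetical standardized container; field names are illustrative."""
    name: str
    file_path: str
    file_type: str
    text: str
    elements: list = field(default_factory=list)  # flat list, no parent/child links

    def to_json(self) -> str:
        # asdict() recurses into the Element dataclasses in the list.
        return json.dumps(asdict(self))


def annotate(doc, element_type, pattern):
    """Add one element per regex match, numbering ids sequentially."""
    for match in re.finditer(pattern, doc.text):
        doc.elements.append(Element(
            element_id=f"e{len(doc.elements) + 1}",
            element_type=element_type,
            start=match.start(),
            end=match.end(),
        ))


# Assumed expression table. The row format (question, element_type) is
# invented here; the claims only say the language may be stored as CSV.
EXPRESSIONS_CSV = "question,element_type\nWhat is the lease term?,DURATION\n"


def answer(doc, question):
    """Look up the expression for a question and return matching text spans."""
    rows = csv.DictReader(io.StringIO(EXPRESSIONS_CSV))
    wanted = {row["element_type"] for row in rows if row["question"] == question}
    return [doc.text[e.start:e.end] for e in doc.elements if e.element_type in wanted]


doc = ConvertedFile(
    name="lease.txt",
    file_path="/tmp/lease.txt",
    file_type="text",
    text="The lease term is 36 months, starting in 2019.",
)
annotate(doc, "DURATION", r"\d+ months")
annotate(doc, "YEAR", r"\b\d{4}\b")

print(answer(doc, "What is the lease term?"))   # ['36 months']
restored = json.loads(doc.to_json())            # round trip through JSON
print(restored["elements"][0]["element_type"])  # DURATION
```

Because the elements carry offsets rather than nesting, overlapping annotations of different types can coexist in one flat list, which is one common reading of a "non-hierarchical relationship format".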

Assignees

KPMG LLP

Inventors

Classifications

G06N7/01
Probabilistic graphical models, e.g. probabilistic networks
G06N3/091
Active learning
G06N3/09
Supervised learning
G06V30/416
Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors
G06V30/413
Classification of content, e.g. text, photographs or tables

Patent family

Related publications grouped by family.

View patent family 66097474

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10846341B2 cover?
The invention relates to computer-implemented systems and methods for analyzing and standardizing various types of input data such as structured data, semi-structured data, unstructured data, and images and voice. Embodiments of the systems and the methods further provide for generating responses to specific questions based on the standardized input data.
Who is the assignee on this patent?
KPMG LLP
What technology area does this patent fall under?
The primary CPC classification is G06F16/90332. Mapped technology areas include Physics.
When was this patent published?
The publication date is Nov 24, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).