What technology area does this patent fall under?

Primary CPC classification: G06F40/295 (Named entity recognition). The mapped technology area is Physics, which is CPC section G.

When was this patent published?

Publication date: Mar 8, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for extracting entity information from target data

US11270073B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11270073-B2
Application number	US-201816233736-A
Country	US
Kind code	B2
Filing date	Dec 27, 2018
Priority date	Dec 30, 2017
Publication date	Mar 8, 2022
Grant date	Mar 8, 2022
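
The publication number appears on this page in two spellings, US11270073B2 and US-11270073-B2. The sketch below splits either spelling into country, serial number, and kind code. The function name and the regular expression are illustrative assumptions that cover only the spellings shown here; they are not a general validator for patent numbers.

```python
import re

# Assumed pattern: two-letter country, digits, then a kind code such as B2.
# Hyphens between the parts are optional.
PUB_PATTERN = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(text):
    """Split a publication number into country, number, and kind code."""
    match = PUB_PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized publication number: {text!r}")
    country, number, kind = match.groups()
    return {"country": country, "number": number, "kind": kind}


if __name__ == "__main__":
    print(parse_publication_number("US-11270073-B2"))
```

Both spellings yield the same three fields, which makes it easier to join records that use different formatting.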

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a method and a system for extracting entity information from target data. The method comprises: providing the target data; refining the target data to obtain at least one base entity information having a plurality of base entity units using an algorithm, wherein the algorithm is based on a predefined syntax; generating a plurality of strings for each of the base entity information, wherein the plurality of strings comprises at least one base entity unit among the plurality of base entity units; sorting the plurality of strings in a decreasing order of length of the plurality of strings; identifying an entity type of the plurality of strings, based on an ontology, by processing the plurality of strings sequentially; assigning labels to the plurality of strings based on the entity type; and mapping the labelled plurality of strings to a predefined signature to obtain the entity information.
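
The first steps of the abstract (refine, generate strings, sort by decreasing length, identify entity types from an ontology) can be sketched as follows. This is a minimal illustration and not the patented implementation: the stop list, the toy ontology, the function names, and the rule that a unit may belong to only one labelled string are all assumptions made for the example. String length is counted in base entity units.

```python
import re

# Hypothetical stop list and ontology, for illustration only.
STOP_UNITS = {"a", "an", "the", "of", "for", "was", "is", "in", "with"}
ONTOLOGY = {"lung cancer": "DISEASE", "cancer": "DISEASE", "aspirin": "DRUG"}


def refine(target_data):
    """Lowercase the text, split it into units, and drop stop units."""
    units = re.findall(r"[a-z0-9]+", target_data.lower())
    return [u for u in units if u not in STOP_UNITS]


def generate_strings(units):
    """Return every contiguous run of units as a (start, end) span."""
    n = len(units)
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def label_strings(units, ontology):
    """Visit spans longest first and label those the ontology knows."""
    spans = sorted(generate_strings(units),
                   key=lambda s: s[1] - s[0], reverse=True)
    used = set()      # unit positions already covered by a labelled string
    labelled = []
    for start, end in spans:
        if any(i in used for i in range(start, end)):
            continue  # assumption: a unit belongs to at most one string
        text = " ".join(units[start:end])
        if text in ontology:
            labelled.append((start, text, ontology[text]))
            used.update(range(start, end))
    # Return labelled strings in their original left-to-right order.
    return [(text, etype) for _, text, etype in sorted(labelled)]


if __name__ == "__main__":
    units = refine("Aspirin was given for lung cancer")
    print(label_strings(units, ONTOLOGY))
```

Processing the longest strings first means "lung cancer" is labelled as a whole before the shorter string "cancer" is considered, which is one plausible reason for the decreasing-length sort.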

First claim

Opening claim text (preview).

What is claimed is:

1. A method of extracting entity information from target data, wherein the method comprises: providing the target data; refining the target data to obtain a plurality of base entity units, wherein the target data is refined using an algorithm; generating a plurality of strings based on the plurality of base entity units, wherein the plurality of strings comprises one or more base entity unit among the plurality of base entity units; sorting the plurality of strings in a decreasing order of length; processing the sorted plurality of strings sequentially to identify one or more entity types and establish links between the one or more base entity units of the plurality of base entity units, wherein the entity type refers to a specific field to which the base entity unit is associated with, and wherein the entity type and the established units are identified based on an ontology; assigning labels to the one or more entity types; mapping the labelled one or more entity types to a predefined signature, wherein the predefined signature relates to a predefined arrangement of the entity types; processing the plurality of strings with labelled entity type to identify a pattern similar to the predefined signature; and extracting entity information based on the operation of the predefined signature and the plurality of strings.

2. The method of claim 1, wherein the method further comprises classifying the obtained entity information based on the ontology.

3. The method of claim 1, wherein the length of a string corresponds to a number of base entity units in the string.

4. The method of claim 1, wherein the method comprises developing the ontology using at least one curated database by: applying conceptual indexing to plurality of entity units stored in the at least one curated database; identifying semantic associations, between the plurality of entity units, established in the at least one curated database; and identifying at least one class tagged with the plurality of entity units in the at least one curated database.

5. The method of claim 1, wherein the algorithm used in refining the target data comprises at least one of: natural language processing, text analytics and machine learning techniques.

6. The method of claim 1, wherein the refining of the target data comprises removing stock entity units from the at least one base entity information.

7. The method of claim 1, wherein the mapping of the labelled plurality of strings comprises removing entity units stored in a curated English corpus from the at least one base entity information.

8. A system for extracting entity information from target data, wherein the system comprises: a database arrangement operable to store the target data and an ontology; and a processing module communicably coupled to the database arrangement, the processing module operable to: receive the target data; refine the target data to obtain a plurality of base entity units, wherein the target data is refined using an algorithm; generate a plurality of strings based on the plurality of base units, wherein the plurality of strings comprises one or more base entity unit among the plurality of base entity units; sort the plurality of strings in a decreasing order of length; processing the sorted plurality of strings sequentially to identify one or more entity types and establish links between the one or more base entity units of the plurality of base entity units, wherein the entity type refers to a specific field to which the base entity unit is associated with, and wherein the entity type and the established units are identified based on the ontology; assign labels to the one or more entity types; and map the labelled one or more entity types to a predefined signature, wherein the predefined signature relates to a predefined arrangement of the entity types; process the plurality of strings with labelled entity type to identify a pattern similar to the predefined signature; and extract entity information based on the operation of the predefined signature and the plurality of strings.

9. The system of claim 8, wherein the processing module is further operable to classify the obtained entity information based on the ontology.

10. A non-transitory medium, containing program instructions for execution on a computer system, which when executed by a computer, cause the computer to perform method steps for extracting entity information from target data, the method comprising the steps of: providing the target data; refining the target data to obtain a plurality of base entity units, wherein the target data is refined using an algorithm; generating a plurality of strings based on the plurality of base entity units, wherein the plurality of strings comprises one or more base entity unit among the plurality of base entity units; sorting the plurality of strings in a decreasing order of length; processing the sorted plurality of strings sequentially to identify one or more entity types and establish links between the one or more base entity units of the plurality of base entity units, wherein the entity type refers to a specific field to which the base entity unit is associated with, and wherein the entity type and the established units are identified based on an ontology; assigning labels to the one or more entity types; mapping the labelled one or more entity types to a predefined signature, wherein the predefined signature relates to a predefined arrangement of the entity types; processing the plurality of strings with labelled entity type to identify a pattern similar to the predefined signature; and extracting entity information based on the operation of the predefined signature and the plurality of strings.
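
The signature steps in the independent claims describe a predefined arrangement of entity types that is compared against the labelled strings. A minimal sketch of that idea is shown below, assuming the labelled strings arrive as (text, entity type) pairs in document order. The function name, the sample data, and the use of exact matching are assumptions; the claims speak of a pattern "similar to" the signature and do not define the similarity measure.

```python
def match_signature(labelled, signature):
    """Return each run of labelled strings whose types equal the signature.

    labelled  -- list of (text, entity_type) pairs in document order
    signature -- sequence of entity types, e.g. ("DRUG", "DISEASE")
    """
    types = [etype for _, etype in labelled]
    width = len(signature)
    hits = []
    # Slide a window of the signature's width over the type sequence.
    for i in range(len(types) - width + 1):
        if tuple(types[i:i + width]) == tuple(signature):
            hits.append([text for text, _ in labelled[i:i + width]])
    return hits


if __name__ == "__main__":
    # Hypothetical labelled strings, not taken from the patent.
    sample = [
        ("aspirin", "DRUG"),
        ("lung cancer", "DISEASE"),
        ("ibuprofen", "DRUG"),
    ]
    print(match_signature(sample, ("DRUG", "DISEASE")))
```

A real system would likely need a looser comparison (for example, allowing gaps or optional types), which this exact-match window does not attempt.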

Assignees

Innoplexus Ag

Inventors

Classifications

G06F16/951
Indexing; Web crawling techniques · CPC title
G06F40/295 · Primary
Named entity recognition · CPC title
G06F40/30 · Primary
Semantic analysis · CPC title
G06F16/30
of unstructured textual data (document management systems G06F16/93) · CPC title
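
A CPC symbol such as G06F40/295 is hierarchical: a section letter, a two-digit class, a subclass letter, a main group, and a subgroup after the slash. The sketch below splits a symbol into those levels and looks up the section title, which is how G06F40/295 maps to Physics. The function name and the regular expression are illustrative assumptions, and the section titles are abbreviated.

```python
import re

# CPC section letters and abbreviated titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General tagging of new technological developments",
}

# Assumed pattern: section, class digits, subclass letter, group/subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,})$")


def parse_cpc(symbol):
    """Split a CPC symbol into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized CPC symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": group,
        "subgroup": subgroup,
    }


if __name__ == "__main__":
    print(parse_cpc("G06F40/295"))
```

All four classifications listed for this patent share subclass G06F, so they differ only at the main group and subgroup levels.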

Patent family

Related publications grouped by family.

View patent family 61158213

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11270073B2 cover?: Disclosed is a method and a system for extracting entity information from target data. The method comprises: providing the target data; refining the target data to obtain at least one base entity information having a plurality of base entity units using an algorithm, wherein the algorithm is based on a predefined syntax; generating a plurality of strings for each of the base entity information,…
Who is the assignee on this patent?: Innoplexus Ag