Method and system for resume data extraction

US12373794B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12373794-B2
Application number  US-202017596041-A
Country             US
Kind code           B2
Filing date         Dec 10, 2020
Priority date       Dec 10, 2020
Publication date    Jul 29, 2025
Grant date          Jul 29, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for extracting information from a resume of an applicant are provided. The method includes: receiving a resume that relates to an applicant; extracting information that relates to applicant attributes from the resume; comparing the extracted information with a predetermined list of job-specific skills and with a predetermined list of characteristics that relate to soft skills; using the extracted information to determine applicant achievements; and determining at least one skill that corresponds to the applicant. The method may be implemented by using a deep learning technique and/or a natural language processing technique.
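
The comparison step in the abstract can be pictured with a small sketch. The abstract does not say how the comparison is performed, so plain case-insensitive phrase matching stands in here for the deep learning and NLP techniques it mentions. The list contents, the sample resume, and the names `JOB_SKILLS`, `SOFT_SKILL_TRAITS`, `normalize`, and `match_phrases` are all hypothetical.

```python
import re

# Hypothetical predetermined lists; the patent does not enumerate them.
JOB_SKILLS = ["python", "sql", "financial modeling"]
SOFT_SKILL_TRAITS = ["team player", "leadership", "communication"]


def normalize(text: str) -> str:
    """Lowercase the text and keep only word-like tokens, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9+#]+", text.lower()))


def match_phrases(resume_text: str, phrases: list[str]) -> list[str]:
    """Return the phrases that occur as whole-word sequences in the resume."""
    padded = f" {normalize(resume_text)} "
    return [p for p in phrases if f" {normalize(p)} " in padded]


# Invented sample resume text, used only to exercise the functions.
resume = """Led a team of five analysts; strong communication skills.
Built financial-modeling tools in Python and SQL."""

job_hits = match_phrases(resume, JOB_SKILLS)
soft_hits = match_phrases(resume, SOFT_SKILL_TRAITS)
print(job_hits, soft_hits)
```

Exact phrase matching misses paraphrases such as "led a team" for "leadership", which is presumably why the claims rely on sentence embeddings for soft skills rather than literal lookup.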

First claim

Opening claim text (preview).

What is claimed is:

1. A method for extracting information from a resume of an applicant, the method being implemented by at least one processor, the method comprising:
  receiving, by the at least one processor, a first resume that relates to a first applicant;
  using, by the at least one processor, Natural Language Processing (NLP) and machine learning that implements an algorithm for:
  extracting, from the received first resume, first information that relates to at least one applicant attribute, wherein the extracting includes transforming each respective line of the resume into a multi-dimensional vector of numbers, wherein each multi-dimensional vector of numbers is generated by averaging each individual representation for each respective word from the respective line;
  performing, by the at least one processor and based on each respective multi-dimensional vector of numbers, a first analysis to identify at least one prior experience from the extracted first information by using the NLP and the machine learning to identify at least one previous employer of the first applicant from the first resume, wherein the first analysis comprises:
    determining whether each respective line within the first resume is likely to be a header for an experience section by calculating a difference between a similarity of each respective line to job-related headers and to non-job-related headers;
    determining whether the respective line refers to an entity name;
    determining whether the respective line refers to a business entity designation;
    determining whether the respective line or an adjacent line to the respective line refers to at least one from among a time period and a location; and
    identifying, via a decision tree, whether each respective line includes a place of employment based on a result of the determining of whether each respective line within the first resume is likely to be a header for an experience section, a result of the determining of whether the respective line refers to an entity name, a result of the determining of whether the respective line refers to a business entity designation, and a result of the determining of whether the respective line or an adjacent line to the respective line refers to at least one from among a time period and a location;
  performing, by the at least one processor via a sentence-Bidirectional Encoder Representations from Transformers (sentence-BERT) model, a second analysis to identify at least one soft skill from the extracted first information by using the NLP and the machine learning to perform contextual inference on the extracted first information by generating numerical representations of each sentence within the resume and comparing to a predetermined list of numerical representations to identify the at least one soft skill from the first resume, wherein the at least one soft skill describes a personal attribute of the applicant;
  performing, by the at least one processor via a term frequency/inverse document frequency (TF/IDF) technique, a third analysis to identify at least one job-specific skill from the extracted first information by using the NLP and the machine learning to learn a list of skills from a plurality of previously received resumes, comparing the extracted first information with the list of skills, and determining, based on a result of the comparing, the at least one job-specific skill that corresponds to the first applicant; and
  assigning, by the at least one processor via a knowledge graph, a hash value to the resume based on the determined at least one job-specific skill and the identified at least one soft skill.

2. The method of claim 1, further comprising annotating the first resume based on a result of the determining.

3. The method of claim 1, wherein: the identifying of the at least one soft skill that corresponds to the first applicant comprises using a 0.8 cosine similarity threshold with respect to a result of the performing of the second analysis.

4. The method of claim 1, further comprising performing, using the extracted first information, a fourth analysis to determine at least one achievement that relates to the first applicant, wherein the at least one achievement comprises at least one from among an award and a completion of a job-specific task.

5. The method of claim 4, wherein the using of the extracted first information to determine at least one achievement that relates to the first applicant comprises performing a part-of-speech (POS) tagging operation with respect to the extracted first information to determine the at least one achievement.

6. The method of claim 4, further comprising: performing a fifth analysis to predict the applicant's capacity to perform a job based on the extracted first information.

7. The method of claim 1, further comprising: using the knowledge graph to generate background information regarding the identified at least one previous employer, wherein the background information includes an industry identification of the at least one previous employer, and wherein each respective line that is identified as including a place of employment is searched against the knowledge graph to identify a proper entity name and list the generated background information.

8. A computing apparatus for extracting information from a resume of an applicant, the computing apparatus comprising: a processor; a memory; and a communication interface coupled to each of the processor and the memory, wherein the processor is configured to:
  receive, via the communication interface, a first resume that relates to a first applicant;
  use an artificial intelligence (AI) based framework that uses Natural Language Processing (NLP) and machine learning that implements an algorithm to:
  extract, from the received first resume, first information that relates to at least one applicant attribute, wherein the extracting includes transforming each respective line of the resume into a multi-dimensional vector of numbers, wherein each multi-dimensional vector of numbers is generated by averaging each individual representation for each respective word from the respective line;
  perform, based on each respective multi-dimensional vector of numbers, a first analysis to identify at least one prior experience from the extracted first information by using the NLP and the machine learning to identify at least one previous employer of the first applicant from the first resume, wherein the first analysis comprises:
    determining whether each respective line within the first resume is likely to be a header for an experience section by calculating a difference between a similarity of each respective line to job-related headers and to non-job-related headers;
    determining whether the respective line refers to an entity name;
    determining whether the respective line refers to a business entity designation;
    determining whether the respective line or an adjacent line to the respective line refers to at least one from among a time period and a location; and
    identifying, via a decision tree, whether each respective line includes a place of employment based on a result of the determining of whether each respective line within the first resume is likely to be a header for an experience section, a result of the determining of whether the respective line refers to an entity name, a result of the determining of whether the respective line refers to a business entity designation, and a result of the determining of whether the respective line or an adjacent line to the respective line refers to at least one from among a time period and a location;
  perform, via a sentence-Bidirectional Encoder Representations from Transformers (sentence-BERT) model, a second analysis to identify at least one soft skill
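
Two computations recited in the claims lend themselves to a short illustration: building a line vector by averaging per-word vectors, and scoring a line as a likely experience header by the difference between its similarity to job-related headers and to non-job-related headers. The sketch below is not the patented implementation. The 3-dimensional word vectors are invented toy values, taking the maximum similarity over each header set is an assumption (the claim only says "a similarity"), and treating the 0.8 threshold of claim 3 as inclusive is likewise an assumption. All function and variable names are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
WORD_VECS = {
    "work": [0.9, 0.1, 0.0],
    "experience": [0.8, 0.2, 0.0],
    "employment": [0.9, 0.0, 0.1],
    "education": [0.1, 0.9, 0.0],
    "skills": [0.0, 0.2, 0.9],
}


def line_vector(line):
    """Average the vectors of the known words in a line (zero vector if none)."""
    vecs = [WORD_VECS[w] for w in line.lower().split() if w in WORD_VECS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def header_score(line, job_headers, other_headers):
    """Similarity to job-related headers minus similarity to other headers.

    Assumption: each similarity is the maximum over the header set.
    """
    v = line_vector(line)
    job_sim = max(cosine(v, line_vector(h)) for h in job_headers)
    other_sim = max(cosine(v, line_vector(h)) for h in other_headers)
    return job_sim - other_sim


def matches_soft_skill(sentence_vec, reference_vecs, threshold=0.8):
    """True if any reference vector reaches the cosine threshold (assumed inclusive)."""
    return any(cosine(sentence_vec, r) >= threshold for r in reference_vecs)


# Hypothetical header sets; the patent does not list them.
JOB_HEADERS = ["employment"]
OTHER_HEADERS = ["education", "skills"]

work_score = header_score("Work Experience", JOB_HEADERS, OTHER_HEADERS)
edu_score = header_score("Education", JOB_HEADERS, OTHER_HEADERS)
print(f"{work_score:.3f} {edu_score:.3f}")
```

A positive score suggests a job-related header and a negative one suggests some other section. In the claimed method this score is only one of several inputs to a decision tree, and the sentence vectors for soft skills come from a sentence-BERT model rather than from hand-written values.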

Assignees

  • JPMorgan Chase Bank, N.A.

Inventors

Classifications

  • Supervised learning · CPC title

  • Combinations of networks · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Knowledge representation; Symbolic representation · CPC title

  • Employment or hiring · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12373794B2 cover?
Systems and methods for extracting information from a resume of an applicant are provided. The method includes: receiving a resume that relates to an applicant; extracting information that relates to applicant attributes from the resume; comparing the extracted information with a predetermined list of job-specific skills and with a predetermined list of characteristics that relate to soft skill…
Who is the assignee on this patent?
JPMorgan Chase Bank, N.A.
What technology area does this patent fall under?
Primary CPC classification G06Q10/1053. Mapped technology areas include Physics.
When was this patent published?
Published Jul 29, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).