Populating user contact entries
US-2015358447-A1 · Dec 10, 2015 · US
US10007894B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10007894-B2 |
| Application number | US-201514805631-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 22, 2015 |
| Priority date | Jul 22, 2015 |
| Publication date | Jun 26, 2018 |
| Grant date | Jun 26, 2018 |
Official abstract text for this publication.
A computer processor may extract identity information from a document. The identity information may include at least one custodian identity attribute. After extracting the identity information, the computer processor may determine that the identity information is associated with a specific custodian. The computer processor may then search for the custodian identity attribute in a custodian directory to determine whether the custodian directory contains an entry for the custodian. If the custodian is not in the custodian directory, the computer processor may create a new entry in the custodian directory for the custodian and store the extracted identity information in the new entry.
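The flow in the abstract (extract, search, create if absent) can be illustrated with a minimal sketch. It is not the patented implementation: the regex-based `extract_identity`, the `Entry` class, the `find_or_create` helper, and the choice of an email address as the custodian identity attribute are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Assumption: an email address serves as the custodian identity attribute.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class Entry:
    """Hypothetical directory entry holding identity attributes by field name."""
    attributes: dict = field(default_factory=dict)


def extract_identity(document: str) -> dict:
    """Hypothetical extractor: return the first email found, lowercased."""
    match = EMAIL_RE.search(document)
    return {"email": match.group(0).lower()} if match else {}


def find_or_create(directory: list, identity: dict):
    """Search the directory for any matching attribute; create an entry if none."""
    if not identity:
        return None  # nothing was extracted, so there is nothing to look up
    for entry in directory:
        if any(entry.attributes.get(k) == v for k, v in identity.items()):
            return entry  # custodian already has an entry
    entry = Entry(dict(identity))
    directory.append(entry)  # custodian not found: store the extracted identity
    return entry
```

Calling `find_or_create` twice with identity information extracted from two documents that mention the same address yields a single directory entry under these assumptions.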
Opening claim text (preview).
What is claimed is:

1. A system for maintaining a custodian directory, the system comprising: a memory; and a processor communicatively coupled to the memory, where the processor is configured to perform a method comprising: extracting identity information from a document, the identity information including a custodian identity attribute; and determining that the identity information is associated with a first custodian; searching for the custodian identity attribute in the custodian directory; creating, in response to determining that the first custodian is not in the custodian directory, a new entry for the first custodian in the custodian directory, the new entry including the identity information; updating, in response to determining that the first custodian is in the custodian directory, an entry for the first custodian in the custodian directory using the extracted identity information; and carrying out a cleanup of the custodian directory by: identifying two or more entries in the custodian directory that have at least one matching custodian identity attribute; determining a weighting factor for each field in the custodian directory, wherein the weighting factor for each respective field is based on a likelihood that the custodian identity attribute for the respective field is unique to a single custodian; generating a relationship score for the two or more entries by comparing the identity information in the two or more entries and using the weighting factors, the relationship score being a numeric score that indicates a level of similarity between the two or more entries; determining that the relationship score exceeds a confidence threshold; determining, based on the relationship score exceeding the confidence threshold, that all of the two or more entries in the custodian directory relate to a particular custodian; and merging, in response to determining that all of the two or more entries relate to the particular custodian, the two or more entries in the custodian directory.

2. The system of claim 1, wherein the identity information is a name, and wherein the identifying two or more entries that relate to a particular custodian comprises: identifying a first name in a first entry in the custodian directory; identifying a second name in a second entry in the custodian directory; determining that the first name is an alternative name for the second name.

3. The system of claim 1, wherein the method performed by the processor further comprises: identifying a first entry in the custodian directory; determining, using information in the custodian directory, that the first entry corresponds to a customer; and transmitting, in response to determining that the first entry corresponds to the customer, the first entry to a customer relationship management (CRM) system.

4. The system of claim 1, wherein extracting the identity information includes extracting information from a body of the document using natural language processing and extracting information from metadata of the document, wherein the identity information includes a second custodian identity attribute extracted from the metadata and a third custodian identity attribute extracted from the body of the document, and wherein the method performed by the processor further comprises: determining that the second custodian identity attribute is associated with a second custodian; determining, based on a field of the metadata where the second custodian identity attribute was extracted from and a location in the body of the document that the third custodian identity attribute was extracted from, that the second custodian identity attribute and the third custodian identity attribute are associated with the same custodian; searching for the second custodian identity attribute in the custodian directory; determining, based on the searching for the second custodian identity attribute, that an existing entry exists for the second custodian in the custodian directory; determining a type of custodian identity attribute for the third custodian identity attribute; comparing the third custodian identity attribute to a corresponding field in the existing entry using the type of custodian identity attribute; determining, based on comparing the third custodian identity attribute to the corresponding field, that the third custodian identity attribute does not match a value stored in the corresponding field; and updating, in response to determining that the third custodian identity attribute does not match the value stored in the corresponding field, the existing entry for the second custodian by storing the third custodian identity attribute in the custodian directory.

5. The system of claim 1, wherein the identity information includes a plurality of custodian identity attributes, and wherein determining whether the first custodian is in the custodian directory comprises: comparing each custodian identity attribute of the plurality of custodian identity attributes to fields in the custodian directory; determining that at least one custodian identity attribute of the plurality of custodian identity attributes matches a first value in a first entry in the custodian directory; comparing each custodian identity attribute of the plurality of custodian identity attributes to corresponding fields in the first entry; generating a comparison score for the first entry using fuzzy logic matching; and comparing the comparison score to a threshold, the threshold being a minimum score that a potential match has to obtain to be considered a match, the threshold being automatically determined by the processor based on historical data relating to custodian directory matches.

6. The system of claim 1, wherein extracting the identity information from the document includes: extracting a plurality of custodian identity attributes, wherein a first custodian identity attribute is extracted from metadata of the document and a second custodian identity attribute is extracted from a body of the document; grouping the plurality of custodian identity attributes according to a location in the document from which each custodian identity attribute was extracted, wherein grouping the plurality of custodian identity attributes includes grouping the first and second custodian identity attributes together.

7. A computer program product for maintaining a custodian directory, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instruction executable by a processor to cause the processor to perform a method comprising: extracting identity information from a document, the identity information including a custodian identity attribute; determining that the identity information is associated with a first custodian; searching for the custodian identity attribute in the custodian directory; creating, in response to determining that the first custodian is not in the custodian directory, a new entry for the first custodian in the custodian directory, the new entry including the identity information: updating, in response to determining that the first custodian is in the custodian directory, an entry for the first custodian in the custodian directory using the extracted identity information; and carrying out a cleanup of the custodian directory by: identifying two or more entries in the custodian directory that have at least one matching custodian identity attribute; determining a weighting factor for each field in the custodian directory, wherein the weighting factor for each respective field is based on a likelihood that the custodian identity attribute for the
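
The cleanup step recited in claim 1 (per-field weighting factors, a relationship score, a confidence threshold, then a merge) might be sketched as follows. The weight values, the threshold, the field names, the exact-equality comparison, and the greedy pairwise merge are all illustrative assumptions; the claims do not specify any of them.

```python
# Hypothetical weighting factors: a higher weight means the field's value is
# more likely to be unique to a single custodian (assumed values).
WEIGHTS = {"email": 0.6, "phone": 0.3, "name": 0.1}

# Hypothetical confidence threshold for treating entries as the same custodian.
CONFIDENCE_THRESHOLD = 0.5


def relationship_score(a: dict, b: dict, weights: dict = WEIGHTS) -> float:
    """Sum the weights of fields that both entries hold with equal values."""
    return sum(
        w for f, w in weights.items()
        if f in a and f in b and a[f] == b[f]
    )


def cleanup(directory: list) -> list:
    """Greedily merge entries whose relationship score exceeds the threshold."""
    merged = []
    for entry in directory:
        for i, kept in enumerate(merged):
            if relationship_score(kept, entry) > CONFIDENCE_THRESHOLD:
                # Keep existing values; fill missing fields from the new entry.
                merged[i] = {**entry, **kept}
                break
        else:
            merged.append(dict(entry))  # no sufficiently similar entry found
    return merged
```

With these assumed weights, two entries sharing an email address score 0.6 and are merged, while two entries sharing only a name score 0.1 and stay separate, which reflects the idea that a common name is weaker evidence of identity than a unique address.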
Computer-aided management of electronic mailing [e-mailing] · CPC title
File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title
Parsing · CPC title
Document management systems · CPC title
Human resources · CPC title