Automatic electronic message content extraction method and apparatus

US12222973B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12222973-B2
Application number: US-202318309788-A
Country: US
Kind code: B2
Filing date: Apr 29, 2023
Priority date: Feb 11, 2019
Publication date: Feb 11, 2025
Grant date: Feb 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are systems and methods for improving interactions with and between computers in electronic messaging, and other, systems supported by or configured with personal computing devices, servers and/or platforms. The systems interact to identify and retrieve data within or across platforms, which can be used to improve the quality of data used in processing interactions between or among processors in such systems. The disclosed systems and methods provide systems and methods for automatically generating data extraction rules, which can then be used to automatically extract data from electronic messages.
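The flow summarized in the abstract (generate an extraction rule from a group of similar messages, then apply it to a new message) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes each message has already been reduced to a mapping from an expression (for example an XPath string) to the text found at that location. The function names and the `toy_annotate` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def find_variable_expressions(messages):
    """Return the expressions whose values differ across at least two messages."""
    values = defaultdict(set)
    for message in messages:
        for expression, value in message.items():
            values[expression].add(value)
    return {expr for expr, seen in values.items() if len(seen) > 1}


def generate_rule(messages, annotate):
    """Build a rule that maps each variable expression to an annotation."""
    rule = {}
    for expression in sorted(find_variable_expressions(messages)):
        samples = [m[expression] for m in messages if expression in m]
        rule[expression] = annotate(samples)
    return rule


def extract(rule, message):
    """Apply a rule to a new message and return annotation -> extracted value."""
    return {
        annotation: message[expression]
        for expression, annotation in rule.items()
        if expression in message
    }


def toy_annotate(samples):
    """Hypothetical annotator: label values that all start with '$' as prices."""
    return "price" if all(s.startswith("$") for s in samples) else "text"


# Two messages from the same template: only the amount varies.
messages = [
    {"/html/body/h1": "Your receipt", "/html/body/p[1]": "$12.50"},
    {"/html/body/h1": "Your receipt", "/html/body/p[1]": "$7.00"},
]
rule = generate_rule(messages, toy_annotate)

# A newly received message with the same structure.
new_message = {"/html/body/h1": "Your receipt", "/html/body/p[1]": "$3.25"}
extracted = extract(rule, new_message)
```

In this sketch the heading is constant across messages, so it is ignored, while the paragraph that varies becomes the single entry in the rule. A fuller version would need to handle two expressions that receive the same annotation, which this dictionary-based `extract` would collapse into one entry.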

First claim

Opening claim text (preview).

The invention claimed is:
1. A method comprising: obtaining, via a computing device, a plurality of electronic messages having an associated set of shared expressions; identifying, via the computing device, for the plurality of electronic messages, a variable value in connection with a respective expression in the set of shared expressions, the variable value indicating that at least a portion of a value corresponding to the respective expression varies across two or more of the plurality of electronic messages; analyzing, via the computing device, the variable value corresponding to multiple electronic messages of the plurality of electronic messages to determine an annotation indicative of a meaning of the variable value; automatically generating, via the computing device and for the plurality of electronic messages, a data extraction rule comprising the annotation associated with the respective expression having the variable value; receiving, via the computing device, over an electronic communications network, an electronic message; using, via the computing device, the automatically-generated data extraction rule to extract data from the received electronic message; and causing, via the computing device, the data extracted from the received electronic message to be displayed at a user device.
2. A method of claim 1, further comprising: communicating, via the computing device, the automatically extracted data to the user device.
3. A method of claim 2, wherein the communicating is responsive to a data extraction rule creation request received from the user.
4. A method of claim 1, wherein the causing further comprises: communicating, via the computing device, the annotation associated with the respective expression to the user device, communication of the annotation causing the annotation to be displayed along with the extracted data in an electronic message information summary display of the user device.
5. The method of claim 1, further comprising communicating at least some of the extracted data and the annotation to a search engine for use in generating a search index for use in electronic message searching.
6. The method of claim 1, further comprising communicating at least some of the extracted data and annotation to a recommendation system for use in determining at least one interest of a user of the recommendation system, the at least one interest of the user for use in making at least one recommendation to the user.
7. The method of claim 6, the recommendation system comprises an advertising content recommendation system, and the at least one interest is for use in determining advertising content for the user.
8. A method of claim 1, further comprising: using a value corresponding to at least one other expression of the set of shared expressions to determine the meaning of the variable value.
9. A method of claim 1, wherein each expression in the set of shared expressions is an XPATH expression.
10. The method of claim 1, using the automatically-generated data extraction rule to extract data from a received electronic message further comprising: associating, via the computing device, the variable value for the respective expression with the associated annotation indicative of the meaning of the variable value.
11. The method of claim 1, wherein the plurality of electronic messages have a common sender domain.
12. The method of claim 1, determining the annotation for the variable value of the respective expression further comprising: determining, by searching a dictionary using some or all of the variable value, that the dictionary includes at least a portion of the variable value, the dictionary having an associated annotation; and using the associated annotation of the dictionary as the annotation for the variable value.
13. The method of claim 1, determining the annotation for the variable value of the respective expression further comprising: determining, using a pattern recognition analyzer, a pattern of at least a portion of the variable value; and using an associated annotation of the pattern as the annotation for the variable value.
14. The method of claim 1, further comprising automatically refining the annotation associated with the respective expression, the refining comprising: generating, via the computing device, training data across the set of shared expressions associated with the plurality of electronic messages, the training data comprising a plurality of training examples, each training example comprising a set of features and corresponding feature values; training, via the computing device and using machine learning, an annotation refinement model for use in refining the annotation; generating, via the computing device, a set of features for the annotation; and using the annotation refinement model trained with the set of features generated for the annotation to refine the annotation associated with the respective expression.
15. The method of claim 1, automatically extracting data from the electronic message further comprising: using the respective expression to retrieve the variable value from the received electronic message.
16. The method of claim 1, wherein the plurality of electronic messages share a digital signature determined using the set of shared expressions.
17. The method of claim 16, further comprising: before automatically extracting data from the received electronic message: determining a digital signature for the received electronic message; and determining, via the computing device, that the digital signature determined for the received electronic message matches the digital signature shared by the plurality of electronic messages.
18. The method of claim 17, further comprising: determining that a sender domain of the received electronic message matches a sender domain shared by the plurality of electronic messages.
19. A non-transitory computer-readable storage medium tangibly encoded with computer-executable instructions that when executed by a processor associated with a computing device perform a method comprising: obtaining a plurality of electronic messages having an associated set of shared expressions; identifying for the plurality of electronic messages, a variable value in connection with a respective expression in the set of shared expressions, the variable value indicating that at least a portion of a value corresponding to the respective expression varies across two or more of the plurality of electronic messages; analyzing the variable value corresponding to multiple electronic messages of the plurality of electronic messages to determine an annotation indicative of a meaning of the variable value; automatically generating, for the plurality of electronic messages, a data extraction rule comprising the annotation associated with the respective expression having the variable value; receiving, over an electronic communications network, an electronic message; using the automatically-generated data extraction rule to extract data from a received electronic message; and causing the data extracted from the received electronic message to be displayed at a user device.
20. A computing device comprising: a processor; and a non-transitory storage medium for tangibly storing thereon program logic for execution by the processor, the program logic comprising: obtaining logic executed by the processor for obtaining a plurality of electronic messages having an associated set of shared expressions; iden
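The dependent claims name two ways to determine an annotation (a dictionary lookup in claim 12 and a pattern recognition analyzer in claim 13) and a pre-check before extraction (a matching digital signature and sender domain in claims 16 to 18). The Python sketch below shows one way these pieces could fit together. It is an illustration under stated assumptions, not the method the patent discloses: the dictionary contents, the regular expressions, and the use of a SHA-256 hash over the sorted expressions as the "digital signature" are all hypothetical choices.

```python
import hashlib
import re

# Hypothetical dictionary and patterns, each keyed by the annotation it implies.
DICTIONARIES = {"airport": {"SFO", "JFK", "LHR"}}
PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "price": re.compile(r"\$\d+(?:\.\d{2})?"),
}


def annotate(samples):
    """Return an annotation shared by every sample value, or None."""
    if not samples:
        return None
    # Dictionary lookup: some token of each value must appear in the dictionary.
    for annotation, entries in DICTIONARIES.items():
        if all(any(tok in entries for tok in s.split()) for s in samples):
            return annotation
    # Pattern recognition: some portion of each value must match the pattern.
    for annotation, pattern in PATTERNS.items():
        if all(pattern.search(s) for s in samples):
            return annotation
    return None


def signature(expressions):
    """Assumed signature: a hash of the sorted, de-duplicated expressions."""
    joined = "\n".join(sorted(set(expressions)))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def rule_applies(rule_signature, rule_domain, message_expressions, sender_domain):
    """Check signature and sender domain before applying an extraction rule."""
    return (
        signature(message_expressions) == rule_signature
        and sender_domain == rule_domain
    )


# Signature shared by the group of messages the rule was generated from.
group_signature = signature(["/html/body/h1", "/html/body/p[1]"])
```

Because the assumed signature depends only on which expressions are present, two messages with the same structure but different values produce the same signature, which is the property the pre-check relies on.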

Assignees

Inventors

Classifications

  • G06F16/345 · Primary

    Summarisation for human users · CPC title

  • Querying · CPC title

  • Machine learning · CPC title

  • Inference or reasoning models · CPC title

  • G06F16/35 · Primary

    Clustering; Classification · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12222973B2 cover?
Disclosed are systems and methods for improving interactions with and between computers in electronic messaging, and other, systems supported by or configured with personal computing devices, servers and/or platforms. The systems interact to identify and retrieve data within or across platforms, which can be used to improve the quality of data used in processing interactions between or among pr…
Who is the assignee on this patent?
Yahoo Assets LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/345. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 11, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).