Privacy preserving document analysis

US11689507B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11689507-B2
Application number	US-201916695636-A
Country	US
Kind code	B2
Filing date	Nov 26, 2019
Priority date	Nov 26, 2019
Publication date	Jun 27, 2023
Grant date	Jun 27, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interaction in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.
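
As a rough illustration of the idea in the abstract, the sketch below builds a stamp from layout-level counts only, projects it with a stand-in linear encoder, and compares documents by distance in the embedding space. The feature names (`page_count`, `form_field_count`, `image_count`, `has_signature_field`), the `encode` function, and its fixed weights are all hypothetical. The patent describes a stamp encoding model trained with machine learning, which this toy projection does not reproduce.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stamp:
    """Hypothetical stamp: structural features only, no text or images."""
    page_count: int
    form_field_count: int
    image_count: int
    has_signature_field: bool

    def as_vector(self) -> List[float]:
        return [
            float(self.page_count),
            float(self.form_field_count),
            float(self.image_count),
            1.0 if self.has_signature_field else 0.0,
        ]


# Stand-in for a trained stamp encoding model: a fixed 4 -> 2 linear map.
# A real encoder would be learned; these weights are arbitrary assumptions.
WEIGHTS = [
    [0.5, 0.1, 0.0, 1.0],
    [0.0, 0.2, 0.5, 0.0],
]


def encode(stamp: Stamp) -> List[float]:
    """Project a stamp into a 2-D embedding space."""
    vec = stamp.as_vector()
    return [sum(w * x for w, x in zip(row, vec)) for row in WEIGHTS]


def distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between embeddings; smaller means more similar."""
    return math.dist(a, b)


contract_a = Stamp(page_count=4, form_field_count=10, image_count=0, has_signature_field=True)
contract_b = Stamp(page_count=5, form_field_count=12, image_count=0, has_signature_field=True)
brochure = Stamp(page_count=2, form_field_count=0, image_count=8, has_signature_field=False)

# Only stamps are compared; the document contents never enter the computation.
d_contracts = distance(encode(contract_a), encode(contract_b))
d_mixed = distance(encode(contract_a), encode(brochure))
```

With these made-up values, the two contract-like stamps land closer together than the contract and the brochure, which is the property the embedding space is meant to provide.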

First claim

Opening claim text (preview).

What is claimed is:

1. In a digital medium environment for privacy preserving document analysis, a method implemented by at least one computing device, the method comprising: populating, by the at least one computing device, a stamp embedding space by processing a plurality of stamp representations with a trained stamp encoding model to create a plurality of stamp embeddings, each respective one of the plurality of stamp representations corresponding to a respective source document and containing information derived from the respective source document without containing text or images of the respective source document, wherein the plurality of stamp embeddings characterizes features of the plurality of stamp representations in a plurality of numerical values; generating, by the at least one computing device, a plurality of clusters within the stamp embedding space based on locations of the plurality of stamp embeddings within the stamp embedding space, wherein the locations of the plurality of stamp embeddings are based on the plurality of numerical values; receiving, by the at least one computing device, an additional stamp representation; projecting, by the at least one computing device, the additional stamp representation into the stamp embedding space by processing the additional stamp representation with the stamp encoding model to create an additional stamp embedding; and comparing, by the at least one computing device, a location of the additional stamp embedding within the stamp embedding space with the plurality of clusters for use in deriving insights pertaining to a document corresponding to the additional stamp representation.

2. The method of claim 1, wherein the receiving, projecting, and comparing is performed for a plurality of additional stamp representations.

3. The method of claim 2, wherein the plurality of stamp representations are associated with a document corpus, the plurality of additional stamp representations are received from a plurality of client devices, and further comprising: determining a first document distribution with respect to the plurality of clusters for the stamp embeddings associated with the plurality of stamp representations; determining a second document distribution with respect to the plurality of clusters for the stamp embeddings associated with the plurality of additional stamp representations; and adjusting the documents included in the document corpus based on the first and second document distributions.

4. The method of claim 3, wherein the adjusting includes identifying an out-of-distribution document of a type and adding at least one document of the type to the document corpus.

5. The method of claim 1, further comprising retrieving, based on the plurality of clusters, at least one stamp representation of the plurality of stamp representations based on a similarity in the stamp embedding space to the additional stamp representation.

6. The method of claim 5, further comprising retrieving the source document corresponding to the at least one stamp representation, and outputting the source document for display in a user interface.

7. The method of claim 1, further comprising determining, based on the plurality of clusters, a probability of retention of a customer associated with the additional stamp representation.

8. The method of claim 1, wherein the stamp embedding space is configured to represent similarity of documents based on user experience, and further comprising predicting a user satisfaction of a user associated with the additional stamp representation based on the plurality of clusters.

9. The method of claim 1, further comprising determining, based on the plurality of clusters, expectations of a user associated with the additional stamp representation, and tracking the expectations over time.

10. A system comprising: a processor; and computer-readable storage media having stored instructions that, responsive to execution by the processor, cause the processor to perform operations including: populating a stamp embedding space by processing a plurality of stamp representations with a trained stamp encoding model to create a plurality of stamp embeddings, each respective one of the plurality of stamp representations corresponding to a respective source document and containing information derived from the respective source document without containing text or images of the respective source document, wherein the plurality of stamp embeddings characterizes features of the plurality of stamp representations in a plurality of numerical values; generating a plurality of clusters within the stamp embedding space based on locations of the plurality of stamp embeddings within the stamp embedding space, wherein the locations of the plurality of stamp embeddings are based on the plurality of numerical values; receiving an additional stamp representation; projecting the additional stamp representation into the stamp embedding space by processing the additional stamp representation with the stamp encoding model to create an additional stamp embedding; and comparing a location of the additional stamp embedding within the stamp embedding space with the plurality of clusters for use in deriving insights pertaining to a document corresponding to the additional stamp representation.

11. The system of claim 10, wherein the receiving, projecting, and comparing is performed for a plurality of additional stamp representations.

12. The system of claim 11, wherein the plurality of stamp representations are associated with a document corpus, the plurality of additional stamp representations are received from a plurality of client devices, and further comprising: determining a first document distribution with respect to the plurality of clusters for the stamp embeddings associated with the plurality of stamp representations; determining a second document distribution with respect to the plurality of clusters for the stamp embeddings associated with the plurality of additional stamp representations; and adjusting the documents included in the document corpus based on the first and second document distributions.

13. The system of claim 12, wherein the adjusting includes identifying an out-of-distribution document of a type and adding at least one document of the type to the document corpus.

14. The system of claim 10, further comprising retrieving, based on the plurality of clusters, at least one stamp representation of the plurality of stamp representations based on a similarity in the stamp embedding space to the additional stamp representation.

15. The system of claim 14, further comprising retrieving the source document corresponding to the at least one stamp representation, and outputting the source document for display in a user interface.

16. The system of claim 10, further comprising determining, based on the plurality of clusters, a probability of retention of a customer associated with the additional stamp representation.

17. The system of claim 10, wherein the stamp embedding space is configured to represent similarity of documents based on user experience, and further comprising predicting a user satisfaction of a user associated with the additional stamp representation based on the plurality of clusters.

18. The system of claim 10, further comprising determining, based on the plurality of clusters, expectations of a user associated with the additional stamp representation, and tracking the expectations for an amount of time.

19. One or more computer-readable stora
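
The populate, cluster, project, and compare steps of claim 1, together with the distribution comparison of claim 3, can be sketched as follows. This is a minimal illustration under stated assumptions: the claims do not name a clustering algorithm, so plain k-means is used as a stand-in; the embeddings are hard-coded 2-D points rather than outputs of a trained stamp encoding model; and the function names (`kmeans`, `nearest_cluster`, `cluster_distribution`) are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence

Point = Sequence[float]


def nearest_cluster(point: Point, centroids: List[Point]) -> int:
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points: List[Point], initial: List[Point], iterations: int = 10) -> List[Point]:
    """Plain k-means (Lloyd's algorithm) from caller-supplied initial centroids."""
    centroids: List[Point] = [list(c) for c in initial]
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in centroids]
        for p in points:
            groups[nearest_cluster(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(dim) / len(group) for dim in zip(*group)]
    return centroids


def cluster_distribution(points: List[Point], centroids: List[Point]) -> List[float]:
    """Fraction of points assigned to each cluster."""
    counts = Counter(nearest_cluster(p, centroids) for p in points)
    return [counts[i] / len(points) for i in range(len(centroids))]


# Populate: embeddings of the corpus stamps (hard-coded stand-ins).
corpus_embeddings = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]

# Cluster: group the embeddings by location in the embedding space.
centroids = kmeans(corpus_embeddings, initial=[(0.0, 0.0), (6.0, 6.0)])

# Receive, project, compare: place an additional embedding among the clusters.
additional_embedding = (4.8, 5.1)
assigned = nearest_cluster(additional_embedding, centroids)

# Claim 3 idea: compare the corpus distribution with that of incoming stamps.
incoming_embeddings = [(4.8, 5.1), (5.1, 5.0), (4.9, 4.7), (1.1, 0.9)]
corpus_dist = cluster_distribution(corpus_embeddings, centroids)
incoming_dist = cluster_distribution(incoming_embeddings, centroids)
```

In this toy data the incoming stamps concentrate in the second cluster more heavily than the corpus does. A gap of that kind between the two distributions is the sort of signal claim 3 describes using to adjust which documents the corpus contains.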

Assignees

Adobe Inc

Inventors

Classifications

G06N3/0464: Convolutional networks [CNN, ConvNet]
G06N3/09: Supervised learning
G06Q30/0202 (primary): Market predictions or forecasting for commercial activities
G06N5/04: Inference or reasoning models
G06N20/00: Machine learning

Patent family

Related publications grouped by family.

View patent family 75975157

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11689507B2 cover?: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these featu…
Who is the assignee on this patent?: Adobe Inc
What technology area does this patent fall under?: Primary CPC classification G06Q30/0202. Mapped technology areas include Physics.
When was this patent published?: Publication date Jun 27, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Creating and using triplet representations to assess similarity between job description documents

Dynamically optimizing user engagement

Collaborative feature learning from social media

Methods, systems, and media for recommending content items based on topics
