Systems and methods for removal of attributes from multi-modality and multi-attribute data

US2024104070A1 · US · A1

Patent metadata
Publication number: US-2024104070-A1
Application number: US-202217934935-A
Country: US
Kind code: A1
Filing date: Sep 23, 2022
Priority date: Sep 23, 2022
Publication date: Mar 28, 2024
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for removing uninterested attributes from multi-modality data may include: receiving, by a multi-modality attribute removal computer program executed by an electronic device, multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving, by the multi-modality attribute removal computer program, an uninterested attribute in the multi-modality data to remove; training, by the multi-modality attribute removal computer program, a modality-focused encoder for each modality of the multi-modality data to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; receiving, by the multi-modality attribute removal computer program, a multi-modality data set for processing; and processing, by the multi-modality attribute removal computer program, the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.
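The abstract names a removal loss and a retention loss without giving formulas; claim 2 on this page adds only that the removal loss comprises a cosine distance and the retention loss comprises an L2 norm. The sketch below shows those two quantities on plain Python lists, as an illustration rather than the patent's implementation. Which vectors are compared, the sign of each term, and the `weight` used to combine them are assumptions, and the function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def l2_distance(u, v):
    """Return the L2 norm of the difference between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def combined_objective(removal_term, retention_term, weight=1.0):
    """Hypothetical weighted sum of the two terms (weight is an assumption)."""
    return removal_term + weight * retention_term


# Toy vectors standing in for feature embeddings (made-up values).
original = [1.0, 0.0]
encoded = [0.0, 1.0]
removal = cosine_distance(original, encoded)
retention = l2_distance(original, encoded)
total = combined_objective(removal, retention, weight=0.5)
```

A per-modality version would compute one such pair of terms for each modality-focused encoder, since the abstract says the losses are "for the respective modality".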

First claim

Opening claim text (preview).

What is claimed is:

1. A method for removing uninterested attributes from multi-modality data, comprising: receiving, by a multi-modality attribute removal computer program executed by an electronic device, multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving, by the multi-modality attribute removal computer program, an uninterested attribute in the multi-modality data to remove; training, by the multi-modality attribute removal computer program, a modality-focused encoder for each modality of the multi-modality data to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; receiving, by the multi-modality attribute removal computer program, a multi-modality data set for processing; and processing, by the multi-modality attribute removal computer program, the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.

2. The method of claim 1, wherein the removal loss comprises a cosine distance, and the retention loss comprises a L2 norm.

3. The method of claim 1, further comprising: pre-training, by the multi-modality attribute removal computer program, a single modality reidentification classifier for each modality of the multi-modality data and a multi-modality reidentification classifier for the multi-modality data to remove the uninterested attribute resulting in removal losses and retention losses.

4. The method of claim 3, wherein the removal losses represent a loss associated with reidentification of the uninterested attribute in outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier, and the retention losses represent a loss of utility of retained data in the outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier.

5. The method of claim 3, wherein the single modality reidentification classifiers and the multi-modality reidentification classifier are pre-trained by with cross-entropy loss and back propagation with Stochastic Gradient Descent (SGD).

6. The method of claim 1, wherein the step of training the modality-focused encoder for each modality of the multi-modality data using the removal loss and the retention loss for the respective modality comprises: receiving, by the multi-modality attribute removal computer program, a plurality of additional multi-modality data sets; and receiving, by the multi-modality attribute removal computer program, an identification of an uninterested attribute in the multi-modality data to conceal and an interested attribute in the multi-modality data to retain; wherein the modality-focused encoders are trained using the removal loss and the retention loss for the respective modality, the plurality of additional multi-modality data sets, the uninterested attribute, and the interested attribute.

7. A system, comprising: a multi-modality data source comprising multi-modality data comprising a, wherein data in each modality are related; and a multi-modality attribute removal computer program comprising a modality-focused encoder for each modality in the multi-modality data, a modality reidentification classifier for each modality in the multi-modality data, and a multi-modality reidentification classifier; wherein: the multi-modality attribute removal computer program receives the multi-modality data from the multi-modality data source; the multi-modality attribute removal computer program receives an uninterested attribute in the multi-modality data to remove; the multi-modality attribute removal computer program trains the modality-focused encoders to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; the multi-modality attribute removal computer program receives a multi-modality data set for processing; and the multi-modality attribute removal computer program processes the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.

8. The system of claim 7, wherein the removal loss comprises a cosine distance, and the retention loss comprises a L2 norm.

9. The system of claim 7, wherein the multi-modality attribute removal computer program pre-trains a single modality reidentification classifier for each modality of the multi-modality data and a multi-modality reidentification classifier for the multi-modality data to remove the uninterested attribute resulting in removal losses and retention losses.

10. The system of claim 9, wherein the removal losses represent a loss associated with reidentification of the uninterested attribute in outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier, and the retention losses represent a loss of utility of retained data in the outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier.

11. The system of claim 9, wherein the single modality reidentification classifiers and the multi-modality reidentification classifier are pre-trained by with cross-entropy loss and back propagation with Stochastic Gradient Descent (SGD).

12. The system of claim 7, wherein the modality-focused encoder are trained by: receiving a plurality of additional multi-modality data sets; and receiving an identification of an uninterested attribute in the multi-modality data to conceal and an interested attribute in the multi-modality data to retain; wherein the modality-focused encoders are trained using the removal loss and the retention loss for the respective modality, the plurality of additional multi-modality data sets, the uninterested attribute, and the interested attribute.

13. The system of claim 7, wherein the uninterested attribute comprises identity.

14. A non-transitory computer readable storage medium, including instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising: receiving multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving an uninterested attribute in the multi-modality data to remove; training a modality-focused encoder for each modality of the multi-modality data to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; receiving a multi-modality data set for processing; and processing the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.

15. The non-transitory computer readable storage medium of claim 14, wherein the removal loss comprises a cosine distance, and the retention loss comprises a L2 norm.

16. The non-transitory computer readable storage medium of claim 14, further comprising instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to pre-train a single modality reidentification classifier for each modality of the multi-modality data and a multi-m
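Claims 5 and 11 say the reidentification classifiers are pre-trained with cross-entropy loss and back propagation with Stochastic Gradient Descent (SGD). As a rough illustration of that training recipe only, the sketch below fits a tiny linear softmax classifier with per-sample SGD updates. The model, the toy data, the learning rate, and every function name are assumptions made for the example; the page does not describe the classifier architecture.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, biases, x):
    """Class probabilities of a linear softmax classifier."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true label."""
    return -math.log(max(probs[label], 1e-12))


def train_sgd(samples, num_classes, lr=0.5, epochs=200, seed=0):
    """Per-sample SGD on cross-entropy; hyperparameters are illustrative."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i]
            probs = forward(weights, biases, x)
            for k in range(num_classes):
                # Gradient of cross-entropy w.r.t. logit k is p_k - 1[k == y].
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    weights[k][j] -= lr * grad * x[j]
                biases[k] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Index of the most probable class."""
    probs = forward(weights, biases, x)
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up, linearly separable data: the label equals the first feature.
toy = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
W, b = train_sgd(toy, num_classes=2)
mean_loss = sum(cross_entropy(forward(W, b, x), y) for x, y in toy) / len(toy)
```

In the claimed arrangement there would be one such classifier per modality plus one over the combined modalities; a single linear layer is used here only to keep the gradient step visible.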

Assignees

JPMorgan Chase Bank, N.A.

Inventors

Classifications

  • G06F16/215 (Primary)

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • Clustering or classification · CPC title

  • G06F16/254 (Primary)

    Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024104070A1 cover?
A method for removing uninterested attributes from multi-modality data may include: receiving, by a multi-modality attribute removal computer program executed by an electronic device, multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving, by the multi-modality attribute removal computer program, an uninterested attribut…
Who is the assignee on this patent?
JPMorgan Chase Bank, N.A.
What technology area does this patent fall under?
Primary CPC classification G06F16/215. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 28, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).