US2024104070A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024104070-A1 |
| Application number | US-202217934935-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 23, 2022 |
| Priority date | Sep 23, 2022 |
| Publication date | Mar 28, 2024 |
| Grant date | — |
Official abstract text for this publication.
A method for removing uninterested attributes from multi-modality data may include: receiving, by a multi-modality attribute removal computer program executed by an electronic device, multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving, by the multi-modality attribute removal computer program, an uninterested attribute in the multi-modality data to remove; training, by the multi-modality attribute removal computer program, a modality-focused encoder for each modality of the multi-modality data to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; receiving, by the multi-modality attribute removal computer program, a multi-modality data set for processing; and processing, by the multi-modality attribute removal computer program, the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.
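The processing step in the abstract, where each modality is passed through its own modality-focused encoder, can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the publication: `process_record`, the modality names `face` and `voice`, and the stand-in encoder `zero_first_component`, which only marks where a trained encoder would sit.

```python
from typing import Callable, Dict, List

Vector = List[float]
Encoder = Callable[[Vector], Vector]


def process_record(record: Dict[str, Vector],
                   encoders: Dict[str, Encoder]) -> Dict[str, Vector]:
    """Apply the encoder registered for each modality to that modality's data."""
    missing = set(record) - set(encoders)
    if missing:
        raise KeyError(f"no encoder for modalities: {sorted(missing)}")
    return {name: encoders[name](data) for name, data in record.items()}


def zero_first_component(x: Vector) -> Vector:
    """Placeholder encoder (assumption): a trained encoder would go here.

    It pretends the first component carries the attribute to remove and
    leaves the remaining components untouched.
    """
    return [0.0] + list(x[1:])


# Hypothetical two-modality record; the modality names are illustrative only.
record = {"face": [1.0, 2.0], "voice": [3.0, 4.0]}
encoders = {"face": zero_first_component, "voice": zero_first_component}
processed = process_record(record, encoders)
print(processed)
```

Keeping one encoder per modality in a dictionary mirrors the "encoder for each modality" wording. How the encoders are actually trained is left out of this sketch.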
Opening claim text (preview).
What is claimed is:
1. A method for removing uninterested attributes from multi-modality data, comprising: receiving, by a multi-modality attribute removal computer program executed by an electronic device, multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving, by the multi-modality attribute removal computer program, an uninterested attribute in the multi-modality data to remove; training, by the multi-modality attribute removal computer program, a modality-focused encoder for each modality of the multi-modality data to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; receiving, by the multi-modality attribute removal computer program, a multi-modality data set for processing; and processing, by the multi-modality attribute removal computer program, the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.
2. The method of claim 1, wherein the removal loss comprises a cosine distance, and the retention loss comprises a L2 norm.
3. The method of claim 1, further comprising: pre-training, by the multi-modality attribute removal computer program, a single modality reidentification classifier for each modality of the multi-modality data and a multi-modality reidentification classifier for the multi-modality data to remove the uninterested attribute resulting in removal losses and retention losses.
4. The method of claim 3, wherein the removal losses represent a loss associated with reidentification of the uninterested attribute in outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier, and the retention losses represent a loss of utility of retained data in the outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier.
5. The method of claim 3, wherein the single modality reidentification classifiers and the multi-modality reidentification classifier are pre-trained by with cross-entropy loss and back propagation with Stochastic Gradient Descent (SGD).
6. The method of claim 1, wherein the step of training the modality-focused encoder for each modality of the multi-modality data using the removal loss and the retention loss for the respective modality comprises: receiving, by the multi-modality attribute removal computer program, a plurality of additional multi-modality data sets; and receiving, by the multi-modality attribute removal computer program, an identification of an uninterested attribute in the multi-modality data to conceal and an interested attribute in the multi-modality data to retain; wherein the modality-focused encoders are trained using the removal loss and the retention loss for the respective modality, the plurality of additional multi-modality data sets, the uninterested attribute, and the interested attribute.
7. A system, comprising: a multi-modality data source comprising multi-modality data comprising a, wherein data in each modality are related; and a multi-modality attribute removal computer program comprising a modality-focused encoder for each modality in the multi-modality data, a modality reidentification classifier for each modality in the multi-modality data, and a multi-modality reidentification classifier; wherein: the multi-modality attribute removal computer program receives the multi-modality data from the multi-modality data source; the multi-modality attribute removal computer program receives an uninterested attribute in the multi-modality data to remove; the multi-modality attribute removal computer program trains the modality-focused encoders to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; the multi-modality attribute removal computer program receives a multi-modality data set for processing; and the multi-modality attribute removal computer program processes the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.
8. The system of claim 7, wherein the removal loss comprises a cosine distance, and the retention loss comprises a L2 norm.
9. The system of claim 7, wherein the multi-modality attribute removal computer program pre-trains a single modality reidentification classifier for each modality of the multi-modality data and a multi-modality reidentification classifier for the multi-modality data to remove the uninterested attribute resulting in removal losses and retention losses.
10. The system of claim 9, wherein the removal losses represent a loss associated with reidentification of the uninterested attribute in outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier, and the retention losses represent a loss of utility of retained data in the outputs of the single modality reidentification classifiers and the multi-modality reidentification classifier.
11. The system of claim 9, wherein the single modality reidentification classifiers and the multi-modality reidentification classifier are pre-trained by with cross-entropy loss and back propagation with Stochastic Gradient Descent (SGD).
12. The system of claim 7, wherein the modality-focused encoder are trained by: receiving a plurality of additional multi-modality data sets; and receiving an identification of an uninterested attribute in the multi-modality data to conceal and an interested attribute in the multi-modality data to retain; wherein the modality-focused encoders are trained using the removal loss and the retention loss for the respective modality, the plurality of additional multi-modality data sets, the uninterested attribute, and the interested attribute.
13. The system of claim 7, wherein the uninterested attribute comprises identity.
14. A non-transitory computer readable storage medium, including instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising: receiving multi-modality data comprising a plurality of modalities from a data source, wherein data in each modality are related; receiving an uninterested attribute in the multi-modality data to remove; training a modality-focused encoder for each modality of the multi-modality data to remove the uninterested attribute using a removal loss and a retention loss for the respective modality; receiving a multi-modality data set for processing; and processing the multi-modality data set using the modality-focused encoders, wherein the processing results in a processed multi-modality data set with the uninterested attribute removed and one or more interested attribute retained.
15. The non-transitory computer readable storage medium of claim 14, wherein the removal loss comprises a cosine distance, and the retention loss comprises a L2 norm.
16. The non-transitory computer readable storage medium of claim 14, further comprising instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to pre-train a single modality reidentification classifier for each modality of the multi-modality data and a multi-m
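Claims 2, 8, and 15 name a cosine distance for the removal loss and an L2 norm for the retention loss. The sketch below shows only how those two quantities are computed for a pair of vectors. The claim text shown here does not say which vectors each loss compares, or how the two terms are signed and weighted, so the inputs `original` and `encoded` and both function names are assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def l2_norm(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the Euclidean (L2) norm of the difference u - v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical feature vectors before and after encoding (assumption).
original = [1.0, 0.0, 2.0]
encoded = [0.5, 0.5, 2.0]

print("cosine distance:", cosine_distance(original, encoded))
print("L2 norm of difference:", l2_norm(original, encoded))
```

Cosine distance depends only on the angle between the vectors, while the L2 norm also reflects their magnitudes, which is one plausible reason to pair them as separate removal and retention terms.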
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Clustering or classification · CPC title
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title