Feature engineering based on semantic types

US12436985B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12436985-B2
Application number	US-202418733650-A
Country	US
Kind code	B2
Filing date	Jun 4, 2024
Priority date	Dec 1, 2021
Publication date	Oct 7, 2025
Grant date	Oct 7, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-based system may engineer features based on semantic types. The computer-based system may implement deep learning algorithms and derive a domain-specific feature engineering strategy from semantic type predictions and data profiling. The computer-based system may utilize embedded domain (e.g., financial industry, etc.) knowledge to generate curated features from raw data (e.g., transactional datasets, relational datasets, etc.).
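As a rough illustration of the idea in the abstract, the sketch below maps a predicted semantic type to a feature recipe and applies it to raw columns. The semantic type names (`currency_amount`, `timestamp`, `free_text`), the recipes, and the function `engineer_features` are hypothetical and chosen only for illustration; the abstract does not specify them. The semantic type predictions are simply passed in here rather than produced by a deep learning model.

```python
from datetime import datetime
from statistics import mean

# Hypothetical recipes: each semantic type maps to a function that turns
# a raw column (a list of strings) into a dict of named features.
def amount_features(values):
    nums = [float(v) for v in values]
    return {"mean": mean(nums), "max": max(nums), "count": len(nums)}

def timestamp_features(values):
    stamps = [datetime.fromisoformat(v) for v in values]
    weekend = sum(1 for s in stamps if s.weekday() >= 5)  # Sat=5, Sun=6
    return {"weekend_share": weekend / len(stamps)}

RECIPES = {
    "currency_amount": amount_features,
    "timestamp": timestamp_features,
}

def engineer_features(columns, semantic_types):
    """Apply the recipe selected by each column's semantic type."""
    features = {}
    for name, values in columns.items():
        recipe = RECIPES.get(semantic_types.get(name))
        if recipe is None:
            continue  # no recipe registered for this type: skip the column
        for key, val in recipe(values).items():
            features[f"{name}_{key}"] = val
    return features

# Illustrative transactional data (made up for this sketch).
raw = {
    "amount": ["10.00", "20.00", "30.00", "40.00"],
    "posted_at": [
        "2024-06-01T10:00:00",  # Saturday
        "2024-06-02T18:45:00",  # Sunday
        "2024-06-03T09:30:00",  # Monday
        "2024-06-04T12:15:00",  # Tuesday
    ],
    "memo": ["coffee", "fuel", "rent", "books"],
}
types = {"amount": "currency_amount", "posted_at": "timestamp", "memo": "free_text"}
features = engineer_features(raw, types)
```

Keeping the recipes in a lookup table keyed by semantic type means a new type can be supported by registering one function, without changing the loop that builds the features.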

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for determining features from raw data, the method comprising: receiving a plurality of data structures of raw data, wherein each data structure of the plurality of data structures comprises a respective plurality of data elements; determining a data profile for the raw data based on an amount of data elements of the respective plurality of data elements for at least one data structure of the plurality of data structures satisfying a statistical threshold for indicating the data profile; and for each data structure of the plurality of data structures of the raw data: determining, based on a semantic rule that describes how to infer a semantic type from a data element of the respective plurality of data elements, the semantic type for each data structure, and selecting, based on the determined semantic type, an instruction that describes how to calculate an input feature for a machine learning model based on the respective plurality of data elements for each data structure, and wherein the semantic types for at least a portion of data structures of the raw data are validated based on a determination that the semantic types for at least the portion of data structures correspond to the data profile.
2. The computer-implemented method of claim 1, wherein the raw data comprises at least one of transactional datasets or relational datasets.
3. The computer-implemented method of claim 1, wherein determining the data profile is further based on at least one of column profiling, cross-column profiling, cross-table profiling, or data rule validation.
4. The computer-implemented method of claim 1, wherein the statistical threshold for indicating the data profile is based on a distribution of characters indicated by each data element of the respective plurality of data elements for each data structure of the plurality of data structures.
5. The computer-implemented method of claim 1, further comprising determining, based on the selected instruction, one or more machine learning-features.
6. The computer-implemented method of claim 5, further comprising sending, to a data pipeline configured to ingest the raw data, the one or more machine learning-features, wherein the one or more machine learning-features facilitate a prediction associated with the raw data.
7. The computer-implemented method of claim 5, wherein the one or more machine learning-features facilitate a prediction associated with new data that is different from the raw data.
8. A non-transitory computer-readable medium having instructions stored thereon that, when executed by at least one computing device, cause the at least one computing device to perform operations for determining features from raw data comprising: receiving a plurality of data structures of raw data, wherein each data structure of the plurality of data structures comprises a respective plurality of data elements; determining a data profile for the raw data based on an amount of data elements of the respective plurality of data elements for at least one data structure of the plurality of data structures satisfying a statistical threshold for indicating the data profile; and for each data structure of the plurality of data structures of the raw data: determining, based on a semantic rule that describes how to infer a semantic type from a data element of the respective plurality of data elements, the semantic type for each data structure, and selecting, based on the determined semantic type, an instruction that describes how to calculate an input feature for a machine learning model based on the respective plurality of data elements for each data structure, and wherein the semantic types for at least a portion of data structures of the raw data are validated based on a determination that the semantic types for at least the portion of data structures correspond to the data profile.
9. The non-transitory computer-readable medium of claim 8, wherein the raw data comprises at least one of transactional datasets or relational datasets.
10. The non-transitory computer-readable medium of claim 8, wherein determining the data profile is further based on at least one of column profiling, cross-column profiling, cross-table profiling, or data rule validation.
11. The non-transitory computer-readable medium of claim 8, wherein the statistical threshold for indicating the data profile is based on a distribution of characters indicated by each data element of the respective plurality of data elements for each data structure of the plurality of data structures.
12. The non-transitory computer-readable medium of claim 8, the operations further comprising determining, based on the selected instruction, one or more machine learning-features.
13. The non-transitory computer-readable medium of claim 12, further comprising sending, to a data pipeline configured to ingest the raw data, the one or more machine learning-features, wherein the one or more machine learning-features facilitate a prediction associated with the raw data.
14. The non-transitory computer-readable medium of claim 12, wherein the one or more machine learning-features facilitate a prediction associated with new data that is different from the raw data.
15. A system comprising: a memory; and at least one processor coupled to the memory and configured to perform operations for determining features from raw data comprising: receiving a plurality of data structures of the raw data, wherein each data structure of the plurality of data structures comprises a respective plurality of data elements; determining a data profile for the raw data based on an amount of data elements of the respective plurality of data elements for at least one data structure of the plurality of data structures satisfying a statistical threshold for indicating the data profile; and for each data structure of the plurality of data structures of the raw data: determining, based on a semantic rule that describes how to infer a semantic type from a data element of the respective plurality of data elements, the semantic type for each data structure, and selecting, based on the determined semantic type, an instruction that describes how to calculate an input feature for a machine learning model based on the respective plurality of data elements for each data structure, and wherein the semantic types for at least a portion of data structures of the raw data are validated based on a determination that the semantic types for at least the portion of data structures correspond to the data profile.
16. The system of claim 15, wherein the raw data comprises at least one of transactional datasets or relational datasets.
17. The system of claim 15, wherein determining the data profile is further based on at least one of column profiling, cross-column profiling, cross-table profiling, or data rule validation.
18. The system of claim 15, wherein the statistical threshold for indicating the data profile is based on a distribution of characters indicated by each data element of the respective plurality of data elements for each data structure of the plurality of data structures.
19. The system of claim 15, the operations further comprising determining, based on the selected instruction, one or more machine learning-features.
20. The system of claim 19, further comprising sending, to a data pipeline configured to ingest the raw data, the one or more machine learning-features, wherein the one or more machine learnin…
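The independent claims can be read as four steps: profile the data, infer a semantic type by rule, validate the inferred type against the profile, and select a feature instruction. The sketch below is one loose interpretation under stated assumptions, not the patented implementation. It treats each "data structure" as a table column, profiles a column by the character class of its elements (in the spirit of the character-distribution threshold in the dependent claims), and infers the type from the first element only. The 0.9 threshold, the regular-expression rules, the instruction strings, and all names (`profile_column`, `SEMANTIC_RULES`, `INSTRUCTIONS`, `plan_features`) are hypothetical.

```python
import re

THRESHOLD = 0.9  # assumed share of elements that must agree on a profile

def char_class(value):
    """Classify one data element by the characters it contains."""
    if value.isdigit():
        return "digits"
    if value.isalpha():
        return "letters"
    return "mixed"

def profile_column(values, threshold=THRESHOLD):
    """Return the dominant character class if enough elements share it."""
    if not values:
        return "unprofiled"
    counts = {}
    for v in values:
        cls = char_class(v)
        counts[cls] = counts.get(cls, 0) + 1
    top, n = max(counts.items(), key=lambda kv: kv[1])
    return top if n / len(values) >= threshold else "unprofiled"

# Hypothetical rules: (semantic type, pattern, profile expected for that type).
SEMANTIC_RULES = [
    ("zip_code", re.compile(r"\d{5}"), "digits"),
    ("country_code", re.compile(r"[A-Z]{2}"), "letters"),
    ("iso_date", re.compile(r"\d{4}-\d{2}-\d{2}"), "mixed"),
]
EXPECTED_PROFILE = {name: profile for name, _, profile in SEMANTIC_RULES}

# Hypothetical instructions describing how to compute a model input feature.
INSTRUCTIONS = {
    "zip_code": "one-hot encode the first three digits",
    "country_code": "one-hot encode the code",
    "iso_date": "extract day of week and month",
}

def infer_semantic_type(element):
    """Return the first semantic type whose pattern fully matches the element."""
    for name, pattern, _ in SEMANTIC_RULES:
        if pattern.fullmatch(element):
            return name
    return None

def plan_features(table):
    """Profile, type, validate, and pick an instruction for each column."""
    plan = {}
    for column, values in table.items():
        profile = profile_column(values)
        sem_type = infer_semantic_type(values[0]) if values else None
        # Validation: the inferred type must agree with the column profile.
        validated = sem_type is not None and EXPECTED_PROFILE[sem_type] == profile
        plan[column] = {
            "profile": profile,
            "semantic_type": sem_type,
            "validated": validated,
            "instruction": INSTRUCTIONS[sem_type] if validated else None,
        }
    return plan

# Illustrative data (made up): the "noisy" column starts with a ZIP-like value
# but is mostly letters, so its inferred type fails validation.
table = {
    "zip": ["10001", "94105", "60601"],
    "country": ["US", "GB", "FR"],
    "noisy": ["12345", "hello", "world"],
}
plan = plan_features(table)
```

In this sketch the validation step is what guards against a type inferred from a single unrepresentative element: the `noisy` column gets no instruction because its profile does not meet the threshold for the class that `zip_code` expects.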

Assignees

American Express Travel Services Company Inc, American Express Travel Related Services Co Inc

Inventors

Classifications

G06N5/04 · Primary
Inference or reasoning models · CPC title
G06N20/00
Machine learning · CPC title
G06F16/35 · Primary
Clustering; Classification · CPC title

Patent family

Related publications grouped by family.

View patent family 91325279

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12436985B2 cover?
A computer-based system may engineer features based on semantic types. The computer-based system may implement deep learning algorithms and derive a domain-specific feature engineering strategy from semantic type predictions and data profiling. The computer-based system may utilize embedded domain (e.g., financial industry, etc.) knowledge to generate curated features from raw data (e.g., trans…

Who is the assignee on this patent?
American Express Travel Services Company Inc, American Express Travel Related Services Co Inc

What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.

When was this patent published?
Publication date Oct 7, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Feature engineering based on semantic types

Learning system, learning method, and program

Multi-layer document structural info extraction framework

Optimizing feature evaluation in machine learning

System for deliverables versioning in audio mastering
