In-database data cleansing

US12360969B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12360969-B2
Application number  US-202318495952-A
Country             US
Kind code           B2
Filing date         Oct 27, 2023
Priority date       Oct 27, 2023
Publication date    Jul 15, 2025
Grant date          Jul 15, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A cleansing operation defined for a data structure of a database managed by a database management system is obtained. The cleansing operation is performed on data of the data structure to obtain clean data. The cleansing operation that is defined for the data structure and performed on data of the data structure is performed by the database management system.
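The idea in the abstract, a cleansing operation that the database engine itself runs so the data does not have to be exported to an external tool, can be illustrated with a small sketch. This is not the patented implementation: SQLite (a row store) stands in for the database management system, mean imputation is chosen as the example cleansing operation, and the names `cleanse_in_database`, `readings`, and `readings_clean` are hypothetical.

```python
import sqlite3


def cleanse_in_database(conn, table, column):
    """Run a mean-imputation cleansing operation inside the database engine.

    Assumption: the table has an `id` column. The clean result is written to a
    separate table so the original data stays untouched.
    """
    clean_table = f"{table}_clean"
    conn.execute(f"DROP TABLE IF EXISTS {clean_table}")
    # The whole operation is a single SQL statement, so the engine does the
    # work and no rows are pulled into application code for cleansing.
    conn.execute(
        f"CREATE TABLE {clean_table} AS "
        f"SELECT id, COALESCE({column}, (SELECT AVG({column}) FROM {table})) "
        f"AS {column} FROM {table}"
    )
    return clean_table


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10.0), (2, None), (3, 20.0)],  # row 2 is missing a value
)

clean_name = cleanse_in_database(conn, "readings", "temp")
raw_rows = conn.execute("SELECT id, temp FROM readings ORDER BY id").fetchall()
clean_rows = conn.execute(
    f"SELECT id, temp FROM {clean_name} ORDER BY id"
).fetchall()
```

After the call, `raw_rows` still contains the missing value, while `clean_rows` holds the imputed copy that queries can read instead.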

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product for facilitating processing within a computing environment, the computer program product comprising: at least one computer readable storage medium and program instructions collectively stored on the at least one computer readable storage medium, the program instructions collectively stored comprising: program instructions to obtain a cleansing operation defined for a data structure of a database managed by a database management system; program instructions to perform the cleansing operation on data of the data structure to obtain clean data, wherein the cleansing operation is performed by the database management system, and program instructions to store the clean data in the data structure independent of the data, wherein the data structure is defined as column-major and a column group of the data structure is stored as one or more blocks, and wherein the clean data is stored as one or more other blocks of the data structure.

2. The computer program product of claim 1, wherein the program instructions to perform the cleansing operation perform the cleansing operation absent moving the data to a system external to the database management system to perform the cleansing operation.

3. The computer program product of claim 1, wherein the one or more other blocks of the data structure are devoid of the data.

4. The computer program product of claim 1, wherein the program instructions collectively stored include program instructions to access the clean data that is stored to satisfy one or more user requests.

5. The computer program product of claim 4, wherein the one or more user requests include a plurality of user requests to be processed in parallel by the database management system.

6. The computer program product of claim 1, wherein the data structure is a column oriented relational database.

7. The computer program product of claim 1, wherein the data structure is a table within a database.

8. The computer program product of claim 1, wherein the cleansing operation is an imputation operation to provide clean data for one or more positions within the data structure that are missing data values.

9. The computer program product of claim 1, wherein the cleansing operation is selected from a group of cleansing operations consisting of: imputation of missing data, correction of data, addition of data, replacement of data, modification of data, deletion of data, transformation of data, normalization of data, standardization of data and encoding of data.

10. A computer system for facilitating processing within a computing environment, the computer system comprising: a memory; and at least one device coupled to the memory, wherein the computer system is configured to perform a method, the method comprising: obtaining a cleansing operation defined for a data structure of a database managed by a database management system; performing the cleansing operation on data of the data structure to obtain clean data, wherein the cleansing operation is performed by the database management system; and storing the clean data in the data structure independent of the data, wherein the data structure is defined as column-major, and a column group of the data structure is stored as one or more blocks, and wherein the clean data is stored as one or more other blocks of the data structure.

11. The computer system of claim 10, wherein the performing the cleansing operation includes performing the cleansing operation absent moving the data to a system external to the database management system to perform the cleansing operation.

12. The computer system of claim 10, wherein the one or more other blocks of the data structure are devoid of the data.

13. The computer system of claim 10, wherein the cleansing operation is an imputation operation to provide clean data for one or more positions within the data structure that are missing data values.

14. The computer system of claim 10, wherein the cleansing operation is selected from a group of cleansing operations consisting of: imputation of missing data, correction of data, addition of data, replacement of data, modification of data, deletion of data, transformation of data, normalization of data, standardization of data and encoding of data.

15. A computer-implemented method of facilitating processing within a computing environment, the computer-implemented method comprising: obtaining a cleansing operation defined for a data structure of a database managed by a database management system; performing the cleansing operation on data of the data structure to obtain clean data, wherein the cleansing operation is performed by the database management system; and storing the clean data in the data structure independent of the data, wherein the data structure is defined as column-major, and a column group of the data structure is stored as one or more blocks, and wherein the clean data is stored as one or more other blocks of the data structure.

16. The computer-implemented method of claim 15, wherein the performing the cleansing operation includes performing the cleansing operation absent moving the data to a system external to the database management system to perform the cleansing operation.

17. The computer-implemented method of claim 15, further comprising accessing the clean data that is stored to satisfy a plurality of user requests to be processed in parallel by the database management system.

18. The computer-implemented method of claim 15, wherein the one or more other blocks of the data structure are devoid of the data.

19. The computer-implemented method of claim 15, wherein the cleansing operation is an imputation operation to provide clean data for one or more positions within the data structure that are missing data values.

20. The computer-implemented method of claim 15, wherein the cleansing operation is selected from a group of cleansing operations consisting of: imputation of missing data, correction of data, addition of data, replacement of data, modification of data, deletion of data, transformation of data, normalization of data, standardization of data and encoding of data.
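The storage arrangement recited in the independent claims (a column-major data structure whose column group is stored as blocks, with the clean data written to other blocks of the same structure) can be pictured with a toy in-memory model. This is only one possible reading of the claim language, not the patented design: the class `ColumnMajorTable`, the `Block` type, the block size of 2, and the `impute_mean` operation are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Optional

BLOCK_SIZE = 2  # hypothetical; real systems use far larger blocks

Values = List[Optional[float]]


@dataclass
class Block:
    values: Values


def to_blocks(values: Values, size: int = BLOCK_SIZE) -> List[Block]:
    """Split one column's values into fixed-size blocks."""
    return [Block(list(values[i:i + size])) for i in range(0, len(values), size)]


@dataclass
class ColumnMajorTable:
    raw_blocks: Dict[str, List[Block]] = field(default_factory=dict)
    clean_blocks: Dict[str, List[Block]] = field(default_factory=dict)
    cleansers: Dict[str, Callable[[Values], Values]] = field(default_factory=dict)

    def load(self, column: str, values: Values) -> None:
        self.raw_blocks[column] = to_blocks(values)

    def define_cleansing(self, column: str, op: Callable[[Values], Values]) -> None:
        """Attach a cleansing operation to a column of this table."""
        self.cleansers[column] = op

    def cleanse(self, column: str) -> None:
        """Run the defined operation and store the result in separate blocks."""
        raw = [v for b in self.raw_blocks[column] for v in b.values]
        self.clean_blocks[column] = to_blocks(self.cleansers[column](raw))

    def read(self, column: str) -> Values:
        """Serve requests from clean blocks when they exist, else raw blocks."""
        blocks = self.clean_blocks.get(column)
        if blocks is None:
            blocks = self.raw_blocks[column]
        return [v for b in blocks for v in b.values]


def impute_mean(values: Values) -> Values:
    """Imputation: fill missing positions with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


table = ColumnMajorTable()
table.load("price", [4.0, None, 8.0])
table.define_cleansing("price", impute_mean)
table.cleanse("price")

clean_values = table.read("price")
raw_values = [v for b in table.raw_blocks["price"] for v in b.values]
```

In this sketch the raw blocks are never modified: `raw_values` keeps the missing entry, and readers are served from the separately stored clean blocks. Any of the other listed operations (normalization, standardization, encoding, and so on) could be attached through `define_cleansing` in the same way.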

Assignees

Inventors

Classifications

  • Ensuring data consistency and integrity · CPC title

  • G06F16/221 · Primary

    Column-oriented storage; Management thereof · CPC title

  • G06F16/215 · Primary

    Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12360969B2 cover?
A cleansing operation defined for a data structure of a database managed by a database management system is obtained. The cleansing operation is performed on data of the data structure to obtain clean data. The cleansing operation that is defined for the data structure and performed on data of the data structure is performed by the database management system.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/221. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 15, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).