Hierarchical delimiter identification for parsing of raw data

Patent metadata
Field	Value
Publication number	US-12499102-B2
Application number	US-202318218986-A
Country	US
Kind code	B2
Filing date	Jul 6, 2023
Priority date	Jul 6, 2023
Publication date	Dec 16, 2025
Grant date	Dec 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An example system for parsing and transforming input data that includes processing circuitry and memory, the memory configured to store the input data. The processing circuitry is configured to determine a first delimiter in the input data. The processing circuitry is configured to determine a plurality of second delimiter hypotheses and parse the input data according to the first delimiter and the plurality of second delimiter hypotheses to generate a plurality of tables that are each associated with a respective one of the plurality of second delimiter hypotheses. The processing circuitry is configured to determine a respective consistency score for each of the plurality of tables and select a table from among the plurality of tables based on the respective consistency score associated with the table. The processing circuitry is configured to format the input data based on the selected table to generate formatted data and output the formatted data.
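The flow in the abstract (fix a first delimiter, try several second-delimiter hypotheses, build one table per hypothesis, score each table, keep the best) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the function names, the choice to skip alphanumeric characters when collecting candidates, and above all `simple_consistency`, which is a simplified stand-in score (share of rows that have the most common column count), not the consistency score the patent claims.

```python
from collections import Counter


def split_rows(text, row_delim="\n"):
    """Split raw text on the first (row) delimiter, dropping empty rows."""
    return [row for row in text.split(row_delim) if row]


def candidate_delimiters(rows, n=3):
    """Return the n most frequent characters as second-delimiter hypotheses.

    Assumption: alphanumeric characters are skipped as unlikely delimiters.
    """
    counts = Counter(ch for row in rows for ch in row if not ch.isalnum())
    return [ch for ch, _ in counts.most_common(n)]


def simple_consistency(table):
    """Stand-in score: share of rows that have the most common column count.

    A hypothesis that mostly fails to split rows (one column) scores zero.
    """
    widths = Counter(len(row) for row in table)
    width, hits = widths.most_common(1)[0]
    if width < 2:
        return 0.0
    return hits / len(table)


def best_table(text, row_delim="\n", n=3):
    """Build one table per hypothesis and keep the highest-scoring one."""
    rows = split_rows(text, row_delim)
    tables = {d: [row.split(d) for row in rows]
              for d in candidate_delimiters(rows, n)}
    delim = max(tables, key=lambda d: simple_consistency(tables[d]))
    return delim, tables[delim]


raw = "id|name|note\n1|Ann|a, b\n2|Bob|c\n"
delim, table = best_table(raw)
print(delim, table)
```

On this sample the pipe character wins because it is the only hypothesis that splits every row into the same number of columns; the comma and the space each split only one row.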

First claim

Opening claim text (preview).

What is claimed is:
1. A computing system comprising: one or more processors; and one or more memories storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: obtaining input data; determining a first delimiter within the input data; generating, from the input data and using the first delimiter, an arranged string; determining an N most frequent characters in the arranged string, wherein N is greater than one; determining a plurality of second delimiter hypotheses, the plurality of second delimiter hypotheses comprising the N most frequent characters in the arranged string, each of the plurality of second delimiter hypotheses comprising a respective potential delimiter to use with the first delimiter; parsing the arranged string using each of the plurality of second delimiter hypotheses; generating, based on parsing the arranged string using each of the plurality of second delimiter hypotheses, a plurality of tables that are each associated with a respective one of the plurality of second delimiter hypotheses; determining a respective consistency score associated with each respective table of the plurality of tables, wherein the respective consistency score is based on at least one of a total number of patterns in the respective table, a total number of tuples in the respective table, a total number of delimiters per pattern in the respective table, a total number of columns or rows in the respective table, or a total number cells defined by the rows and the columns of the respective table filled by values; selecting a table from among the plurality of tables based on the respective consistency score associated with the table; formatting the input data based on the selected table to generate formatted data; and outputting the formatted data.
2. The computing system of claim 1, wherein formatting the input data comprises formatting the input data into payload data labeled according to a column and a row of the selected table.
3. The computing system of claim 1, wherein selecting the table from among the plurality of tables comprises selecting the table having a highest consistency score among the plurality of tables.
4. The computing system of claim 1, wherein the operations further comprise: determining a plurality of child delimiter hypotheses within the input data for a column or a row of the selected table, the column including column data or the row including row data, each of the plurality of child delimiter hypotheses comprising a respective potential child delimiter; parsing the column data or the row data according to each of the plurality of child delimiter hypotheses; generating, based on parsing the column data or the row data, a plurality of child tables, each of the plurality of child tables associated with a respective one of the plurality of child delimiter hypotheses; determining a respective child consistency score for each of the plurality of child delimiter hypotheses; and selecting a child delimiter hypothesis from among the plurality of child delimiter hypotheses or no child delimiter hypothesis for the column or the row based on the respective child consistency score for each of the plurality of child delimiter hypotheses.
5. The computing system of claim 4, wherein selecting the child delimiter hypothesis from among the plurality of child delimiter hypotheses or no child delimiter hypothesis for the column or the row comprises selecting no child delimiter hypothesis based on the respective child consistency score for each of the plurality of child delimiter hypotheses being equal to zero.
6. The computing system of claim 4, wherein selecting the child delimiter hypothesis from among the plurality of child delimiter hypotheses or no child delimiter hypothesis for the column or the row comprises selecting the child delimiter hypothesis based on the child delimiter hypothesis having a highest consistency score among the respective child consistency score of each of the plurality of child delimiter hypotheses for the column or the row.
7. The computing system of claim 1, wherein the first delimiter comprises a row delimiter, the plurality of second delimiter hypotheses comprises a plurality of column delimiter hypotheses, and wherein determining the respective consistency score comprises determining: $P(x, \theta) = \frac{1}{k} \sum_{k=1}^{k} N_k \left( \frac{M_k}{M_k + 1} \right) * \left( \frac{M_{col}}{RC_{filled}} \right)$, where $P$ is a function yielding the respective consistency score, $x$ is a block of input text, $\theta$ is a hypothetical delimiter applied to the input text, $k$ is a total number of unique patterns found while processing for $\theta$, $N_k$ is a total number of tuples, $M_k$ is a total number of delimiters per pattern, $M_{col}$ is a total number of columns created and $RC_{filled}$ is a total number of rows and columns filled by values.
8. The computing system of claim 1, wherein the operations further comprise: determining a second delimiter hypothesis as a respective one of the plurality of second delimiter hypotheses associated with the selected table; and outputting at least two of the first delimiter, the second delimiter hypothesis, or a child delimiter hypothesis.
9. The computing system of claim 1, wherein determining the first delimiter comprises determining that the first delimiter comprises an only potential first delimiter from a plurality of potential first delimiters to appear in input text.
10. The computing system of claim 1, wherein determining the first delimiter comprises determining the firs
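As a rough illustration of the scoring in claim 7 and the child-delimiter selection in claims 4 to 6, the Python sketch below computes one possible reading of the formula: the average over unique patterns of N_k · M_k / (M_k + 1), multiplied by M_col / RC_filled. The claim text does not define how a "pattern" is detected, so several things here are assumptions made only for illustration: a row's pattern is taken to be the number of times the hypothetical delimiter occurs in it, N_k is the number of rows sharing that count, M_col is the widest row after splitting, and RC_filled counts non-blank cells. The function names are hypothetical.

```python
from collections import Counter


def consistency_score(rows, delim):
    """One reading of claim 7: (1/k) * sum(N_k * M_k/(M_k+1)) * (M_col/RC_filled).

    Assumption: a row's "pattern" is its delimiter count, so the Counter maps
    M_k (delimiters per pattern) to N_k (rows, i.e. tuples, with that pattern).
    """
    if not rows:
        return 0.0
    table = [row.split(delim) for row in rows]
    patterns = Counter(row.count(delim) for row in rows)
    k = len(patterns)
    m_col = max(len(cells) for cells in table)
    rc_filled = sum(1 for cells in table for cell in cells if cell.strip())
    if rc_filled == 0:
        return 0.0
    total = sum(n_k * (m_k / (m_k + 1)) for m_k, n_k in patterns.items())
    return (total / k) * (m_col / rc_filled)


def choose_child_delimiter(column_values, hypotheses):
    """Pick the best-scoring child delimiter, or None if every score is zero."""
    scores = {d: consistency_score(column_values, d) for d in hypotheses}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0:
        return None
    return best


rows = ["1|Ann|a, b", "2|Bob|c", "3|Cy|d"]
print(consistency_score(rows, "|"))   # every row shares one pattern
print(consistency_score(rows, ","))   # only one row contains a comma
print(consistency_score(rows, ";"))   # delimiter never appears
print(choose_child_delimiter(["a, b", "c, d", "e, f"], [",", ";"]))
```

Under these assumptions a delimiter that never appears has M_k = 0 for every row and therefore scores zero, which is consistent with the "no child delimiter" case in claim 5, while a delimiter that splits every row identically scores higher than one that splits only some rows.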

Assignees

Unitedhealth Group Inc

Classifications

G06F16/2365 (Primary)
Ensuring data consistency and integrity · CPC title
G06F16/2282 (Primary)
Tablespace storage structures; Management thereof · CPC title

Patent family

Related publications grouped by family.

View patent family 94175381

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12499102B2 cover?: An example system for parsing and transforming input data that includes processing circuitry and memory, the memory configured to store the input data. The processing circuitry is configured to determine a first delimiter in the input data. The processing circuitry is configured to determine a plurality of second delimiter hypotheses and parse the input data according to the first delimiter and…
Who is the assignee on this patent?: Unitedhealth Group Inc
What technology area does this patent fall under?: Primary CPC classification G06F16/2365. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 16, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).