Automated artificial intelligence driven readability scoring techniques

US12518094B2 · US · B2

Patent metadata
Publication number: US-12518094-B2
Application number: US-202318346939-A
Country: US
Kind code: B2
Filing date: Jul 5, 2023
Priority date: May 18, 2022
Publication date: Jan 6, 2026
Grant date: Jan 6, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual input and to output a readability score representing a measurement of readability of the textual input. The system further implements aggregating the set of first segment readability scores to determine a first readability score for the first textual content, and performing at least one of causing the first readability score to be presented to a user or performing one or more actions on the first textual content based on the readability score.
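
The segment, score, and aggregate flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: `segment_text`, `toy_readability_score`, and `score_document` are hypothetical names, the sentence splitter is a naive regular expression, and the scorer is a simple word-length heuristic standing in for the NLP model. Averaging is used for aggregation because claim 5 names an average as one option.

```python
import re
from statistics import mean
from typing import Callable, List


def segment_text(text: str) -> List[str]:
    """Split text into sentence-like segments.

    Assumption: a naive regex split after '.', '!' or '?'. The patent
    describes a trained segmentation model; this is only a stand-in.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_readability_score(segment: str) -> float:
    """Placeholder scorer (hypothetical, NOT the patent's NLP model).

    Shorter words and shorter segments give a higher score in (0, 1].
    """
    words = segment.split()
    if not words:
        return 0.0
    avg_word_len = mean(len(w) for w in words)
    return 1.0 / (1.0 + 0.1 * avg_word_len + 0.02 * len(words))


def score_document(
    text: str,
    scorer: Callable[[str], float] = toy_readability_score,
) -> float:
    """Score each segment, then aggregate the segment scores by averaging."""
    segments = segment_text(text)
    if not segments:
        return 0.0
    return mean(scorer(s) for s in segments)


if __name__ == "__main__":
    simple = "The cat sat. It was fun."
    dense = (
        "Notwithstanding aforementioned considerations, multidimensional "
        "organizational infrastructures necessitate comprehensive recalibration."
    )
    print(score_document(simple), score_document(dense))
```

Passing the scorer as a parameter keeps the aggregation logic independent of whichever model actually produces the per-segment scores.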

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing system comprising: a processor; and a machine-readable storage medium storing executable instructions that, when executed, cause the processor alone or in combination with other processors to perform operations comprising: accessing first training data for a language model for fine-tuning a pretrained language model (PLM) to analyze textual inputs, determine readability scores for the textual inputs, and output readability scores, the PLM having been pretrained on a corpus of second training data to analyze textual inputs and having not been pretrained to determine the readability scores for the textual inputs; fine-tuning training of the PLM using the first training data to enable the PLM to receive a textual input, generate a readability score for the textual input, and output the readability score for the textual input; obtaining first aggregated readability scores generated by analyzing an output of a first version of a language model using the PLM and aggregating first readability scores output by the PLM; obtaining second aggregated readability scores generated by analyzing the output of a second version of the language model using the PLM and aggregating second readability scores output by the PLM; and selectively utilizing the first version of the language model or the second version of the language model to analyze textual content based on the first aggregated readability scores and the second aggregated readability scores.

2. The data processing system of claim 1, wherein the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: accessing second training data that includes training data for training the PLM to recognize domain-specific terminology when determining the readability score of the textual input; and fine-tuning training of the PLM using second training data.

3. The data processing system of claim 1, wherein the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: receiving, from a first application on a first client device via a first network connection, a request to provide a first readability score for a first textual content from the first application; analyzing the first textual content with the PLM to obtain a first readability score for the first textual content; performing one or more actions on the first textual content based on the first readability score for the first textual content; and prior to analyzing the first textual content to the PLM, segmenting the first textual content into a plurality of first segments using a second language model trained to recognize one or more segment boundaries in the first textual content and to output a plurality of first segments of textual content.

4. The data processing system of claim 3, wherein analyzing the first textual content with the PLM to obtain the first readability score for the first textual content further comprises: analyzing each segment of the plurality of first segments of the first textual content with the PLM to obtain a plurality of segment readability scores, wherein each readability score of the plurality of segment readability scores is associated with a segment of the plurality of first segments; and aggregating the plurality of segment readability scores to determine the first readability score for the first textual content.

5. The data processing system of claim 4, wherein aggregating the plurality of segment readability scores further comprises: determining the first readability score based on an average of the plurality of segment readability scores for the plurality of first segments.

6. The data processing system of claim 1, wherein to obtain the first aggregated readability scores generated by analyzing the output of the first version of a language model using the PLM and aggregating the first readability scores output by the PLM, the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: providing a set of reference input data to the first version of the language model to obtain a set of first reference textual content output by the first version of the language model; analyzing the set of first reference textual content output by the first version of the language model using the PLM to obtain a set of first reference readability scores for the set of reference input data; and aggregating the set of first reference readability scores for the set of reference input data to generate the first aggregated readability scores for the first version of the language model.

7. The data processing system of claim 6, wherein to obtain the second aggregated readability scores generated by analyzing the output of the second version of the language model using the PLM and aggregating the second readability scores output by the PLM, the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: providing the set of reference input data to the second version of the language model to obtain a set of second reference textual content output by the second version of the language model; analyzing the set of second reference textual content output by the second version of the language model using the PLM to obtain a set of second reference readability scores for the set of reference input data; and aggregating the set of second reference readability scores for the set of reference input data to generate a second aggregated readability score for the second version of the language model.

8. The data processing system of claim 7, wherein the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: comparing the first aggregated readability scores with the second aggregated readability score; and updating the first version of the language model with the second version of the language model responsive to the second aggregated readability score exceeding the first aggregated readability scores.

9. The data processing system of claim 1, wherein the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: receiving, from a first application on a first client device via a first network connection, a request to provide a first readability score for a first textual content from the first application; analyzing the first textual content with the PLM to obtain a first readability score for the first textual content; performing one or more actions on the first textual content based on the first readability score for the first textual content; and causing the first application of the first client device to present the first readability score on a user interface of the first application.

10. The data processing system of claim 9, wherein the machine-readable storage medium includes instructions configured to cause the processor alone or in combination with other processors to perform operations of: generating second textual content based on the first textual content using a readability language model trained to receive the first textual content as an input and to output the second textual content, the second textual content having a second readability score higher than the first readability score; and causing the first applica
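
The version comparison recited in claims 1 and 6 to 8 (run both language model versions on the same reference inputs, score their outputs, aggregate, and adopt the second version only if its aggregate is higher) might look something like the sketch below. All names here (`aggregated_score`, `select_version`, `Selection`) are hypothetical, the models and the scorer are plain callables supplied by the caller, and the mean is an assumed aggregation choice; the claims do not fix one for this step.

```python
from statistics import mean
from typing import Callable, NamedTuple, Sequence

# Assumption: a model version maps an input prompt to output text, and a
# scorer maps text to a readability score. Both are stand-ins for the
# language model versions and the fine-tuned PLM named in the claims.
Model = Callable[[str], str]
Scorer = Callable[[str], float]


class Selection(NamedTuple):
    chosen: str          # "first" or "second"
    first_score: float   # aggregated score for the first version
    second_score: float  # aggregated score for the second version


def aggregated_score(
    model: Model, scorer: Scorer, reference_inputs: Sequence[str]
) -> float:
    """Run the model on each reference input, score each output, average."""
    outputs = [model(x) for x in reference_inputs]
    return mean(scorer(o) for o in outputs)


def select_version(
    first: Model,
    second: Model,
    scorer: Scorer,
    reference_inputs: Sequence[str],
) -> Selection:
    """Pick the second version only if its aggregate exceeds the first's.

    Ties keep the first version, matching the "exceeding" wording of claim 8.
    """
    s1 = aggregated_score(first, scorer, reference_inputs)
    s2 = aggregated_score(second, scorer, reference_inputs)
    return Selection("second" if s2 > s1 else "first", s1, s2)
```

Using the same reference inputs for both versions is what makes the two aggregates comparable; scoring each version on different inputs would confound model quality with input difficulty.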

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12518094B2 cover?
A data processing system implements obtaining a first textual content, segmenting the first textual content into a plurality of first segments, and providing each segment of the plurality of first segments to a first natural language processing (NLP) model to obtain a set of first readability scores for the plurality of first segments. The first NLP model is configured to analyze a textual inpu…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F40/253. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 6, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).