Automated Caption Generation from a Dataset

US2022147708A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2022147708-A1
Application number	US-202017094435-A
Country	US
Kind code	A1
Filing date	Nov 10, 2020
Priority date	Nov 10, 2020
Publication date	May 12, 2022
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A dataset captioning system is described that generates captions of text to describe insights identified from a dataset, automatically and without user intervention. To do so, given an input of a dataset the dataset captioning system determines which data insights are likely to support potential visualizations of the dataset, generates text based on these insights, orders the text, processes the ordered text for readability, and then outputs the text as a caption. These techniques also include adjustments made to the complexity of the text, globalization of the text, inclusion of links to outside sources of information, translation of the text, and so on as part of generating the caption.
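The stages named in the abstract (identify insights, generate text, order it, process it for readability, output a caption) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the `InsightText` class, the two insight functions, the fixed scores, and the join-only readability step are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class InsightText:
    kind: str     # hypothetical insight label, e.g. "extreme" or "change"
    text: str     # sentence generated for the insight
    score: float  # hypothetical ranking score (higher is listed first)


def describe_extremes(labels, values):
    """Generate one sentence each for the maximum and minimum entries."""
    hi = max(range(len(values)), key=values.__getitem__)
    lo = min(range(len(values)), key=values.__getitem__)
    return [
        InsightText("extreme", f"The highest value was {values[hi]} in {labels[hi]}.", 2.0),
        InsightText("extreme", f"The lowest value was {values[lo]} in {labels[lo]}.", 1.0),
    ]


def describe_change(labels, values):
    """Generate a sentence for the change between the first and last entries."""
    delta = values[-1] - values[0]
    if delta == 0:
        text = f"Values were unchanged between {labels[0]} and {labels[-1]}."
    else:
        verb = "rose" if delta > 0 else "fell"
        text = f"Values {verb} by {abs(delta)} from {labels[0]} to {labels[-1]}."
    return InsightText("change", text, 3.0)


def make_caption(labels, values):
    """Run the sketch pipeline: insights -> text -> ordering -> caption."""
    texts = describe_extremes(labels, values) + [describe_change(labels, values)]
    ordered = sorted(texts, key=lambda t: t.score, reverse=True)
    # Placeholder readability step (an assumption): simply join the sentences.
    return " ".join(t.text for t in ordered)


caption = make_caption(["Jan", "Feb", "Mar"], [10, 25, 18])
print(caption)
```

In this sketch the scores are hard-coded per insight type; the abstract says only that the text is ordered, so any real scoring rule would need to come from the full description.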

First claim

Opening claim text (preview).

1. In a digital medium automated caption generation environment, a method implemented by a computing device, the method comprising: generating, by the computing device automatically and without user intervention, a caption that textually describes a dataset having a plurality of data entries organized as a plurality of data subsets, the generating including: determining which datatypes are included in the plurality of data subsets, respectively; identifying a composition of the dataset based the datatypes; determining which data insights correspond to the composition; generating text, based on the determined data insights, from the plurality of data entries of the dataset; forming the caption based at least in part on the text.
2. The method as described in claim 1, wherein the forming includes: generating scores based on the text generated for the data insights; and ranking the text generated for the data insights based on the scores.
3. The method as described in claim 2, wherein the forming of the caption includes ordering the text based on the ranking.
4. The method as described in claim 2, wherein the scores quantify the text corresponding to the data insights based on degrees of specificity.
5. The method as described in claim 1, wherein the plurality of datatypes includes quantitative, nominal, ordinal, temporal, or semantic.
6. The method as described in claim 1, wherein the data insights include anomaly, cyclic pattern, derived value, relative value, threshold amount of change, or extremes based on a minimum amount or a maximum amount.
7. The method as described in claim 1, wherein the forming of the caption includes adjusting language complexity of the text.
8. The method as described in claim 1, wherein the forming of the caption includes editing text generated for a first said data insight based on text generated for a second said data insight as part of the caption.
9. The method as described in claim 1, wherein the forming of the caption includes generating a link included as part of the caption, the link generated based on at least a portion of the text and is user selectable to navigate to a network address.
10. The method as described in claim 1, wherein the identifying of the composition is based on which combination of the datatypes is included in the dataset.
11. The method as described in claim 10, wherein the composition is: temporal based on inclusion of a temporal datatype and a quantitative datatype as part of the datatypes of the plurality of data subsets; or segment comparison based on inclusion of a quantitative datatype and a quantitative datatype as part of the datatypes of the plurality of data subsets.
12. The method as described in claim 1, further comprising receiving a user input specifying the dataset via a user interface, the dataset including a portion of a table of a larger dataset in a user interface and the data subsets are configured as rows or columns of the table.
13. In a digital medium automated caption generation environment, a system comprising: a dataset input module implemented at least partially in hardware of a computing device to receive a dataset having a plurality of data entries: a text generation module implemented at least partially in hardware of the computing device to generate text based on a plurality of data insights from the plurality of data entries of the dataset; and a caption formation module implemented at least partially in hardware of the computing device to generate a caption based on the text, the caption formation module including: a score generation module to generate scores corresponding to the data insights, respectively; a ranking module configured to rank the text based on the scores corresponding to respective said data insights; and a text ordering module configured to order the text as part of the caption based on respective said scores.
14. The system as described in claim 13, wherein the scores quantify the text based on degrees of specificity.
15. The system as described in claim 13, wherein the caption formation module further comprises a complexity adjustment module configured to adjust language complexity of the text as part of the caption.
16. The system as described in claim 13, wherein the caption formation module further comprises a readability module to edit the text generated for a first said data insight based on text generated for a second said data insight.
17. The system as described in claim 13, wherein the caption formation module further comprises a readability module to edit the text for safety.
18. The system as described in claim 13, wherein the caption formation module further comprises: a link generation module configured to generate a link as part of the caption, the link generated based on at least a portion of the text and is user selectable to navigate to a network address; and a translation module configured to translate the text.
19. In a digital medium automated caption generation environment, a system comprising: means for generating, automatically and without user intervention, a caption that textually describes a dataset having a plurality of data entries, the generating means including: means for receiving a dataset having a plurality of data entries: means for generating text based on a plurality of data insights from the plurality of data entries of the dataset; means for ordering the text based on a ranking; and means for editing the ordered text for readability such that text generated for a first said data insight is edited based on text generated for a second said data insight.
20. The system as described in claim 19, further comprising: means for adjusting language complexity of the text as part of the caption; means for checking safety of the text as part of the caption; means for translating the text as part of the caption; or means for generating a link included as part of the caption, the link generated based on at least a portion of the text and is user selectable to navigate to a network address.
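Claims 1, 10, and 11 describe determining the datatype of each data subset and then identifying a composition from the combination of datatypes. The following is a rough sketch of that idea, assuming each data subset is a table column held as a Python list. The function names, the type-checking heuristics, and the `"unclassified"` fallback are assumptions made for illustration; the segment-comparison rule follows the wording of claim 11 as printed on this page.

```python
from datetime import date


def infer_datatype(values):
    """Guess a column's datatype with simple heuristics (illustrative only)."""
    if not values:
        return "nominal"
    if all(isinstance(v, date) for v in values):
        return "temporal"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "nominal"


def identify_composition(columns):
    """Map the combination of column datatypes to a composition label."""
    datatypes = [infer_datatype(values) for values in columns.values()]
    if "temporal" in datatypes and "quantitative" in datatypes:
        return "temporal"
    if datatypes.count("quantitative") >= 2:
        # Two quantitative columns, per the claim 11 text shown above.
        return "segment comparison"
    return "unclassified"  # hypothetical fallback, not named in the claims


table = {
    "day": [date(2020, 11, 10), date(2020, 11, 11)],
    "sales": [3, 5],
}
print(identify_composition(table))
```

The ordinal and semantic datatypes listed in claim 5 are omitted here because detecting them would need rules the claims do not spell out.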

Assignees

Adobe Inc

Inventors

Classifications

G06F40/58
Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation · CPC title
G06F40/295
Named entity recognition · CPC title
G06F40/169
Annotation, e.g. comment data or footnotes · CPC title
G06F40/56 (Primary)
Natural language generation · CPC title
G06F40/216 (Primary)
using statistical methods · CPC title
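A CPC symbol such as G06F40/56 encodes a hierarchy: a section letter, a two-digit class, a subclass letter, a main group, and a subgroup. Section G is titled "Physics" in the CPC scheme, which is why this page maps the patent to that technology area. The sketch below splits a symbol into those parts; the `parse_cpc` function name, the regular expression, and the shortened section titles are assumptions for illustration, not part of any official tool.

```python
import re

# Short forms of the CPC section titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# Section letter, two-digit class, subclass letter, main group, "/", subgroup.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol):
    """Split a CPC symbol like 'G06F40/56' into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


print(parse_cpc("G06F40/56"))
```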

Patent family

Related publications grouped by family.

View patent family 81453437

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022147708A1 cover? A dataset captioning system is described that generates captions of text to describe insights identified from a dataset, automatically and without user intervention. To do so, given an input of a dataset the dataset captioning system determines which data insights are likely to support potential visualizations of the dataset, generates text based on these insights, orders the text, processes th…
Who is the assignee on this patent? Adobe Inc
What technology area does this patent fall under? The primary CPC classification is G06F40/56. Mapped technology areas include Physics.
When was this patent published? The publication date is May 12, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Descriptive insight generation and presentation system

Data reporting system and method

Associating insights with data

Automatic recognition and insights of data