Systems, method, and computer storage medium for creating listing for items for sale in an electronic marketplace based on video analysis

US12417487B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12417487-B2
Application number  US-202117562423-A
Country             US
Kind code           B2
Filing date         Dec 27, 2021
Priority date       Dec 27, 2021
Publication date    Sep 16, 2025
Grant date          Sep 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for assisting users in listing items for sale in an electronic marketplace is disclosed. A video is received from a user device associated with a user, the video including a video stream depicting a plurality of items to be listed for sale in the electronic marketplace. Respective images depicting respective items among the plurality of items are obtained from the video stream, and respective attributes of the respective items among the plurality of items are extracted from the video. Respective listings for sale of the respective items are generated based at least in part on the respective attributes of the respective items among the plurality of items, and the respective listings for sale of the respective items are displayed to the user.
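The flow summarized in the abstract (a per-item image plus extracted attributes feeding a listing) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Listing` dataclass, the `extract_attributes` and `build_listings` helpers, the frame file names, and the `VOCAB` table are all hypothetical, and a plain keyword lookup stands in for the trained attribute-extraction model that the claims describe.

```python
from dataclasses import dataclass, field

# Hypothetical vocabulary mapping attribute categories to known values.
# A real system would use a trained model (the claims mention named entity
# recognition); this lookup is only a stand-in for illustration.
VOCAB = {
    "color": {"red", "blue", "black"},
    "brand": {"acme", "globex"},
    "condition": {"new", "used"},
}


@dataclass
class Listing:
    image: str                                  # frame that shows the item
    fields: dict = field(default_factory=dict)  # attribute category -> value


def extract_attributes(text: str) -> dict:
    """Return attribute category/value pairs found in a transcript snippet."""
    pairs = {}
    for word in text.lower().replace(",", " ").split():
        for category, values in VOCAB.items():
            # Keep the first value seen for each category.
            if word in values and category not in pairs:
                pairs[category] = word
    return pairs


def build_listings(segments):
    """Create one listing per (frame_id, transcript_text) segment."""
    return [Listing(image=frame, fields=extract_attributes(text))
            for frame, text in segments]


# Two per-item segments, as if already cut out of one video.
segments = [
    ("frame_0042.jpg", "This is a red Acme kettle, used"),
    ("frame_0310.jpg", "Next, a new black Globex lamp"),
]
listings = build_listings(segments)
for listing in listings:
    print(listing.image, listing.fields)
```

Each extracted category/value pair populates the matching field of the listing, which mirrors the "populate a corresponding field" language of the claims; how the segments themselves are found is a separate step.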

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a processor; and memory including instructions which, when executed by the processor, cause the processor to:

    receive a video from a user device associated with a user, the video including a video stream and an audio stream, the video stream depicting a plurality of items to be listed in an electronic marketplace, the plurality of items being separate items and including a first item and a second item;

    process the audio stream of the video using a first trained machine learning model to convert at least a portion of the audio stream to text;

    determine a first portion of the video associated with the first item by: performing motion detection analysis of the video stream to identify a first hovering of a camera at a first portion of the video stream corresponding to the first portion of the video, and analyzing the text from a first portion of the audio stream to identify the first item, the first portion of the audio stream corresponding to the first portion of the video;

    determine a second portion of the video associated with the second item by: performing motion detection analysis of the video stream to identify a second hovering of the camera at a second portion of the video stream corresponding to the second portion of the video, and analyzing the text from a second portion of the audio stream to identify the second item, the second portion of the audio stream corresponding to the second portion of the video;

    responsive to determining the first portion of the video and the second portion of the video, generate a first listing for the first item and a second listing for the second item by: extracting a first frame from the first portion of the video stream and a second frame from the second portion of the video stream, the first frame providing a first image of the first item and the second frame providing a second image of the second item, analyzing the text using a second trained machine learning model to extract a first attribute category and value pair of an attribute of the first item from text associated with the first portion of the video, and a second attribute category and value pair of an attribute of the second item from text associated with the second portion of the video, and generating the first listing for the first item and the second listing for the second item, the first listing generated to include the first image and using the first attribute category and value pair to populate a corresponding field of the first listing, and the second listing generated to include the second image and using the second attribute category and value pair to populate a corresponding field of the second listing; and

    generating an electronic store, the electronic store including the first listing and the second listing, and causing the electronic store to be displayed to at least one potential buyer.

2. The system of claim 1, wherein the first portion of the video associated with the first item is determined by: determining, based on the audio stream, a timestamp identifying a time, in the video stream, that depicts the first item among the plurality of items, and wherein the first image of the first item is extracted from the video stream based on the timestamp.

3. The system of claim 1, wherein the instructions, when executed by the processor, cause the processor to convert the audio stream of the video to text using at least one selected from the following: i) a general purpose speech recognition model, ii) an electronic commerce language aware speech recognition model, and iii) a model trained to boost hot words associated with a product category.

4. The system of claim 3, wherein the instructions, when executed by the processor, cause the processor to extract the first attribute category and value pair of the attribute of the first item at least by analyzing, using a named entity recognition model, the text associated with the first portion of the video corresponding to the first item.

5. The system of claim 1, wherein the instructions, when executed by the processor, further cause the processor to generate, based on a plurality of modalities descriptive of the first item, a vector representing the first item.

6. The system of claim 5, wherein the instructions, when executed by the processor, cause the processor to generate the vector by applying a trained multimodal model to: i) the first image of the first item; and ii) the first attribute category and value pair of the attribute of the first item.

7. The system of claim 6, wherein the instructions, when executed by the processor, cause the processor to: search, using the vector representing the first item, a product catalogue to find one or more similar items listed in the electronic marketplace, and wherein the first listing for the first item is generated by: extracting one or more attributes of the one or more similar items; and populating one or more corresponding fields of the first listing using the one or more attributes.

8. The system of claim 1, wherein the instructions, when executed by the processor, further cause the processor to generate an electronic marketplace store, the electronic marketplace store including the first listing and the second listing.

9. A method comprising:

    receiving a video from a user device associated with a user, the video including a video stream and an audio stream, the video stream depicting a plurality of items to be listed in an electronic marketplace, the plurality of items being separate items and including a first item and a second item;

    processing the audio stream of the video using a first trained machine learning model to convert at least a portion of the audio stream to text;

    determining a first portion of the video associated with the first item by: performing motion detection analysis of the video stream to identify a first hovering of a camera at a first portion of the video stream corresponding to the first portion of the video, and analyzing the text from a first portion of the audio stream to identify the first item, the first portion of the audio stream corresponding to the first portion of the video;

    determining a second portion of the video associated with the second item by: performing motion detection analysis of the video stream to identify a second hovering of the camera at a second portion of the video stream corresponding to the second portion of the video, and analyzing the text from a second portion of the audio stream to identify the second item, the second portion of the audio stream corresponding to the second portion of the video;

    responsive to determining the first portion of the video and the second portion of the video, generating a first listing for the first item and a second listing for the second item by: extracting a first frame from the first portion of the video stream and a second frame from the second portion of the video stream, the first frame providing a first image of the first item and the second frame providing a second image of the second item, analyzing the text using a second trained machine learning model to extract a first attribute category and value pair of an attribute of the first item from text associated with the first portion of the video, and a second attribute category and value pair of an attribute of the second item from text associated with the second portion of the video, and generating the first listing for the first item and the second listing for the second item, the first listing generated to include the first image and using the first attribute category and value pair to populate a corresponding field of the first listing, and the second listing generated to include the second image
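
The "hovering" step recited in claims 1 and 9 can be illustrated with a simple motion-score threshold. The sketch below assumes a per-frame motion score has already been computed (for example, a mean absolute difference between consecutive frames) and that the speech-to-text step yields word timestamps. The function names, the threshold, the minimum run length, the frame rate, and the sample data are hypothetical choices for illustration, not values from the patent. It finds runs of low motion, picks a middle frame from each run as the item image, and collects the transcript words spoken during that run.

```python
def find_hovers(motion, threshold=1.0, min_frames=3):
    """Return (start, end) frame ranges (end exclusive) where the motion
    score stays below `threshold` for at least `min_frames` frames."""
    hovers, start = [], None
    for i, score in enumerate(motion):
        if score < threshold:
            if start is None:
                start = i                      # a low-motion run begins
        else:
            if start is not None and i - start >= min_frames:
                hovers.append((start, i))      # run was long enough
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(motion) - start >= min_frames:
        hovers.append((start, len(motion)))
    return hovers


def words_in_range(words, start_s, end_s):
    """Join transcript words whose timestamp (seconds) is in [start_s, end_s)."""
    return " ".join(w for t, w in words if start_s <= t < end_s)


FPS = 2  # assumed frame rate of the sampled motion scores
motion = [5.0, 0.2, 0.1, 0.3, 0.2, 6.0, 7.0, 0.4, 0.1, 0.2, 4.0]
words = [(0.6, "red"), (1.2, "kettle"), (3.0, "and"),
         (3.8, "black"), (4.4, "lamp")]

segments = []
for start, end in find_hovers(motion):
    key_frame = (start + end) // 2             # middle frame of the hover
    text = words_in_range(words, start / FPS, end / FPS)
    segments.append((key_frame, text))
print(segments)  # [(3, 'red kettle'), (8, 'black lamp')]
```

With the sample data, the camera is treated as hovering over frames 1 to 4 and 7 to 9, so two segments come out, each pairing a key frame with the words spoken while the camera was still. A production system would likely derive motion scores from optical flow or frame differencing and tune the threshold empirically.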

Assignees

Ebay Inc

Inventors

Classifications

  • using intermediate agents

  • Extraction of image or video features

  • Named entity recognition

  • Advertisement creation

  • graphically representing goods, e.g. 3D product representation

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12417487B2 cover?
A system for assisting users in listing items for sale in an electronic marketplace is disclosed. A video is received from a user device associated with a user, the video including a video stream depicting a plurality of items to be listed for sale in the electronic marketplace. Respective images depicting respective items among the plurality of items are obtained from the video stream, and res…
Who is the assignee on this patent?
Ebay Inc
What technology area does this patent fall under?
Primary CPC classification G06Q30/0643. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 16, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).