Automatic lot classification

US12411884B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12411884-B2
Application number	US-202418641994-A
Country	US
Kind code	B2
Filing date	Apr 22, 2024
Priority date	Mar 8, 2018
Publication date	Sep 9, 2025
Grant date	Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and media for lot classification are disclosed. In one example, a classification system for identifying lot listings receives a description for a listing in a publication system, identifies a string in the listing, identifies a quantity word or digit in the string, and converts an identified quantity word into digit form. A normalized string is tokenized to produce tokens, the tokenizing of the normalized string including splitting the normalized string into a series of substrings using a sequence of delimiters. For each substring, an additional split is performed by separating any digit from any other adjacent character, unless that character is another digit, and maintaining an internal character order of each split substring to produce a flattened list of tokenized tokens.
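
The normalization and tokenization steps in the abstract can be sketched roughly as follows. This is an illustrative, rule-based sketch and not the patented implementation: the `QUANTITY_WORDS` table and the function names `normalize` and `tokenize` are hypothetical, whitespace is assumed to be the only delimiter, and a regular expression performs the digit/non-digit split.

```python
import re

# Hypothetical quantity-word table; this page does not list the words
# the patented system actually recognizes.
QUANTITY_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}


def normalize(title: str) -> str:
    """Lowercase the title and rewrite whole quantity words as digits."""
    words = title.lower().split()
    return " ".join(QUANTITY_WORDS.get(word, word) for word in words)


def tokenize(normalized: str) -> list[str]:
    """Split on whitespace runs, then separate digit runs from non-digit runs."""
    tokens: list[str] = []
    for substring in normalized.split():
        # \d+ matches a run of digits and \D+ a run of non-digits.
        # findall returns matches left to right, so the internal
        # character order of each substring is preserved.
        tokens.extend(re.findall(r"\d+|\D+", substring))
    return tokens  # already a flat list


print(tokenize(normalize("Lot of Five 10pk AA batteries")))
```

Adjacent digits stay together (`10` is one token), while a digit next to any other character is split off (`10pk` becomes `10` and `pk`), which matches the rule stated in the abstract.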

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented tokenization method comprising: receiving a normalized title string; tokenizing the normalized title string by: splitting, by one or more processors, the normalized title string into a plurality of substrings using a sequence of whitespaces as a delimiter; for each substring of the plurality of substrings, performing an additional split in each substring of the plurality of substrings where a digit character is separated from a non-digit character to create split substrings for each of the substrings of the plurality of substrings to create a plurality of tokens for the normalized title string, an individual substring of the plurality of substrings comprising the digit character adjacent to the non-digit character without the delimiter between the digit character and the non-digit character, the performing the additional split comprising processing the individual substring by a machine learning model to split the digit character from the non-digit character to create the two tokens of the plurality of tokens; and maintaining an order of the split substrings; and creating a flattened list of the plurality of tokens for the normalized title string.

2. The tokenization method of claim 1, wherein the normalized title string is normalized from a non-normalized text string and the method includes normalizing the non-normalized text string by: converting non-digit characters that are uppercase non-digit characters to lowercase non-digit characters; determining that one of the non-digit characters of the non-normalized text string corresponds to quantity words; and converting the non-digit characters that correspond to quantity words to digit characters.

3. The tokenization method of claim 2, wherein others of the non-digit characters of the non-normalized text string correspond to an item title in a listing for an item.

4. The tokenization method of claim 1, further comprising assigning a probability to a token of the plurality of tokens, the probability being indicative of a lot quantity, the machine learning model trained by performing training operations comprising: receiving a training set of listing titles having assigned lot size values greater than one; for each listing title in the training set: preprocessing the listing title to identify numerical tokens; computing a feature vector for each numerical token; assigning a positive label to the numerical token if the numerical token equals the assigned lot size value for the listing title; assigning a negative label to the numerical token if the numerical token does not equal the assigned lot size value for the listing title; and training a logistic regression binary classifier using the computed feature vectors and assigned labels to generate a trained model for identifying lot quantities.

5. The tokenization method of claim 4, wherein the feature vector includes one or more of: a token after vector indicating a token following the numerical token, a bigram after vector, a token before vector indicating a token preceding the numerical token, a bigram before vector, a unit of measure vector, a token position ratio indicating a ratio of the numerical token's position to a length of the listing title, and a token divisibility vector, and wherein the probability is based on a position of the token in the normalized title string and the method further comprises classifying a listing associated with the normalized title string as a lot listing based on the probability.

6. The tokenization method of claim 4, wherein the order is an internalized order of the split substrings.

7. The tokenization method of claim 1, wherein performing the additional split in each substring of the plurality of substrings comprises separating a character from an adjacent character based on a difference between the character and the adjacent character.

8. A system, comprising: at least one processor; and a memory device storing instructions which, when executed by the at least one processor, causes the system to perform operations comprising: receiving a normalized title string; tokenizing the normalized title string by: splitting the normalized title string into a plurality of substrings using a sequence of whitespaces as a delimiter; for each substring of the plurality of substrings, performing an additional split in each substring of the plurality of substrings where a digit character is separated from a non-digit character to create split substrings for each of the substrings of the plurality of substrings to create a plurality of tokens for the normalized title string, an individual substring of the plurality of substrings comprising the digit character adjacent to the non-digit character without the delimiter between the digit character and the non-digit character, the performing the additional split comprising processing the individual substring by a machine learning model to split the digit character from the non-digit character to create the two tokens of the plurality of tokens; and maintaining an order of the split substrings; and creating a flattened list of the plurality of tokens for the normalized title string.

9. The system of claim 8, wherein the normalized title string is normalized from a non-normalized text string and the processor, when executing the instructions, causes the system to perform operations comprising: converting non-digit characters that are uppercase non-digit characters to lowercase non-digit characters; determining that one of the non-digit characters of the non-normalized text string corresponds to quantity words; and converting the non-digit characters that correspond to quantity words to digit characters.

10. The system of claim 9, wherein others of the non-digit characters of the non-normalized text string correspond to an item title in a listing for an item.

11. The system of claim 8, the processor, when executing the instructions, causes the system to perform operations comprising assigning a probability to a token of the plurality of tokens, the probability being indicative of a lot quantity.

12. The system of claim 11, wherein the probability is based on a position of the token in the normalized title string and the processor, when executing the instructions, causes the system to perform operations comprising classifying a listing associated with the normalized title string as a lot listing based on the probability.

13. The system of claim 11, wherein the order is an internalized order of the split substrings.

14. The system of claim 8, wherein when performing the additional split in each substring of the plurality of substrings the processor, when executing the instructions, causes the system to perform operations comprising separating a character from an adjacent character based on a difference between the character and the adjacent character.

15. A non-transitory computer-readable medium comprising instructions which, when read by a machine, cause the machine to perform operations comprising: receiving a normalized title string; tokenizing the normalized title string by: splitting the normalized title string into a plurality of substrings using a sequence of whitespaces as a delimiter; for each substring of the plurality of substrings, performing an additional split in each substring of the plurality of substrings where a digit character is separated from a non-digit character to create split substrings for each of the substrings of the plurality of substrings to create a plurality of tokens for the normalized title string, an individual substring of the pl
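
The labeling rule in claim 4 and a few of the features named in claim 5 might look like the sketch below. It is only an illustration under stated assumptions: the function name `label_numeric_tokens` and the feature keys are hypothetical, the title is assumed to be already tokenized, and the position ratio is assumed to be measured in tokens rather than characters, which the claim text does not specify.

```python
def label_numeric_tokens(tokens, lot_size):
    """Return one (features, label) pair per numeric token in a tokenized title.

    label is 1 when the numeric token equals the lot size assigned to the
    listing title, and 0 otherwise (the positive/negative rule in claim 4).
    """
    examples = []
    for i, token in enumerate(tokens):
        if not token.isdecimal():
            continue  # only numeric tokens become training examples
        features = {
            # Neighboring tokens; None marks the start or end of the title.
            "token_before": tokens[i - 1] if i > 0 else None,
            "token_after": tokens[i + 1] if i + 1 < len(tokens) else None,
            # Assumption: position and title length are counted in tokens.
            "position_ratio": i / len(tokens),
        }
        label = 1 if int(token) == lot_size else 0
        examples.append((features, label))
    return examples


title_tokens = ["lot", "of", "5", "10", "pk", "aa", "batteries"]
for features, label in label_numeric_tokens(title_tokens, lot_size=5):
    print(label, features)
```

The resulting pairs would still need to be encoded as numeric vectors before being passed to a logistic regression implementation; that encoding, and the remaining features listed in claim 5, are left out of this sketch.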

Assignees

Ebay Inc

Inventors

Classifications

G06F16/35 · Primary
Clustering; Classification · CPC title
G06N20/00
Machine learning · CPC title
G06F40/151
Transformation · CPC title
G06F40/163
Handling of whitespace · CPC title
G06F40/279
Recognition of textual entities · CPC title

Patent family

Related publications grouped by family.

View patent family 65576731

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12411884B2 cover?: Methods, systems, and media for lot classification are disclosed. In one example, a classification system for identifying lot listings receives a description for a listing in a publication system, identifies a string in the listing, identifies a quantity word or digit in the string, and converts an identified quantity word into digit form. A normalized string is tokenized to produce tokens, the…
Who is the assignee on this patent?: Ebay Inc
What technology area does this patent fall under?: Primary CPC classification G06F16/35. Mapped technology areas include Physics.
When was this patent published?: Publication date Sep 9, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Deep neural network-based relationship analysis with multi-feature token model

Generating text snippets using supervised machine learning algorithm

Snippet extractor: recurrent neural networks for text summarization at industry scale

Machine Learning System

System and method for topic extraction and opinion mining

Automatic item categorizer

System and method for providing automatic high-value listing feeds for online computer users