Automatic lot classification

US12411884B2 · US · B2

Patent metadata
Publication number: US-12411884-B2
Application number: US-202418641994-A
Country: US
Kind code: B2
Filing date: Apr 22, 2024
Priority date: Mar 8, 2018
Publication date: Sep 9, 2025
Grant date: Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and media for lot classification are disclosed. In one example, a classification system for identifying lot listings receives a description for a listing in a publication system, identifies a string in the listing, identifies a quantity word or digit in the string, and converts an identified quantity word into digit form. A normalized string is tokenized to produce tokens, the tokenizing of the normalized string including splitting the normalized string into a series of substrings using a sequence of delimiters. For each substring, an additional split is performed by separating any digit from any other adjacent character, unless that character is another digit, and maintaining an internal character order of each split substring to produce a flattened list of tokenized tokens.
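
The normalization and tokenization steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `QUANTITY_WORDS` table, the function names, and the choice of whitespace as the only delimiter are assumptions made for the example, and the digit/non-digit split is done here with a plain regular expression.

```python
import re

# Hypothetical quantity-word table; the patent text does not enumerate the words.
QUANTITY_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}


def normalize(title):
    """Lowercase the title and rewrite known quantity words in digit form."""
    words = title.lower().split()
    return " ".join(QUANTITY_WORDS.get(word, word) for word in words)


def tokenize(normalized):
    """Split on whitespace runs, then separate digits from adjacent non-digits.

    Adjacent digits stay together, character order inside each substring is
    preserved, and the result is a single flat list of tokens.
    """
    tokens = []
    for substring in normalized.split():
        # \d+ matches a run of digits, \D+ a run of anything else.
        tokens.extend(re.findall(r"\d+|\D+", substring))
    return tokens


if __name__ == "__main__":
    print(tokenize(normalize("Lot of Two 10pk AA Batteries")))
```

Under these assumptions a substring such as `10pk` yields the two tokens `10` and `pk`, while substrings without digits pass through unchanged.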

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented tokenization method comprising: receiving a normalized title string; tokenizing the normalized title string by: splitting, by one or more processors, the normalized title string into a plurality of substrings using a sequence of whitespaces as a delimiter; for each substring of the plurality of substrings, performing an additional split in each substring of the plurality of substrings where a digit character is separated from a non-digit character to create split substrings for each of the substrings of the plurality of substrings to create a plurality of tokens for the normalized title string, an individual substring of the plurality of substrings comprising the digit character adjacent to the non-digit character without the delimiter between the digit character and the non-digit character, the performing the additional split comprising processing the individual substring by a machine learning model to split the digit character from the non-digit character to create the two tokens of the plurality of tokens; and maintaining an order of the split substrings; and creating a flattened list of the plurality of tokens for the normalized title string.

2. The tokenization method of claim 1, wherein the normalized title string is normalized from a non-normalized text string and the method includes normalizing the non-normalized text string by: converting non-digit characters that are uppercase non-digit characters to lowercase non-digit characters; determining that one of the non-digit characters of the non-normalized text string corresponds to quantity words; and converting the non-digit characters that correspond to quantity words to digit characters.

3. The tokenization method of claim 2, wherein others of the non-digit characters of the non-normalized text string correspond to an item title in a listing for an item.

4. The tokenization method of claim 1, further comprising assigning a probability to a token of the plurality of tokens, the probability being indicative of a lot quantity, the machine learning model trained by performing training operations comprising: receiving a training set of listing titles having assigned lot size values greater than one; for each listing title in the training set: preprocessing the listing title to identify numerical tokens; computing a feature vector for each numerical token; assigning a positive label to the numerical token if the numerical token equals the assigned lot size value for the listing title; assigning a negative label to the numerical token if the numerical token does not equal the assigned lot size value for the listing title; and training a logistic regression binary classifier using the computed feature vectors and assigned labels to generate a trained model for identifying lot quantities.

5. The tokenization method of claim 4, wherein the feature vector includes one or more of: a token after vector indicating a token following the numerical token, a bigram after vector, a token before vector indicating a token preceding the numerical token, a bigram before vector, a unit of measure vector, a token position ratio indicating a ratio of the numerical token's position to a length of the listing title, and a token divisibility vector, and wherein the probability is based on a position of the token in the normalized title string and the method further comprises classifying a listing associated with the normalized title string as a lot listing based on the probability.

6. The tokenization method of claim 4, wherein the order is an internalized order of the split substrings.

7. The tokenization method of claim 1, wherein performing the additional split in each substring of the plurality of substrings comprises separating a character from an adjacent character based on a difference between the character and the adjacent character.
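
The labeling and training operations recited in claims 4 and 5 can be illustrated with a small sketch. It is a simplified assumption-laden example rather than the claimed system: only one of the listed features (the token position ratio) is used alongside a bias term, the logistic regression is fitted with plain stochastic gradient ascent, and the function names, toy titles, learning rate, and the 0.5 decision threshold are all hypothetical.

```python
import math


def label_examples(tokens, lot_size):
    """Build one (features, label) pair per numerical token in a title.

    Features: [bias, token position ratio]. The label is 1 when the token
    equals the assigned lot size for the title, otherwise 0.
    """
    examples = []
    for index, token in enumerate(tokens):
        if token.isdecimal():
            features = [1.0, index / len(tokens)]
            label = 1 if int(token) == lot_size else 0
            examples.append((features, label))
    return examples


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lot_probability(weights, features):
    """Probability that a numerical token is the lot quantity."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def train(examples, epochs=500, learning_rate=0.5):
    """Fit a logistic regression binary classifier by stochastic gradient ascent."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, label in examples:
            error = label - lot_probability(weights, features)
            for j, x in enumerate(features):
                weights[j] += learning_rate * error * x
    return weights


def is_lot_listing(probabilities, threshold=0.5):
    """Classify a listing as a lot if any token clears the (assumed) threshold."""
    return any(p >= threshold for p in probabilities)


if __name__ == "__main__":
    # Toy training titles (already tokenized) with assigned lot sizes.
    data = label_examples(["lot", "of", "5", "iphone", "6", "cases"], 5)
    data += label_examples(["3", "pack", "usb", "2", "cable"], 3)
    weights = train(data)
    print(weights)
```

In this toy data the lot quantity appears earlier in the title than the other numbers, so the fitted weight on the position ratio comes out negative. A fuller version would add the remaining features from claim 5, such as neighboring tokens, bigrams, and units of measure.
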
8. A system, comprising: at least one processor; and a memory device storing instructions which, when executed by the at least one processor, causes the system to perform operations comprising: receiving a normalized title string; tokenizing the normalized title string by: splitting the normalized title string into a plurality of substrings using a sequence of whitespaces as a delimiter; for each substring of the plurality of substrings, performing an additional split in each substring of the plurality of substrings where a digit character is separated from a non-digit character to create split substrings for each of the substrings of the plurality of substrings to create a plurality of tokens for the normalized title string, an individual substring of the plurality of substrings comprising the digit character adjacent to the non-digit character without the delimiter between the digit character and the non-digit character, the performing the additional split comprising processing the individual substring by a machine learning model to split the digit character from the non-digit character to create the two tokens of the plurality of tokens; and maintaining an order of the split substrings; and creating a flattened list of the plurality of tokens for the normalized title string.

9. The system of claim 8, wherein the normalized title string is normalized from a non-normalized text string and the processor, when executing the instructions, causes the system to perform operations comprising: converting non-digit characters that are uppercase non-digit characters to lowercase non-digit characters; determining that one of the non-digit characters of the non-normalized text string corresponds to quantity words; and converting the non-digit characters that correspond to quantity words to digit characters.

10. The system of claim 9, wherein others of the non-digit characters of the non-normalized text string correspond to an item title in a listing for an item.

11. The system of claim 8, the processor, when executing the instructions, causes the system to perform operations comprising assigning a probability to a token of the plurality of tokens, the probability being indicative of a lot quantity.

12. The system of claim 11, wherein the probability is based on a position of the token in the normalized title string and the processor, when executing the instructions, causes the system to perform operations comprising classifying a listing associated with the normalized title string as a lot listing based on the probability.

13. The system of claim 11, wherein the order is an internalized order of the split substrings.

14. The system of claim 8, wherein when performing the additional split in each substring of the plurality of substrings the processor, when executing the instructions, causes the system to perform operations comprising separating a character from an adjacent character based on a difference between the character and the adjacent character.

15. A non-transitory computer-readable medium comprising instructions which, when read by a machine, cause the machine to perform operations comprising: receiving a normalized title string; tokenizing the normalized title string by: splitting the normalized title string into a plurality of substrings using a sequence of whitespaces as a delimiter; for each substring of the plurality of substrings, performing an additional split in each substring of the plurality of substrings where a digit character is separated from a non-digit character to create split substrings for each of the substrings of the plurality of substrings to create a plurality of tokens for the normalized title string, an individual substring of the pl

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12411884B2 cover?
Methods, systems, and media for lot classification are disclosed. In one example, a classification system for identifying lot listings receives a description for a listing in a publication system, identifies a string in the listing, identifies a quantity word or digit in the string, and converts an identified quantity word into digit form. A normalized string is tokenized to produce tokens, the…
Who is the assignee on this patent?
eBay Inc.
What technology area does this patent fall under?
Primary CPC classification G06F16/35. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 9, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).