Automated generation of pre-labeled training data
US-2018189951-A1 · Jul 5, 2018 · US
US12346370B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12346370-B2 |
| Application number | US-201816036224-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 16, 2018 |
| Priority date | Jul 16, 2018 |
| Publication date | Jul 1, 2025 |
| Grant date | Jul 1, 2025 |
Official abstract text for this publication.
Representative embodiments disclose mechanisms to perform visual intent classification or visual intent detection or both on an image. Visual intent classification utilizes a trained machine learning model that classifies subjects in the image according to a classification taxonomy. The visual intent classification can be used as a pre-triggering mechanism to initiate further action in order to substantially save processing time. Example further actions include user scenarios, query formulation, user experience enhancement, and so forth. Visual intent detection utilizes a trained machine learning model to identify subjects in an image, place a bounding box around the image, and classify the subject according to the taxonomy. The trained machine learning model utilizes multiple feature detectors, multi-layer predictions, multilabel classifiers, and bounding box regression.
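The pre-triggering idea in the abstract can be illustrated with a minimal sketch: classification scores gate whether any further action is started at all. The label names, the action names, the `ACTION_BY_LABEL` table, the `pre_trigger` function, and the 0.5 threshold are all hypothetical and do not come from the patent; the sketch only shows the general shape of using a classifier's output as a cheap gate.

```python
from typing import Dict, Optional

# Hypothetical taxonomy labels mapped to hypothetical follow-up actions.
# None of these names appear in the patent text.
ACTION_BY_LABEL: Dict[str, str] = {
    "animal": "visual_search",
    "landmark": "web_search",
    "barcode": "product_lookup",
}


def pre_trigger(label_scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the follow-up action for the best-scoring eligible label.

    A label is eligible when its score reaches the (assumed) threshold and
    it has a mapped action. Returns None when nothing is eligible, so the
    caller can skip the more expensive downstream processing entirely.
    """
    eligible = [
        (score, label)
        for label, score in label_scores.items()
        if score >= threshold and label in ACTION_BY_LABEL
    ]
    if not eligible:
        return None
    _, best_label = max(eligible)  # highest score wins
    return ACTION_BY_LABEL[best_label]
```

Returning `None` for low-confidence or unmapped labels is what would save processing time in this sketch: the downstream action never starts unless the classifier output justifies it.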
Opening claim text (preview).
What is claimed is:

1. A computer implemented method, comprising: receiving an image as a query at a computer-implemented search engine, wherein the image includes an object; in response to receiving the image as the query, submitting the image to a multilabel classifier of the computer-implemented search engine, where the multilabel classifier is configured to: identify a plurality of objects in the image; place bounding boxes in the image, where each of the bounding boxes substantially bounds a corresponding object; and assign at least one classification label to each bounding box to identify the corresponding objects in the images; passing at least one classification label and an associated bounding box to a trained suppression model of the computer-implemented search engine, the trained suppression model computing scores for the bounding box and suppressing at least one classification label along with its associated bounding box based upon the scores; based on an unsuppressed classification label, selecting, by the computer-implemented search engine, a user intent scenario from amongst a predefined set of user intent scenarios, wherein the user intent scenario in the predefined set of user intent scenarios is selectable due to the classification label being assigned to the user intent scenario; generating, by the computer-implemented search engine, a query suggestion for review by a user who issued the image as the query, wherein the query suggestion is generated based upon the selected user intent scenario; subsequent to generating the query suggestion, receiving, by the computer-implemented search engine, an indication that the query suggestion has been selected by the user; and providing, by the computer-implemented search engine, output that is based upon the user intent scenario.

2. The method of claim 1 wherein the multilabel classifier comprises a MobileNet backbone trained using an error function comprising two multilabel classification losses, a first multilabel classification loss being a multilabel elementwise sigmoid loss and a second multilabel classification loss being a multilabel softmax loss.

3. The method of claim 1 wherein the multilabel classifier is trained using a cross-entropy loss given by $E = -\frac{1}{n} \sum_{n=1}^{N} \left[ p_n \log \hat{p}_n + (1 - p_n) \log (1 - \hat{p}_n) \right]$.

4. The method of claim 1 wherein the user intent scenario is a visual search, the method further comprising outputting images that have the classification label assigned thereto.

5. The method of claim 1 wherein multiple classification labels for the image are received from the multilabel classifier, the method further comprising selecting the classification label from amongst the multiple classification labels based upon a determination that the object is of interest to the user.

6. The method of claim 1 wherein the user intent scenario is performance of a search, wherein a query that includes the classification is constructed, the method further comprising outputting search results identified based upon the query.

7. A computing system comprising: a processor; and memory storing instructions that, when executed by the processor, cause the processor to perform acts comprising: receiving an image as a query by a computer-implemented search engine, wherein the image includes an object; in response to receiving the image as the query, submitting the image to a multilabel classifier, where the multilabel classifier is configured to: identify a plurality of objects in the image; place bounding boxes in the image, where each of the bounding boxes substantially bounds a corresponding object; and assign at least one classification label to each bounding box to identify the corresponding objects in the images; passing classification labels and associated bounding boxes to a trained suppression model, where the trained suppression model computes scores for the bounding boxes and the associated classification labels and suppresses a classification label and its associated bounding box based upon the scores; based on an unsuppressed classification label, selecting a user intent scenario from amongst a predefined set of user intent scenarios, wherein the user intent scenario in the predefined set of user intent scenarios is selectable due to the unsuppressed classification label being assigned to the user intent scenario; generating a query suggestion for review by a user who issued the image as the query, wherein the query suggestion is generated based upon the selected user intent scenario; subsequent to generating the query suggestion, receiving, by the search engine, an indication that the query suggestion has been selected by the user; and providing, by the search engine, output that is based upon the user intent scenario.

8. The computing system of claim 7 wherein the multilabel classifier comprises a MobileNet backbone trained using an error function comprising two multilabel classification losses, a first multilabel classification loss being a multilabel elementwise sigmoid loss and a second multilabel classification loss being a multilabel sof
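The cross-entropy loss recited in claim 3 can be worked through numerically. The sketch below is one plausible reading rather than the patent's implementation: it assumes each p_n is a ground-truth label in {0, 1}, each p̂_n is the predicted probability for that label, and the normalizer is N, the number of terms (the claim text writes 1/n, so this choice is an assumption). The function name `sigmoid_cross_entropy` and the `eps` clamp are illustrative additions.

```python
import math
from typing import Sequence


def sigmoid_cross_entropy(
    targets: Sequence[float],
    predictions: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Mean binary cross-entropy over N label/prediction pairs.

    targets:     ground-truth values p_n (assumed to be 0 or 1)
    predictions: predicted probabilities p_hat_n in [0, 1]
    eps:         clamp that keeps log() away from 0 (an added safeguard,
                 not part of the claimed formula)
    """
    if len(targets) != len(predictions) or len(targets) == 0:
        raise ValueError("targets and predictions must be non-empty and equal length")

    total = 0.0
    for p, p_hat in zip(targets, predictions):
        p_hat = min(max(p_hat, eps), 1.0 - eps)
        total += p * math.log(p_hat) + (1.0 - p) * math.log(1.0 - p_hat)

    # Assumption: normalize by N, the number of summed terms.
    return -total / len(targets)
```

With two labels and an uninformative prediction of 0.5 for each, every term contributes log 0.5, so the loss equals ln 2; confident correct predictions drive it toward zero.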
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Classification techniques · CPC title
into predefined classes · CPC title
Learning methods · CPC title