Using unsupervised machine learning to identify attribute values as related to an input

US12333501B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12333501-B2
Application number: US-202217935060-A
Country: US
Kind code: B2
Filing date: Sep 23, 2022
Priority date: Sep 23, 2022
Publication date: Jun 17, 2025
Grant date: Jun 17, 2025

Abstract

Official abstract text for this publication.

Technologies for skill taxonomy management are described. Embodiments include extracting an input text from an online system and applying an unsupervised generative text machine learning model to the input text. The text generator generates a set of sentences based on a job title included in the input text. One or more skills are extracted from the set of sentences. The extracted one or more skills correspond to one or more skills in a skill taxonomy. A frequency distribution is generated over the extracted one or more skills. The one or more skills are ranked based on the frequency distribution. Based on the ranking, a subset of the extracted one or more skills is generated. The subset of the extracted one or more skills is provided to a downstream operation, process, or service of the online system.
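The counting and ranking steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the taxonomy contents, the sample sentences (standing in for the output of the generative model), and the helper names `extract_skills` and `top_skills` are all hypothetical, and the phrase matching is a simple stand-in for whatever skill extractor an embodiment would use.

```python
import re
from collections import Counter

# Hypothetical taxonomy; the abstract does not list concrete skills.
SKILL_TAXONOMY = {"python", "sql", "machine learning", "statistics"}


def extract_skills(sentences, taxonomy):
    """Return one entry per (sentence, taxonomy skill) match, repeats included."""
    found = []
    for sentence in sentences:
        text = sentence.lower()
        for skill in taxonomy:
            # Whole-phrase match so that "sql" does not match inside "mysql".
            pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
            if re.search(pattern, text):
                found.append(skill)
    return found


def top_skills(sentences, taxonomy, k):
    """Build a frequency distribution over extracted skills and keep the top k."""
    counts = Counter(extract_skills(sentences, taxonomy))
    # Rank by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [skill for skill, _ in ranked[:k]]


# Assumed stand-in for sentences generated from the job title "data scientist".
generated = [
    "A data scientist uses Python and SQL every day.",
    "Data scientists need strong statistics and Python skills.",
    "A data scientist builds machine learning models in Python.",
]

subset = top_skills(generated, SKILL_TAXONOMY, k=2)
print(subset)
```

Here each sentence contributes at most one count per skill, which is one possible reading of "frequency distribution"; counting every occurrence within a sentence would be an equally plausible choice.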

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: extracting an input text from an online system, the input text comprising a job title; applying an unsupervised generative text machine learning model to the input text; generating, by the generative text machine learning model, a plurality of sentences based on the job title; extracting one or more skills from the plurality of sentences, wherein the extracted one or more skills correspond to one or more skills in a skill taxonomy; generating a frequency distribution over the extracted one or more skills; ranking each skill of the extracted one or more skills based on the frequency distribution; comparing the frequency distribution to a threshold skill distribution; in response to determining that the frequency distribution does not satisfy the threshold skill distribution, generating additional sentences from the input text; generating an additional frequency distribution using the plurality of sentences and the additional sentences; comparing the additional frequency distribution to the threshold skill distribution; in response to determining that the additional frequency distribution satisfies the threshold skill distribution, ranking each skill of the extracted one or more skills based on the additional frequency distribution; generating a subset of the extracted one or more skills based on the ranking and a threshold number of skills; and providing the subset of the extracted one or more skills to a downstream operation, process, or service of the online system.

2. The method of claim 1, wherein extracting the one or more skills from the plurality of sentences comprises: identifying one or more skills in the plurality of sentences that do not correspond to the one or more skills in the skill taxonomy; and adding the identified one or more skills to the skill taxonomy.

3. The method of claim 1, further comprising generating a skill recommendation based on comparing the subset of the extracted one or more skills with a set of skills identified in an entity profile.

4. The method of claim 1, further comprising training the generative text machine learning model by applying unsupervised machine learning to a domain-independent set of unlabeled and unstructured training data.

5. The method of claim 1, wherein extracting the input text from the online system comprises: receiving an input seed phrase from a user system; and determining the input text based on the input seed phrase.

6. The method of claim 1, further comprising training the generative text machine learning model as a causal text generator that generates a set of sentences from a seed word or a job title.

7. A method comprising: extracting, using a string search, an input text from an online system, the input text comprising a job title; applying an unsupervised generative text machine learning model to the input text; generating, by the generative text machine learning model, a plurality of sentences from the job title; extracting one or more skills from the plurality of sentences, wherein the extracted one or more skills correspond to one or more skills in a skill taxonomy; generating a frequency distribution over the extracted one or more skills, the frequency distribution generated by aggregating a number of occurrences for each skill of the extracted one or more skills; ranking each skill of the extracted one or more skills based on the frequency distribution; comparing the frequency distribution to a threshold skill distribution; in response to determining that the frequency distribution does not satisfy the threshold skill distribution, generating additional sentences from the input text; generating an additional frequency distribution using the plurality of sentences and the additional sentences; comparing the additional frequency distribution to the threshold skill distribution; in response to determining that the additional frequency distribution satisfies the threshold skill distribution, ranking each skill of the extracted one or more skills based on the additional frequency distribution; generating a subset of the extracted one or more skills by selecting the subset using the ranking and a threshold number of skills, wherein the threshold number of skills define the number of skills in the subset; and providing the subset of the extracted one or more skills to a downstream operation, process, or service of the online system.

8. The method of claim 7, wherein extracting the one or more skills from the plurality of sentences comprises: identifying one or more skills that do not correspond to one or more skills in the skill taxonomy; and adding the identified one or more skills to the skill taxonomy.

9. The method of claim 7, further comprising generating a recommended skill based on comparing the subset of the extracted one or more skills with a set of skills identified in an entity profile.

10. The method of claim 7 further comprising configuring the generative text machine learning model as a causal text generator that generates a set of sentences from a seed word or a job title.

11. The method of claim 7, wherein extracting the input text from the online system comprises: receiving an input seed phrase from a user system; and determining the input text from the input seed phrase.

12. The method of claim 7, further comprising training the generative text machine learning model by applying unsupervised machine learning to a domain-independent set of unlabeled and unstructured training data.

13. A system comprising: at least one memory device; and a processing device, operatively coupled to the at least one memory device, to: extract an input text from an online system, the input text comprising a job title; apply an unsupervised generative text machine learning model to the input text; generate, by the unsupervised generative text machine learning model, a plurality of sentences based on the job title; extract one or more skills from the plurality of sentences, wherein the extracted one or more skills correspond to one or more skills in a skill taxonomy; generate a frequency distribution over the extracted one or more skills; compare the frequency distribution to a threshold skill distribution; in response to determining that the frequency distribution does not satisfy the threshold skill distribution, generate additional sentences from the input text; generate an additional frequency distribution using the plurality of sentences and the additional sentences; compare the additional frequency distribution to the threshold skill distribution; and provide the extracted one or more skills to a downstream operation, process, or service of the online system.

14. The system of claim 13, wherein to extract the input text from the online system, the processing device is to: receive an input seed phrase from a user system; and determine the input text from the input seed phrase.

15. The system of claim 13, wherein the processing device trains the generative text machine learning model as a causal text generator that generates a set of sentences from a seed word or a job title.

16. The system of claim 13, wherein the processing device generates a skill recommendation based on comparing the extracted one or more skills with a set of skills identified in an entity profile.

17. The system of claim 13, wherein to extract the one or more skills from the plurality of sentences, the processing device is caused to: identify one or more skills that do not correspond to a skill in the skill tax
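
The compare-and-regenerate loop recited in the independent claims can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the claim text shown here does not define what it means for a frequency distribution to "satisfy the threshold skill distribution", so `satisfies_threshold` uses an assumed rule (a minimum number of distinct skills and a minimum count for the most frequent skill). The names `rank_with_regeneration`, `fake_generate`, `fake_extract`, and the `max_extra_rounds` cap are likewise hypothetical.

```python
from collections import Counter


def satisfies_threshold(counts, min_distinct, min_top_count):
    """Assumed rule: enough distinct skills, and the top skill seen often enough."""
    if len(counts) < min_distinct:
        return False
    return max(counts.values(), default=0) >= min_top_count


def rank_with_regeneration(generate, extract, job_title, k,
                           min_distinct=3, min_top_count=2, max_extra_rounds=5):
    """Generate sentences, recount over all of them until the threshold is met,
    then rank skills by frequency and keep the top k."""
    sentences = list(generate(job_title))
    counts = Counter(extract(sentences))
    rounds = 0
    while (not satisfies_threshold(counts, min_distinct, min_top_count)
           and rounds < max_extra_rounds):
        # Distribution too sparse: add sentences and recount over the
        # original plus the additional sentences.
        sentences.extend(generate(job_title))
        counts = Counter(extract(sentences))
        rounds += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [skill for skill, _ in ranked[:k]]


# Stubs standing in for the generative model and the skill extractor.
batches = iter([
    ["python"],                       # first batch: too sparse
    ["python", "sql", "statistics"],  # additional batch
])


def fake_generate(job_title):
    return next(batches, [])


def fake_extract(sentences):
    # Stub: treat each "sentence" as a single skill mention.
    return sentences


result = rank_with_regeneration(fake_generate, fake_extract, "data scientist", k=2)
print(result)
```

The cap on extra rounds is a practical safeguard added for the sketch so the loop terminates even if the threshold is never met; the claims themselves recite one additional generation step.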

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12333501B2 cover?
Technologies for skill taxonomy management are described. Embodiments include extracting an input text from an online system and applying an unsupervised generative text machine learning model to the input text. The text generator generates a set of sentences based on a job title included in the input text. One or more skills are extracted from the set of sentences. The extracted one or more sk…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06Q10/1053. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 17, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).