Methods, apparatus, and articles of manufacture to generate usage dependent code embeddings

US11681541B2 · US · B2

Patent metadata
Publication number: US-11681541-B2
Application number: US-202117555072-A
Country: US
Kind code: B2
Filing date: Dec 17, 2021
Priority date: Dec 17, 2021
Publication date: Jun 20, 2023
Grant date: Jun 20, 2023

Abstract

Official abstract text for this publication.

Methods, apparatus, systems, and articles of manufacture are disclosed to generate usage dependent code embeddings. An example apparatus includes parsing circuitry to select a usage context of a code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC. The example apparatus additionally includes embedding circuitry to generate a first list of token embedding vectors for first tokens of a second list of tokens for the code snippet and a third list of token embedding vectors for second tokens of a fourth list of tokens for the usage context. The example apparatus also includes concatenation circuitry to concatenate a transformed token embedding vector of a close token and a fifth list of transformed token embedding vectors for the first list.
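
The parsing step in the abstract can be illustrated with a minimal sketch. It assumes the usage context is a window of lines around the snippet and that tokens come from a plain whitespace split. The names `select_usage_context`, `tokenize`, `before`, `after`, and the `<close>` spelling are hypothetical and do not come from the patent, and placing the close token first in the context token list is likewise an assumption.

```python
from typing import List

CLOSE = "<close>"  # hypothetical spelling of the close token


def select_usage_context(lines: List[str], start: int, end: int,
                         before: int = 2, after: int = 2) -> List[str]:
    """Return the snippet lines[start:end] together with up to `before`
    lines preceding it and up to `after` lines following it."""
    lo = max(0, start - before)
    hi = min(len(lines), end + after)
    return lines[lo:hi]


def tokenize(lines: List[str]) -> List[str]:
    """Whitespace tokenizer; a real system would likely use a
    language-aware tokenizer instead (assumption)."""
    return [tok for line in lines for tok in line.split()]


source = [
    "import math",
    "radius = 3.0",
    "def area(r):",
    "    return math.pi * r * r",
    "result = area(radius)",
    "print(result)",
]

# The snippet is the two-line function definition at indices 2..3.
snippet = source[2:4]
context = select_usage_context(source, start=2, end=4, before=1, after=1)

snippet_tokens = tokenize(snippet)            # tokens for the code snippet
context_tokens = [CLOSE] + tokenize(context)  # tokens for the usage context
```

Keeping `before` and `after` as separate parameters lets the window be symmetric or asymmetric, and the window is clipped at the file boundaries rather than padded.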

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus to generate usage dependent code embeddings, the apparatus comprising: interface circuitry to obtain code including a code snippet to be processed by an artificial intelligence (AI) model; and processor circuitry including one or more of: at least one of a central processor unit (CPU), a graphics processor unit (GPU), or a digital signal processor (DSP), the at least one of the CPU, the GPU, or the DSP having control circuitry to control data movement within the processor circuitry, arithmetic and first logic circuitry to perform one or more first operations corresponding to instructions, and one or more registers to store a first result of the one or more first operations, the instructions in the apparatus; Field Programmable Gate Array (FPGA) circuitry, the FPGA circuitry including second logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the second logic gate circuitry and interconnections to perform one or more second operations, the storage circuitry to store a second result of the one or more second operations; or Application Specific Integrated Circuitry (ASIC) including third logic gate circuitry to perform one or more third operations; the processor circuitry to perform at least one of the first operations, the second operations, or the third operations to instantiate: parsing circuitry to select a usage context of the code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC at which the code snippet is called; embedding circuitry to generate a first list of one or more token embedding vectors for first tokens of a second list of one or more tokens for the code snippet and a third list of one or more token embedding vectors for second tokens of a fourth list of one or more tokens for the usage context, the fourth list including a close token; and concatenation circuitry to concatenate a transformed token embedding vector of the close token and a fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors.

2. The apparatus of claim 1, wherein the processor circuitry is to perform at least one of the first operations, the second operations, or the third operations to instantiate the concatenation circuitry to prepend the transformed token embedding vector of the close token to the fifth list of one or more transformed token embedding vectors.

3. The apparatus of claim 1, wherein the processor circuitry is to perform at least one of the first operations, the second operations, or the third operations to instantiate transform circuitry to generate the fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors and a sixth list of one or more transformed token embedding vectors for the third list of one or more token embedding vectors.

4. The apparatus of claim 1, wherein the processor circuitry is first processor circuitry, and the first processor circuitry is to perform at least one of the first operations, the second operations, or the third operations to instantiate: the concatenation circuitry to append a close vector to a concatenated list of the transformed token embedding vector of the close token and the fifth list of one or more transformed token embedding vectors; and transform circuitry to: generate a transformed concatenated list of the transformed token embedding vector of the close token, the fifth list of one or more transformed token embedding vectors, and the close vector; and transmit at least one of the transformed concatenated list or a transformed close vector to second processor circuitry implementing the AI model.

5. The apparatus of claim 1, wherein at least one of the at least one LOC before the code snippet or the LOC at which the code snippet is called or the at least one LOC after the code snippet or the LOC at which the code snippet is called provides the AI model with information about the code snippet to be processed including information about arguments of the code snippet, information about how an output of the code snippet is used, or information about a programming context in which the code snippet is used.

6. The apparatus of claim 1, wherein the at least one LOC before the code snippet or the LOC at which the code snippet is called and the at least one LOC after the code snippet or the LOC at which the code snippet is called correspond to a threshold number of LOCs.

7. The apparatus of claim 1, wherein the at least one of LOC before the code snippet or the LOC at which the code snippet is called corresponds to a first threshold number of LOCs and the at least one LOC after the code snippet or the LOC at which the code snippet is called corresponds to a second threshold number of LOCs different from the first threshold number of LOCs.

8. A non-transitory computer readable medium comprising machine-readable instructions which, when executed, cause processor circuitry to: obtain code including a code snippet to be processed by an artificial intelligence (AI) model; select a usage context of the code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC at which the code snippet is called; generate a first list of one or more token embedding vectors for first tokens of a second list of one or more tokens for the code snippet and a third list of one or more token embedding vectors for second tokens of a fourth list of one or more tokens for the usage context, the fourth list including a close token; and concatenate a transformed token embedding vector of the close token and a fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors.

9. The non-transitory computer readable medium of claim 8, wherein the instructions cause the processor circuitry to prepend the transformed token embedding vector of the close token to the fifth list of one or more transformed token embedding vectors.

10. The non-transitory computer readable medium of claim 8, wherein the instructions cause the processor circuitry to generate the fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors and a sixth list of one or more transformed token embedding vectors for the third list of one or more token embedding vectors.

11. The non-transitory computer readable medium of claim 8, wherein the processor circuitry is first processor circuitry, and the instructions cause the first processor circuitry to: append a close vector to a concatenated list of the transformed token embedding vector of the close token and the fifth list of one or more transformed token embedding vectors; generate a transformed concatenated list of the transformed token embedding vector of the close token, the fifth list of one or more transformed token embedding vectors, and the close vector; and transmit at least one of the transformed concatenated list or a transformed close vector to second processor circuitry implementing the AI model.

12. The non-transitory computer readable medium of claim 8, wherein at least one of the at least one LOC before the code snippet or the LOC at which the code snippet is called or the at least one LOC after the code snippet or the LOC at which the code snippet is called provides the AI model with information about the code snippet to be processed inclu
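
The data flow recited in claims 1 to 4 (embed, transform, prepend, append, transform again) can be traced with a toy sketch. Everything numeric here is a stand-in: `embed` derives vectors from a hash rather than a learned table, `transform` is a simple mean-mixing step rather than a trained model, and the close vector is a zero placeholder because the claims do not say what it contains. The names `embed`, `transform`, `DIM`, and `<close>` are hypothetical.

```python
import hashlib
from typing import List

Vector = List[float]
DIM = 4            # toy embedding width (assumption)
CLOSE = "<close>"  # hypothetical spelling of the close token


def embed(token: str) -> Vector:
    """Deterministic toy embedding: DIM floats in [0, 1] taken from a hash."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(DIM)]


def transform(vectors: List[Vector]) -> List[Vector]:
    """Toy stand-in for a learned transform: blend each vector with the
    mean of the list so every output depends on the whole sequence."""
    n = len(vectors)
    mean = [sum(v[i] for v in vectors) / n for i in range(DIM)]
    return [[0.5 * v[i] + 0.5 * mean[i] for i in range(DIM)] for v in vectors]


snippet_tokens = ["def", "area", "(", "r", ")", ":"]
context_tokens = [CLOSE, "radius", "=", "3.0", "result", "=", "area", "(", "radius", ")"]

# Claim 1: embed the snippet tokens and the usage-context tokens.
snippet_vecs = [embed(t) for t in snippet_tokens]  # "first list"
context_vecs = [embed(t) for t in context_tokens]  # "third list"

# Claim 3: transform both lists.
snippet_tx = transform(snippet_vecs)  # "fifth list"
context_tx = transform(context_vecs)  # "sixth list"

# Claim 2: prepend the transformed close-token vector to the fifth list.
close_tx = context_tx[context_tokens.index(CLOSE)]
concatenated = [close_tx] + snippet_tx

# Claim 4: append a close vector, then transform the concatenated list.
close_vector = [0.0] * DIM  # placeholder contents (assumption)
final = transform(concatenated + [close_vector])
```

In this sketch the only information that crosses from the usage context to the snippet is the single transformed close-token vector, so the final list has one entry per snippet token plus two extra slots.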

Assignees

Intel Corp

Classifications

  • Embedded in an application, e.g. JavaScript in a Web browser

  • Program documentation

  • Structural analysis for program understanding

  • G06F8/751 (Primary): Code clone detection

  • G06F8/72 (Primary): Code refactoring

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11681541B2 cover?
Methods, apparatus, systems, and articles of manufacture are disclosed to generate usage dependent code embeddings. An example apparatus includes parsing circuitry to select a usage context of a code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC. Th…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/45529. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 20, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).