Adapting existing source code snippets to new contexts
US-2022236971-A1 · Jul 28, 2022 · US
US11681541B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11681541-B2 |
| Application number | US-202117555072-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 17, 2021 |
| Priority date | Dec 17, 2021 |
| Publication date | Jun 20, 2023 |
| Grant date | Jun 20, 2023 |
Official abstract text for this publication.
Methods, apparatus, systems, and articles of manufacture are disclosed to generate usage dependent code embeddings. An example apparatus includes parsing circuitry to select a usage context of a code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC. The example apparatus additionally includes embedding circuitry to generate a first list of token embedding vectors for first tokens of a second list of tokens for the code snippet and a third list of token embedding vectors for second tokens of a fourth list of tokens for the usage context. The example apparatus also includes concatenation circuitry to concatenate a transformed token embedding vector of a close token and a fifth list of transformed token embedding vectors for the first list.
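The usage-context selection described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the function name `select_usage_context`, the `UsageContext` container, and the `n_before`/`n_after` parameters are hypothetical, and the snippet is assumed to be identified by a half-open line range within the surrounding code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UsageContext:
    """Hypothetical container for the three parts of a usage context."""
    before: List[str]   # LOCs preceding the snippet (or its call site)
    snippet: List[str]  # the code snippet itself
    after: List[str]    # LOCs following the snippet (or its call site)

    def lines(self) -> List[str]:
        """Return the whole usage context in source order."""
        return self.before + self.snippet + self.after


def select_usage_context(code_lines: List[str], start: int, end: int,
                         n_before: int = 2, n_after: int = 2) -> UsageContext:
    """Select up to n_before LOCs before and n_after LOCs after a snippet.

    `start` is inclusive and `end` is exclusive. The two window sizes are
    assumed thresholds and may differ from each other.
    """
    if not 0 <= start < end <= len(code_lines):
        raise ValueError("snippet range must lie inside the code")
    before = code_lines[max(0, start - n_before):start]
    after = code_lines[end:end + n_after]
    return UsageContext(before, code_lines[start:end], after)


# Example: the snippet is the single line that computes `area`.
code = [
    "import math",
    "r = 2.0",
    "area = math.pi * r ** 2",
    "print(area)",
    "done = True",
]
ctx = select_usage_context(code, 2, 3, n_before=1, n_after=2)
```

Here the line before the snippet shows how its argument `r` is defined, and the lines after it show how its output is used, which is the kind of information the surrounding LOCs are meant to supply.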
Opening claim text (preview).
What is claimed is:
1. An apparatus to generate usage dependent code embeddings, the apparatus comprising: interface circuitry to obtain code including a code snippet to be processed by an artificial intelligence (AI) model; and processor circuitry including one or more of: at least one of a central processor unit (CPU), a graphics processor unit (GPU), or a digital signal processor (DSP), the at least one of the CPU, the GPU, or the DSP having control circuitry to control data movement within the processor circuitry, arithmetic and first logic circuitry to perform one or more first operations corresponding to instructions, and one or more registers to store a first result of the one or more first operations, the instructions in the apparatus; Field Programmable Gate Array (FPGA) circuitry, the FPGA circuitry including second logic gate circuitry, a plurality of configurable interconnections, and storage circuitry, the second logic gate circuitry and interconnections to perform one or more second operations, the storage circuitry to store a second result of the one or more second operations; or Application Specific Integrated Circuitry (ASIC) including third logic gate circuitry to perform one or more third operations; the processor circuitry to perform at least one of the first operations, the second operations, or the third operations to instantiate: parsing circuitry to select a usage context of the code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC at which the code snippet is called; embedding circuitry to generate a first list of one or more token embedding vectors for first tokens of a second list of one or more tokens for the code snippet and a third list of one or more token embedding vectors for second tokens of a fourth list of one or more tokens for the usage context, the fourth list including a close token; and concatenation circuitry to concatenate a transformed token embedding vector of the close token and a fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors.
2. The apparatus of claim 1, wherein the processor circuitry is to perform at least one of the first operations, the second operations, or the third operations to instantiate the concatenation circuitry to prepend the transformed token embedding vector of the close token to the fifth list of one or more transformed token embedding vectors.
3. The apparatus of claim 1, wherein the processor circuitry is to perform at least one of the first operations, the second operations, or the third operations to instantiate transform circuitry to generate the fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors and a sixth list of one or more transformed token embedding vectors for the third list of one or more token embedding vectors.
4. The apparatus of claim 1, wherein the processor circuitry is first processor circuitry, and the first processor circuitry is to perform at least one of the first operations, the second operations, or the third operations to instantiate: the concatenation circuitry to append a close vector to a concatenated list of the transformed token embedding vector of the close token and the fifth list of one or more transformed token embedding vectors; and transform circuitry to: generate a transformed concatenated list of the transformed token embedding vector of the close token, the fifth list of one or more transformed token embedding vectors, and the close vector; and transmit at least one of the transformed concatenated list or a transformed close vector to second processor circuitry implementing the AI model.
5. The apparatus of claim 1, wherein at least one of the at least one LOC before the code snippet or the LOC at which the code snippet is called or the at least one LOC after the code snippet or the LOC at which the code snippet is called provides the AI model with information about the code snippet to be processed including information about arguments of the code snippet, information about how an output of the code snippet is used, or information about a programming context in which the code snippet is used.
6. The apparatus of claim 1, wherein the at least one LOC before the code snippet or the LOC at which the code snippet is called and the at least one LOC after the code snippet or the LOC at which the code snippet is called correspond to a threshold number of LOCs.
7. The apparatus of claim 1, wherein the at least one of LOC before the code snippet or the LOC at which the code snippet is called corresponds to a first threshold number of LOCs and the at least one LOC after the code snippet or the LOC at which the code snippet is called corresponds to a second threshold number of LOCs different from the first threshold number of LOCs.
8. A non-transitory computer readable medium comprising machine-readable instructions which, when executed, cause processor circuitry to: obtain code including a code snippet to be processed by an artificial intelligence (AI) model; select a usage context of the code snippet including at least one line of code (LOC) before the code snippet or an LOC at which the code snippet is called, the code snippet, and at least one LOC after the code snippet or the LOC at which the code snippet is called; generate a first list of one or more token embedding vectors for first tokens of a second list of one or more tokens for the code snippet and a third list of one or more token embedding vectors for second tokens of a fourth list of one or more tokens for the usage context, the fourth list including a close token; and concatenate a transformed token embedding vector of the close token and a fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors.
9. The non-transitory computer readable medium of claim 8, wherein the instructions cause the processor circuitry to prepend the transformed token embedding vector of the close token to the fifth list of one or more transformed token embedding vectors.
10. The non-transitory computer readable medium of claim 8, wherein the instructions cause the processor circuitry to generate the fifth list of one or more transformed token embedding vectors for the first list of one or more token embedding vectors and a sixth list of one or more transformed token embedding vectors for the third list of one or more token embedding vectors.
11. The non-transitory computer readable medium of claim 8, wherein the processor circuitry is first processor circuitry, and the instructions cause the first processor circuitry to: append a close vector to a concatenated list of the transformed token embedding vector of the close token and the fifth list of one or more transformed token embedding vectors; generate a transformed concatenated list of the transformed token embedding vector of the close token, the fifth list of one or more transformed token embedding vectors, and the close vector; and transmit at least one of the transformed concatenated list or a transformed close vector to second processor circuitry implementing the AI model.
12. The non-transitory computer readable medium of claim 8, wherein at least one of the at least one LOC before the code snippet or the LOC at which the code snippet is called or the at least one LOC after the code snippet or the LOC at which the code snippet is called provides the AI model with information about the code snippet to be processed inclu
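The embed, transform, and concatenate steps recited in claims 1 to 3 and 8 to 10 can be sketched as follows. Everything here is a toy stand-in chosen for illustration: the whitespace tokenizer, the hash-based `embed`, the mean-mixing `transform` (a placeholder for whatever learned transform the disclosure uses), the `<close>` spelling of the close token, and the function names are all assumptions, not details taken from the patent.

```python
import hashlib
from typing import List

Vector = List[float]
DIM = 4            # toy embedding width, chosen only for illustration
CLOSE = "<close>"  # hypothetical spelling of the close token


def tokenize(lines: List[str]) -> List[str]:
    """Whitespace tokenizer; a stand-in for a real code tokenizer."""
    return [tok for line in lines for tok in line.split()]


def embed(token: str) -> Vector:
    """Deterministic toy embedding derived from a hash of the token."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]


def transform(vectors: List[Vector]) -> List[Vector]:
    """Placeholder transform: mix each vector with the mean of its list."""
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    return [[0.5 * x + 0.5 * m for x, m in zip(vec, mean)] for vec in vectors]


def usage_dependent_embedding(snippet_lines: List[str],
                              context_lines: List[str]) -> List[Vector]:
    snippet_tokens = tokenize(snippet_lines)            # "second list"
    context_tokens = [CLOSE] + tokenize(context_lines)  # "fourth list", with close token
    snippet_vecs = [embed(t) for t in snippet_tokens]   # "first list"
    context_vecs = [embed(t) for t in context_tokens]   # "third list"
    snippet_out = transform(snippet_vecs)               # "fifth list"
    context_out = transform(context_vecs)               # "sixth list"
    close_out = context_out[0]  # transformed vector of the close token
    return [close_out] + snippet_out                    # prepend, as in claims 2 and 9


snippet = ["area = math.pi * r ** 2"]
context_a = ["r = 2.0", "area = math.pi * r ** 2", "print(area)"]
context_b = ["r = 10.0", "area = math.pi * r ** 2", "return area"]
result_a = usage_dependent_embedding(snippet, context_a)
result_b = usage_dependent_embedding(snippet, context_b)
```

In this sketch the snippet's own vectors are identical across the two calls, while the leading close-token vector differs because it was transformed together with the surrounding lines. That leading vector is what makes the combined representation depend on how the snippet is used.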
Embedded in an application, e.g. JavaScript in a Web browser · CPC title
Program documentation · CPC title
Structural analysis for program understanding · CPC title
Code clone detection · CPC title
Code refactoring · CPC title