US11609748B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11609748-B2 |
| Application number | US-202117161545-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 28, 2021 |
| Priority date | Jan 28, 2021 |
| Publication date | Mar 21, 2023 |
| Grant date | Mar 21, 2023 |
Official abstract text for this publication.
A method may include obtaining machine-readable source code. The method may include parsing the source code for one or more code descriptions and identifying a section of the source code corresponding to each of the code descriptions. The method may include determining a description-code pair including a first element representing the code description and a second element representing the section of the source code corresponding to the code description. The method may include generating an augmented programming language corpus based on the description-code pair, the one or more code descriptions, and the source code. The method may include receiving a natural language search query for source-code recommendations, identifying source code from the augmented programming language corpus responsive to the natural language search query, and responding to the natural language search query with the identified source code.
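The pairing step in the abstract can be illustrated with a minimal sketch. It assumes Python-style `#` comments and one simple positional heuristic: a run of full-line comments describes the contiguous code directly below it, up to the next blank line or comment. The function name `extract_description_code_pairs` and the heuristic itself are illustrative assumptions, not the method disclosed in the patent.

```python
from typing import List, Tuple


def extract_description_code_pairs(source: str) -> List[Tuple[str, str]]:
    """Return (description, code) pairs found in `source`.

    Assumed heuristic for this sketch: a run of full-line '#' comments
    describes the code lines immediately below it, ending at the next
    blank line or the next comment line.
    """
    pairs: List[Tuple[str, str]] = []
    lines = source.splitlines()
    i = 0
    while i < len(lines):
        if not lines[i].strip().startswith("#"):
            i += 1
            continue
        # Collect the consecutive comment lines as one description.
        comment_parts = []
        while i < len(lines) and lines[i].strip().startswith("#"):
            comment_parts.append(lines[i].strip().lstrip("#").strip())
            i += 1
        # Collect the code section that directly follows the comment.
        code_lines = []
        while (
            i < len(lines)
            and lines[i].strip()
            and not lines[i].strip().startswith("#")
        ):
            code_lines.append(lines[i])
            i += 1
        # Comments with no code below them produce no pair.
        if code_lines:
            pairs.append((" ".join(comment_parts), "\n".join(code_lines)))
    return pairs


sample = (
    "# add two numbers\n"
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "x = 1\n"
    "# double a value\n"
    "y = x * 2\n"
)
pairs = extract_description_code_pairs(sample)
```

Each tuple corresponds to the abstract's description-code pair: the first element is the comment text and the second is the code section it was matched to. A real system would likely need per-language comment syntax and richer heuristics than this single rule.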
Opening claim text (preview).
What is claimed is:
1. A method, comprising: obtaining, by a processor, machine-readable source code; parsing, by the processor, the source code for one or more code descriptions; identifying, by the processor, a section of the source code corresponding to a code description of the one or more code descriptions; determining, by the processor, a description-code pair, the description-code pair including a first element representing the code description and a second element representing the section of the source code corresponding to the code description; generating, by the processor, an augmented programming language corpus using the description-code pair, the one or more code descriptions, and the source code; training, by the processor, a machine learning model to provide source-code recommendations based on the augmented programming language corpus; receiving, by the processor, a natural language search query for a source-code recommendation; identifying, by the processor using the machine learning model, the source code responsive to the search query; and responding, by the processor, to the natural language search query with the source code identified from the augmented programming language corpus.
2. The method of claim 1, wherein the one or more code descriptions are code comments and identifying the section of the source code corresponding to a code description of the one or more code descriptions comprises: determining, by the processor, one or more heuristics relating a location of the code comment in a piece of source code to the section of the source code; determining, by the processor, the location of the code comment in the piece of source code; and locating, by the processor, the section of the source code to which the code comment corresponds based on the one or more heuristics and the location of the code comment in the piece of source code.
3. The method of claim 1, wherein the natural language search query is received via a text-input field in an integrated development environment (IDE), the IDE including an interface for software development.
4. The method of claim 1, wherein obtaining source code comprises: obtaining, by the processor, a source-code package; parsing, by the processor, the source-code package to identify one or more files, each file of the one or more files including at least a portion of the source code; and parsing, by the processor, the one or more files to identify files written in a target programming language.
5. The method of claim 1, further comprising: generating, by the processor, a negatively classified example based on the description-code pair; and training, by the processor, the machine learning model to provide source-code recommendations based on the augmented programming language corpus and the negatively classified example.
6. The method of claim 1, wherein responding to the natural language search query with the source code identified from the augmented programming language corpus comprises: mapping, by the processor, the natural language search query to a search vector; comparing, by the processor, the search vector to each description-code pair; determining, by the processor, a similarity score between the search vector and each description-code pair based on a cosine similarity between the search vector and each description-code pair; and returning, by the processor, the source code corresponding to the description-code pair based on the similarity score between the search vector and each description-code pair.
7. The method of claim 6, wherein returning the source code corresponding to the description-code pair based on the similarity score between the search vector and each description-code pair comprises: ranking, by the processor, description-code pairs based on the similarity score between the search vector and each description-code pair; and returning, by the processor, one or more pieces of the source code corresponding to the description-code pairs based on the ranking.
8. One or more non-transitory computer-readable storage media configured to store instructions that, in response to being executed, cause a system to perform operations, the operations comprising: obtaining machine-readable source code; parsing the source code for one or more code descriptions; identifying a section of the source code corresponding to a code description of the one or more code descriptions; determining a description-code pair, the description-code pair including a first element representing the code description and a second element representing the section of the source code corresponding to the code description; generating an augmented programming language corpus using the description-code pair, the one or more code descriptions, and the source code; training a machine learning model to provide source-code recommendations based on the augmented programming language corpus; receiving a natural language search query for a source-code recommendation; identifying, by the machine learning model, the source code from the augmented programming language corpus responsive to the natural language search query; and responding to the natural language search query with the source code identified from the augmented programming language corpus.
9. The one or more non-transitory computer-readable storage media of claim 8, wherein the one or more code descriptions are code comments and identifying the section of the source code corresponding to a code description of the one or more code descriptions comprises: determining one or more heuristics relating a location of the code comment in a piece of source code to the section of the source code; determining the location of the code comment in the piece of source code; and locating the section of the source code to which the code comment corresponds based on the one or more heuristics and the location of the code comment in the piece of source code.
10. The one or more non-transitory computer-readable storage media of claim 8, wherein the natural language search query is received via a text-input field in an integrated development environment (IDE), the IDE including an interface for software development.
11. The one or more non-transitory computer-readable storage media of claim 8, wherein obtaining source code comprises: obtaining a source-code package; parsing the source-code package to identify one or more files, each file of the one or more files including at least a portion of the source code; and parsing the one or more files to identify files written in a target programming language.
12. The one or more non-transitory computer-readable storage media of claim 8, further comprising: generating a negatively classified example based on the description-code pair; and training the machine learning model to provide source-code recommendations based on the augmented programming language corpus and the negatively classified example.
13. The one or more non-transitory computer-readable storage media of claim 8, wherein responding to the natural language search query with the source code identified from the augmented programming language corpus comprises: mapping the natural language search query to a search vector; comparing the search vector to each description-code pair; determining a similarity score between the search vector and each description-code pair based on a cosine similarity between the search vector and each description-code pair; and returning the source code corresponding to the description-code pair based on the similarity score between the search vector and each description-code pair.
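The retrieval step recited in claims 6, 7, and 13 (map the query to a vector, score each description-code pair by cosine similarity, rank, and return the top results) can be sketched as follows. The claims rely on a trained machine learning model; as a loudly labeled assumption, this sketch substitutes a plain term-count vector for that model, and the names `to_vector`, `cosine_similarity`, and `rank_pairs` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def to_vector(text: str) -> Counter:
    """Map text to a sparse term-count vector (stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pairs(
    query: str, pairs: List[Tuple[str, str]], top_k: int = 3
) -> List[Tuple[float, str, str]]:
    """Score every (description, code) pair against the query and return the best."""
    search_vector = to_vector(query)
    scored = [
        (cosine_similarity(search_vector, to_vector(desc + " " + code)), desc, code)
        for desc, code in pairs
    ]
    # Highest similarity first, as in the ranking step of claim 7.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


corpus = [
    ("add two numbers", "def add(a, b):\n    return a + b"),
    ("read a file into a string", "def read(path):\n    return open(path).read()"),
]
best = rank_pairs("how to read a file", corpus, top_k=1)
```

Here `best[0]` holds the file-reading pair, because it shares more query terms with the search vector than the addition pair does. Swapping `to_vector` for a model-produced embedding would leave the scoring and ranking logic unchanged.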