Training a machine learning model for analysis of instruction sequences

US2018075349A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2018075349-A1
Application number: US-201615345436-A
Country: US
Kind code: A1
Filing date: Nov 7, 2016
Priority date: Sep 9, 2016
Publication date: Mar 15, 2018
Grant date: not listed

Abstract

Official abstract text for this publication.

In one respect, there is provided a system for training a neural network adapted for classifying one or more instruction sequences. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: training, based at least on training data, a machine learning model to detect one or more predetermined interdependencies amongst a plurality of tokens in the training data; and providing the trained machine learning model to enable classification of one or more instruction sequences. Related methods and articles of manufacture, including computer program products, are also provided.
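To make the "plurality of tokens" in the abstract concrete, the following is a minimal sketch of turning an instruction sequence into tokens and then into one-hot vectors (the claims on this page name one-hot encoding as one option for the embedding layer). The regular-expression tokenizer, the function names, and the sample input are assumptions made for illustration; this page does not say how the patent tokenizes its training data.

```python
import re


def tokenize(sequence):
    """Split an instruction sequence into tokens.

    Assumed rule (not from the patent): identifiers, runs of digits,
    and any other single non-space character each become one token.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", sequence)


def build_vocab(token_lists):
    """Map each distinct token to an integer index in first-seen order."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def one_hot(token, vocab):
    """Return a vector with a single 1 at the token's vocabulary index."""
    vec = [0] * len(vocab)
    vec[vocab[token]] = 1
    return vec


# Hypothetical instruction sequence used only as sample input.
tokens = tokenize("x = eval(y);")
vocab = build_vocab([tokens])
vectors = [one_hot(t, vocab) for t in tokens]
```

Each vector has as many entries as the vocabulary has tokens, so one-hot encoding grows with vocabulary size; a learned embedding such as word2vec, also named in the claims, is the usual denser alternative.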

First claim

Opening claim text (preview).

1. A system, comprising: at least one processor; and at least one memory including program code which when executed by the at least one processor provides operations comprising: training, based at least on training data, a machine learning model to detect one or more predetermined interdependencies amongst a plurality of tokens in the training data; and providing the trained machine learning model to enable classification of one or more instruction sequences.
2. The system of claim 1 further comprising: receiving the training data, wherein the training data comprises a sequence of instructions.
3. The system of claim 2, wherein the sequence of instructions includes the plurality of tokens, and wherein at least one of the plurality of tokens comprises at least one character and/or binary digit.
4. The system of claim 1, wherein the one or more predetermined interdependencies include a presence, in the training data, of a first token subsequent to at least a second token.
5. The system of claim 1, wherein the one or more predetermined interdependencies indicate at least one function and/or behavior associated with the training data.
6. The system of claim 1, wherein the machine learning model comprises a neural network.
7. The system of claim 6, wherein the neural network comprises a long short-term memory neural network.
8. The system of claim 7, wherein the long short-term memory neural network comprises an embedding layer configured to generate vector representations of the plurality of tokens in the training data.
9. The system of claim 8, wherein the embedding layer is configured to use one-hot encoding to generate the vector representations of the plurality of tokens in the training data.
10. The system of claim 8, wherein the embedding layer is configured to use word2vec to generate the vector representations of the plurality of tokens in the training data.
11. The system of claim 7, wherein the long short-term memory neural network comprises a first long short-term memory layer, and wherein the first long short-term memory layer comprises a memory cell having an input gate, an output gate, and a forget gate.
12. The system of claim 11, wherein the first long short-term memory layer is configured to receive a first token from the plurality of tokens included in the training data.
13. The system of claim 12, wherein a current hidden state of the first long short-term memory layer is determined based at least on the first token and a previous hidden state of the first long short-term memory layer, wherein the previous hidden state of the first long short-term memory layer corresponds to one or more tokens already processed by the long short-term memory neural network, and wherein an output of the first long short-term memory layer corresponds to the current hidden state of the first long short-term memory layer.
14. The system of claim 11, wherein the long short-term memory neural network further comprises a second long short-term memory layer, wherein the first long short-term memory layer is configured to detect one or more predetermined interdependencies in one direction by at least processing the plurality of tokens in a forward order, and wherein the second long short-term memory layer is configured to detect one or more other predetermined interdependencies in an opposite direction by at least processing the plurality of tokens in an opposite order.
15. The system of claim 6, wherein the neural network comprises a recursive neural tensor network.
16. The system of claim 15, wherein training the recursive neural tensor network includes processing an abstract syntax tree representation of the training data with the recursive neural tensor network.
17. The system of claim 16, further comprising: generating, based at least on a structure of the plurality of tokens in the training data, the abstract syntax tree representation of the training data.
18. The system of claim 17, wherein the abstract syntax tree representation of the training data includes a parent node corresponding to a first token from the plurality of tokens in the training data, and a leaf node corresponding to a second token from the plurality of tokens in the training data, and wherein the leaf node comprises a child node of the parent node.
19. The system of claim 18, wherein the first token indicates a rule for combining the second token and a third token from the plurality of tokens in the training data.
20. The system of claim 19, wherein the parent node is associated with a weight that is determined based at least on a first weight and a first tensor associated with the leaf node, and a second weight and a second tensor associated with another leaf node corresponding to the third token.
21-38. (canceled)
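Claims 11 to 14 describe a memory cell with input, output, and forget gates, a hidden state updated from the current token and the previous hidden state, and a second layer that reads the tokens in the opposite order. The sketch below shows one way these pieces fit together, using the standard textbook LSTM equations in plain Python. It is an illustration under assumptions, not the patent's implementation: the weights are random and untrained, and the names `init_params`, `lstm_step`, and `run_layer`, the layer sizes, and the toy input are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Random (untrained) weights for the input, forget, and output gates
    and the candidate cell value."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: {
            "W": matrix(hidden_size, input_size),   # weights on the token vector
            "U": matrix(hidden_size, hidden_size),  # weights on the previous hidden state
            "b": [0.0] * hidden_size,
        }
        for gate in ("i", "f", "o", "g")
    }


def affine(p, x, h):
    """Compute W x + U h + b one output row at a time."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b
        for w_row, u_row, b in zip(p["W"], p["U"], p["b"])
    ]


def lstm_step(params, x, h_prev, c_prev):
    """One memory-cell update from the current token and the previous state."""
    i = [sigmoid(v) for v in affine(params["i"], x, h_prev)]    # input gate
    f = [sigmoid(v) for v in affine(params["f"], x, h_prev)]    # forget gate
    o = [sigmoid(v) for v in affine(params["o"], x, h_prev)]    # output gate
    g = [math.tanh(v) for v in affine(params["g"], x, h_prev)]  # candidate value
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]  # the output is the hidden state
    return h, c


def run_layer(params, vectors, hidden_size, reverse=False):
    """Feed token vectors through one layer, in forward or opposite order,
    and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in (reversed(vectors) if reverse else vectors):
        h, c = lstm_step(params, x, h, c)
    return h


# Toy input: three one-hot token vectors (hypothetical data).
sequence = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
forward = run_layer(init_params(3, 4, seed=1), sequence, 4)
backward = run_layer(init_params(3, 4, seed=2), sequence, 4, reverse=True)
features = forward + backward  # concatenated summary of both directions
```

In a trained system the concatenated `features` vector would typically feed a classifier; here it only shows how the forward and opposite-order layers each summarize the same token sequence.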

Assignees

Cylance Inc

Inventors

Classifications

  • G06F21/563 (Primary)

    by source code analysis · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • G06N3/0442 (Primary)

    characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

  • G06N3/08 (Primary)

    Learning methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018075349A1 cover?
In one respect, there is provided a system for training a neural network adapted for classifying one or more instruction sequences. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: training, based at least on training data, a machine learning model to detect…
Who is the assignee on this patent?
Cylance Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/563. Mapped technology areas include Physics.
When was this patent published?
Published Mar 15, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).