Generating web API specification from online documentation

US2018196643A1 · US · A1

Patent metadata
Publication number: US-2018196643-A1
Application number: US-201715403150-A
Country: US
Kind code: A1
Filing date: Jan 10, 2017
Priority date: Jan 10, 2017
Publication date: Jul 12, 2018
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A tool that automatically generates a web API specification from a web API documentation is provided. The tool extracts a base uniform resource locator (URL) string from the received documentation by identifying URL strings in the documentation that are valid web application programming interface (API) calls. The tool infers path templates by identifying and clustering path expressions in the documentation that invoke the same URL endpoints. The tool extracts hypertext transfer protocol (HTTP) request type and query parameters associated with the inferred path templates. The tool generates a specification that includes the extracted base URL, the inferred path templates, the extracted HTTP request types, and the extracted query parameters.
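As an illustration of the base URL step, claim 7 describes the base URL as the longest common prefix among the URL strings identified as valid web API calls. The following is a minimal sketch of that idea, not the patented implementation. It assumes the URLs have already been classified as valid API calls, and it cuts the prefix at path-segment boundaries (an assumption made here so the result is always a well-formed URL). The function name `longest_common_base` and the example host are hypothetical.

```python
from urllib.parse import urlsplit


def longest_common_base(urls):
    """Return the longest prefix shared by all URLs, cut at a segment boundary.

    Assumption: `urls` contains only strings already judged to be valid
    web API calls; the classification step is outside this sketch.
    """
    if not urls:
        return ""
    parts = [urlsplit(u) for u in urls]
    # Without a shared scheme and host there is no common base URL.
    origins = {(p.scheme, p.netloc) for p in parts}
    if len(origins) != 1:
        return ""
    scheme, netloc = origins.pop()
    # Compare paths segment by segment; query strings are ignored.
    segment_lists = [[s for s in p.path.split("/") if s] for p in parts]
    common = []
    for column in zip(*segment_lists):
        if len(set(column)) != 1:
            break
        common.append(column[0])
    return f"{scheme}://{netloc}" + "".join("/" + s for s in common)


if __name__ == "__main__":
    candidates = [
        "https://api.example.com/v1/users/42",
        "https://api.example.com/v1/users",
        "https://api.example.com/v1/orders?limit=10",
    ]
    print(longest_common_base(candidates))
```

Comparing whole segments rather than raw characters avoids a prefix that stops in the middle of a segment, for example when two endpoint names happen to start with the same letters.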

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product comprising: one or more non-transitory computer-readable storage device and program instructions stored on at least one of the one or more non-transitory storage devices, the program instructions executable by a processor, the program instructions comprising sets of instructions for: receiving a documentation; extracting a base uniform resource locator (URL) string from the received documentation by identifying URL strings in the documentation that are valid web application programming interface (API) calls; inferring one or more path template by identifying and clustering path expressions in the documentation that invoke the same URL endpoints; extracting a hypertext transfer protocol (HTTP) request type and a query parameter associated with at least one path template; and generating a specification comprising the extracted base URL string, the inferred path template, the extracted HTTP request type, and the extracted query parameter.

2. The computer program product of claim 1, wherein the set of instructions for identifying whether a URL string is a valid web API call comprises a set of instructions for classifying the URL string based on a set of features of the URL string and a set of features regarding a documentation page from which the URL string is extracted.

3. The computer program product of claim 2, wherein the set of instructions for classifying the URL string based on the set of features of the URL string comprises a set of instructions for determining whether the URL string includes a path parameter, a query string, a version number, or a substring indicating an API call.

4. The computer program product of claim 2, wherein the set of instructions for classifying the URL string comprises a set of instructions for executing a command that transfers data from or to a server based on the URL string and classifying the URL string based on a return value of the command.

5. The computer program product of claim 2, wherein the set of features of the URL string comprises a context for the URL string in the documentation.

6. The computer program product of claim 5, wherein the context of the URL string is at least one of: whether the URL string is a hyperlink that leads to another web page, whether the URL string appear between a pair of tags that defines a piece of computer code, whether the URL string is in valid JavaScript Object Notation (JSON) within a pair of matched Hypertext Markup Language (HTML) tags, and whether the URL string has a same host name as the URL of the documentation.

7. The computer program product of claim 1, wherein the set of instructions for extracting the base URL string comprises a set of instructions for identifying a longest common prefix from among the identified URL strings that are valid web API calls.

8. The computer program product of claim 1, wherein the set of instructions for inferring a path template by identifying and clustering path expressions comprises a set of instructions for inferring path parameters within each cluster and using the inferred path parameters to further identify and cluster path expressions.

9. The computer program product of claim 1, wherein the set of instructions for identifying and clustering path expressions that invoke the same URL endpoints comprises a set of instructions for determining whether each segment of each path expression is a fixed segment of an endpoint, a path parameter, or an instantiated value of a path parameter.

10. A computer program product comprising: one or more non-transitory computer-readable storage device and program instructions stored on at least one of the one or more non-transitory storage devices, the program instructions executable by a processor, the program instructions comprising sets of instructions for: receiving a documentation; extracting path expressions from the received documentation, each path expression comprising a plurality of path segments; grouping the extracted path expressions into one or more clusters, wherein first and second path expressions are grouped into a same cluster when a distance between the first and second path expressions is within a threshold distance, wherein the distance between the first and second path expressions is determined based on differences between the path segments of the first path expression and the path segments of the second path expression at each path segment position; inferring a path template from each cluster; and generating a specification comprising the inferred path template.

11. The computer program product of claim 10, wherein first and second path expressions are not grouped to a same cluster when the first path expression and the second path expression each has a different number of path segments.

12. The computer program product of claim 10, wherein a literal value and a path parameter at a path segment position contributes more to the distance between the first and second path expressions than two identical literal values at the path segment position.

13. The computer program product of claim 10, wherein a literal value and a path parameter at a path segment position contributes less to the distance between the first and second path expressions than two different literal values at the path segment position.

14. The computer program product of claim 10, wherein each cluster of path expressions invokes a same universal resource locator (URL) endpoint.

15. The computer program product of claim 10, wherein the programming instructions further comprising a set of instructions for inferring path parameters in path expressions and grouping path expressions into clusters based on the inferred path parameters.

16. The computer program product of claim 10, wherein the programming instructions further comprising a set of instructions for determining whether each path segment of each path expression is a fixed segment of a URL endpoint, a path parameter, or an instantiated value of a path parameter.

17. A computing device comprising: a set of one or more processing units; and a storage device storing a set of instructions, wherein an execution of the set of instructions by the set of processing units configures the computing device to perform acts comprising: receiving a documentation; extracting a base uniform resource locator (URL) string from the received documentation by identifying URL strings in the documentation that are valid web application programming interface (API) calls; inferring one or more path template by identifying and clustering path expressions in the documentation that invoke the same URL endpoints; extracting a hypertext transfer protocol (HTTP) request type and a query parameter associated with at least one path template; and generating a specification comprising the extracted base URL, the inferred path template, the extracted HTTP request type, and the extracted query parameter.

18. The computing device of claim 17, wherein identifying whether a URL string is a valid web API call comprises classifying the URL string based on a set of features of the URL string and a set of features regarding a documentation page from which the URL string is extracted.

19. The computing device of claim 17, wherein: the execution of the set of instructions by the set of processing units further configures the computing device to perform the acts of: constructing a document object model (DOM) tree that represents the documentation,
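The clustering described in claims 10 through 13 can be sketched roughly as follows. This is an illustrative reading of the claim language, not the patented implementation. The cost values `PARAM_COST` and `LITERAL_COST`, the threshold of 0.5, the `{name}` placeholder syntax, the greedy one-pass grouping, and all function names are assumptions chosen for the example; the claims only fix the ordering of the costs (identical literals cost least, two different literals cost most) and the rule that paths with different segment counts never share a cluster.

```python
import math

PARAM_COST = 0.5    # assumed cost: literal vs. path parameter (claims 12, 13)
LITERAL_COST = 1.0  # assumed cost: two different literal values


def split_path(path):
    """Split '/users/42/posts' into ['users', '42', 'posts']."""
    return [s for s in path.split("/") if s]


def is_param(segment):
    """Assumed convention: path parameters are written as '{name}'."""
    return segment.startswith("{") and segment.endswith("}")


def distance(a, b):
    """Sum per-position differences between two segment lists."""
    if len(a) != len(b):
        return math.inf  # claim 11: never cluster different lengths
    total = 0.0
    for x, y in zip(a, b):
        if x == y:
            continue  # identical segments add nothing
        total += PARAM_COST if is_param(x) or is_param(y) else LITERAL_COST
    return total


def cluster(paths, threshold=0.5):
    """Greedy pass: join the first cluster holding a member within threshold."""
    clusters = []
    for path in paths:
        segs = split_path(path)
        for members in clusters:
            if any(distance(segs, m) <= threshold for m in members):
                members.append(segs)
                break
        else:
            clusters.append([segs])
    return clusters


def infer_template(members):
    """Keep segments shared by all members; turn varying ones into parameters."""
    out = []
    for i, column in enumerate(zip(*members)):
        if len(set(column)) == 1:
            out.append(column[0])
        else:
            named = next((v for v in column if is_param(v)), None)
            out.append(named or f"{{param{i}}}")
    return "/" + "/".join(out)


if __name__ == "__main__":
    docs_paths = ["/users/{id}/posts", "/users/42/posts",
                  "/users/7/posts", "/orders/{order_id}"]
    for members in cluster(docs_paths):
        print(infer_template(members))
```

With these assumed weights, `/users/42/posts` joins the cluster of `/users/{id}/posts` (distance 0.5), while `/users/42/comments` would not (distance 1.5 from the template, 1.0 from the instantiated path). A single greedy pass depends on input order; the iterative refinement mentioned in claims 8 and 15 is not modeled here.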

Assignees

IBM

Inventors

Classifications

  • for remote control or remote monitoring of applications

  • involving the movement of software or configuration parameters (network booting or remote initial program loading [RIPL] G06F9/4416)

  • G06F8/10 (Primary): Requirements analysis; Specification techniques

  • Document structures and storage, e.g. HTML extensions

  • Software reuse

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F8/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 12, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).