Inter-version mapping of distributed file systems

US11080244B2 · US · B2

Patent metadata
Publication number: US-11080244-B2
Application number: US-201414288506-A
Country: US
Kind code: B2
Filing date: May 28, 2014
Priority date: May 28, 2014
Publication date: Aug 3, 2021
Grant date: Aug 3, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and software described herein to provide data to large-scale processing framework (LSPF) nodes in LSPF clusters. In one example, a method to provide data includes receiving an access request from a LSPF node to access data in accordance with a version of a distributed file system. The method further includes, responsive to the access request, accessing the data for the LSPF node in accordance with a different version of the distributed file system, and presenting the data to the LSPF node in accordance with the version of the distributed file system used by the LSPF node.
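The flow in the abstract (receive a request in the node's version, access the data in the storage version, present the result in the node's version) can be sketched roughly as below. This is only an illustrative sketch, not the patented implementation: the version labels `v1` and `v2`, the `REPOSITORY` layout, the record formats, and every function name are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    node_version: str  # DFS version the requesting LSPF node uses (hypothetical label)
    path: str          # data the node wants to read


# Hypothetical repository: path -> (storage version, raw stored bytes).
REPOSITORY = {"/logs/day1": ("v2", b"alpha,beta")}


def read_v2(raw):
    """Read data stored in the assumed v2 layout (comma-separated records)."""
    return raw.decode().split(",")


def present_v1(records):
    """Present records in the assumed v1 format (newline-separated records)."""
    return "\n".join(records).encode()


# One reader per storage version, one presenter per node version.
READERS = {"v2": read_v2}
PRESENTERS = {"v1": present_v1}


def serve(request):
    """Access data in its storage version, then present it in the node's version."""
    storage_version, raw = REPOSITORY[request.path]
    records = READERS[storage_version](raw)
    return PRESENTERS[request.node_version](records)


response = serve(AccessRequest(node_version="v1", path="/logs/day1"))
```

Keeping readers and presenters in separate lookup tables is one way to let the storage version and the node version vary independently; the abstract does not say how the mapping is organized.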

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising:
one or more non-transitory computer readable storage media;
a processing system operatively coupled with the one or more non-transitory computer readable storage media; and
processing instructions stored on the one or more non-transitory computer readable storage media to implement a data service that, when executed by the processing system, direct the processing system to:
identify, on a first large-scale processing framework (LSPF) node of a first LSPF cluster of a plurality of LSPF clusters, a first data access request generated by a process executing on the first LSPF node and, on a second LSPF node of a second LSPF cluster of the plurality of LSPF clusters, a second data access request generated by a process executing on the second LSPF node, wherein LSPF nodes of the first LSPF cluster comprise virtual computing nodes on one or more of host computing systems to perform parallel processing of a first data set according to a first version of a distributed file system and LSPF nodes of the second LSPF cluster comprise virtual computing nodes on one or more of the host computing systems to perform parallel processing of a second data set according to a second version of the distributed file system;
in response to identification of the first data access request and the second data access request, notify the data service executing on one or more of the host computing systems regarding the first data access request and the second data access request;
receive, by the data service, the first data access request from the first LSPF node of the first LSPF cluster to access first requested data from a data repository, the first data access request comprising a request in accordance with the first version of the distributed file system, and receive the second data access request from the second LSPF node of the second LSPF cluster to access second requested data from the data repository, the second data access request comprising a request in accordance with the second version of the distributed file system, wherein the data service is shared by LSPF nodes of the plurality of LSPF clusters, and wherein data is stored in the data repository using a plurality of versions of the distributed file system, including a third version of the distributed file system, different from the first version and the second version;
responsive to the first data access request and the second data access request, determine that the first requested data and the second requested data are stored using the third version of the distributed file system and access the data in the data repository for the first LSPF node and for the second LSPF node in accordance with the third version of the distributed file system; and
present the first requested data accessed from the data repository to the first LSPF node in accordance with the first version of the distributed file system used by the first LSPF node, and present the second requested data accessed from the data repository to the second LSPF node in accordance with the second version of the distributed file system used by the second LSPF node.

2. The apparatus of claim 1 wherein the distributed file system comprises a Hadoop distributed file system.

3. The apparatus of claim 1 wherein the distributed file system comprises a Gluster file system.

4. The apparatus of claim 1 wherein the first LSPF node is configured with a Hadoop framework or a Spark framework.

5. A method comprising:
identifying, on a first large-scale processing framework (LSPF) node of a first LSPF cluster of a plurality of LSPF clusters, a first data access request generated by a process executing on the first LSPF node and, on a second LSPF node of a second LSPF cluster of the plurality of LSPF clusters, a second data access request generated by a process executing on the second LSPF node, wherein LSPF nodes of the first LSPF cluster comprise virtual computing nodes on one or more of host computing systems to perform parallel processing of a first data set according to a first version of a distributed file system and LSPF nodes of the second LSPF cluster comprise virtual computing nodes on one or more of the host computing systems to perform parallel processing of a second data set according to a second version of the distributed file system;
in response to identification of the first data access request and the second data access request, notifying a data service executing on one or more of the host computing systems regarding the first data access request and the second data access request;
receiving, by the data service, the first data access request from the first LSPF node of the first LSPF cluster to access first requested data from a data repository, the first data access request comprising a request in accordance with the first version of the distributed file system,
receiving, by the data service, the second data access request from the second LSPF node of the second LSPF cluster to access second requested data from the data repository, the second data access request comprising a request in accordance with the second version of the distributed file system, wherein the data service is shared by LSPF nodes of the plurality of LSPF clusters, and wherein data is stored in the data repository using a plurality of versions of the distributed file system, including a third version of the distributed file system that is different from the first version and the second version;
responsive to the first data access request and the second data access request, determining that the first requested data and the second requested data are stored using the third version of the distributed file system and access the data in the data repository for the first LSPF node and the second LSPF node in accordance with the third version of the distributed file system;
presenting, by the data service, the first requested data accessed from the data repository to the first LSPF node in accordance with the first version of the distributed file system used by the first LSPF node; and
presenting, by the data service, the second requested data accessed from the data repository to the second LSPF node in accordance with the second version of the distributed file system used by the second LSPF node.

6. The method of claim 5 wherein the distributed file system comprises a Hadoop distributed file system.

7. The method of claim 5 wherein the distributed file system comprises a Gluster file system.

8. The method of claim 5 wherein the first LSPF node is configured with a Hadoop framework or a Spark framework.

9. A non-transitory computer readable media storing instructions that, when executed by a processor, cause the processor to:
identify, on a first large-scale processing framework (LSPF) node of a first LSPF cluster of a plurality of LSPF clusters, a first data access request generated by a process executing on the first LSPF node and, on a second LSPF node of a second LSPF cluster of the plurality of LSPF clusters, a second data access request generated by a process executing on the second LSPF node, wherein LSPF nodes of the first LSPF cluster comprise virtual computing nodes on one or more of host computing systems to perform parallel processing of a first data set according to a first version of a distributed file system and LSPF nodes of the second LSPF cluster comprise virtual computing nodes on one or more of the host computing systems to perform parallel processing of a second data set according to a second version of the distributed file system;
in response to identification of the first data access request and the second data access request, notify a data service executing on one or more of the host computing systems regar…
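The independent claims describe one data service shared by nodes of several clusters: requests are identified on the nodes, the service is notified, it determines which version the requested data is stored in, and it presents the result in each requester's own version. A rough sketch of that sequence follows, under stated assumptions: the class `SharedDataService`, its method names, the version labels `v1`, `v2`, `v3`, and the record formats are all hypothetical and are not taken from the patent.

```python
from collections import deque


class SharedDataService:
    """Hypothetical data service shared by nodes of several LSPF clusters."""

    def __init__(self, repository, decoders, encoders):
        self.repository = repository  # path -> (storage version, stored payload)
        self.decoders = decoders      # storage version -> function reading a payload
        self.encoders = encoders      # node version -> function presenting records
        self.pending = deque()        # requests the service has been notified about

    def notify(self, node_id, node_version, path):
        """Record a data access request identified on an LSPF node."""
        self.pending.append((node_id, node_version, path))

    def process_pending(self):
        """Serve each queued request in the requesting node's own version."""
        responses = {}
        while self.pending:
            node_id, node_version, path = self.pending.popleft()
            # Determine which version the requested data is stored in.
            storage_version, payload = self.repository[path]
            records = self.decoders[storage_version](payload)
            responses[node_id] = self.encoders[node_version](records)
        return responses


# Both data sets are stored in an assumed third version, "v3".
repository = {
    "/data/a": ("v3", ("r1", "r2")),
    "/data/b": ("v3", ("r3",)),
}
decoders = {"v3": list}
encoders = {
    "v1": lambda records: "\n".join(records),
    "v2": lambda records: {"count": len(records), "records": records},
}

service = SharedDataService(repository, decoders, encoders)
service.notify("cluster1-node1", "v1", "/data/a")  # first cluster uses v1
service.notify("cluster2-node1", "v2", "/data/b")  # second cluster uses v2
responses = service.process_pending()
```

In this sketch neither cluster needs to know that the repository uses a third version; only the shared service holds the decoder for it.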

Assignees

Hewlett Packard Entpr Dev Lp

Inventors

Classifications

  • Hypervisor-specific management and integration aspects · CPC title

  • Selecting among different versions · CPC title

  • via adapters, e.g. between incompatible applications · CPC title

  • I/O management, e.g. providing access to device drivers or storage · CPC title

  • G06F16/182 · Primary

    Distributed file systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11080244B2 cover?
Systems, methods, and software described herein to provide data to large-scale processing framework (LSPF) nodes in LSPF clusters. In one example, a method to provide data includes receiving an access request from a LSPF node to access data in accordance with a version of a distributed file system. The method further includes, responsive to the access request, accessing the data for the LSPF no…
Who is the assignee on this patent?
Hewlett Packard Entpr Dev Lp
What technology area does this patent fall under?
Primary CPC classification G06F9/45558. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 3, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).