Method and apparatus for processing data in process of expanding or reducing capacity of stream computing system

US11416283B2 · US · B2

Patent metadata
Publication number: US-11416283-B2
Application number: US-201916503145-A
Country: US
Kind code: B2
Filing date: Jul 3, 2019
Priority date: Jul 23, 2018
Publication date: Aug 16, 2022
Grant date: Aug 16, 2022

Abstract

Official abstract text for this publication.

A method and apparatus for processing stream data are provided. The method may include: acquiring a to-be-adjusted number of target execution units, the target execution unit referring to a unit executing a target program segment in a stream computing system; adjusting a number of the target execution units in the stream computing system based on the to-be-adjusted number; determining, for a target execution unit in at least one target execution unit after the adjustment, an identifier set corresponding to the target execution unit, an identifier in the identifier set being used to indicate to-be-processed data; and processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set.
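The central idea in the abstract, a fixed pool of identifier sets that is redistributed over a changing number of target execution units, can be illustrated with a minimal sketch. Everything named here is hypothetical: the constant `NUM_ID_SETS`, the modulo rule used to place an identifier in a set, and the round-robin rule used to assign sets to units are assumptions chosen for illustration, not the rule disclosed in the patent.

```python
from typing import Dict, List

NUM_ID_SETS = 12  # hypothetical: total number of identifier sets, fixed across scaling


def id_set_of(identifier: int, num_sets: int = NUM_ID_SETS) -> int:
    """Place a data identifier in an identifier set (assumed rule: modulo).

    The result does not depend on how many execution units exist.
    """
    return identifier % num_sets


def assign_id_sets(num_units: int, num_sets: int = NUM_ID_SETS) -> Dict[int, List[int]]:
    """Assumed preset rule: identifier set s is handled by unit s % num_units."""
    if num_units < 1:
        raise ValueError("at least one target execution unit is required")
    mapping: Dict[int, List[int]] = {unit: [] for unit in range(num_units)}
    for s in range(num_sets):
        mapping[s % num_units].append(s)
    return mapping


# Expanding capacity from 3 to 4 target execution units.
before = assign_id_sets(3)
after = assign_id_sets(4)
print(before)  # {0: [0, 3, 6, 9], 1: [1, 4, 7, 10], 2: [2, 5, 8, 11]}
print(after)   # {0: [0, 4, 8], 1: [1, 5, 9], 2: [2, 6, 10], 3: [3, 7, 11]}
```

In this sketch only the unit-to-set mapping changes when the unit count changes; the identifier-to-set placement stays stable, so data that was already grouped by identifier set does not need to be regrouped.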

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing stream data, comprising: acquiring a to-be-adjusted number of target execution units, the target execution unit referring to a unit executing a target program segment in a stream computing system; adjusting a number of the target execution units in the stream computing system based on the to-be-adjusted number; and determining, for a target execution unit in at least one target execution unit after the adjustment, an identifier set corresponding to the target execution unit, an identifier in the identifier set being used to indicate to-be-processed data; persisting, according to the identifier set to which the identifier of the to-be-processed data generated through running of an upstream execution unit of the target execution unit belongs, the to-be-processed data generated through the upstream execution unit of the target execution unit into a database; and processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set, wherein determining, for the target execution unit in at least one target execution unit after the adjustment, the identifier set corresponding to the target execution unit, comprises: adjusting a mapping relationship between each target execution unit and each identifier set according to a preset rule, wherein a total number of identifier sets remains unchanged before and after the adjustment of the number of the target execution units in the stream computing system.

2. The method according to claim 1, wherein after the processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set, the method further comprises: sending indication information to the upstream execution unit of the target execution unit through the target execution unit, the indication information being used to indicate the to-be-processed data having been processed by the target execution unit.

3. The method according to claim 2, wherein the processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set includes: starting the at least one target execution unit after the adjustment; and receiving and processing, through the started target execution unit, the to-be-processed data that is sent by the upstream execution unit of the target execution unit and has been determined, from the persisted to-be-processed data indicated by the identifier included in the identifier set corresponding to the target execution unit.

4. The method according to claim 1, wherein the processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set includes: de-duplicating, according to a historical record of receiving the to-be-processed data by the target execution unit in the stream computing system, the to-be-processed data sent to the target execution unit by the upstream execution unit of the target execution unit; and processing, through the target execution unit, the de-duplicated to-be-processed data indicated by the identifier in the corresponding identifier set.

5. An apparatus for processing stream data, comprising: at least one processor; and a memory storing instructions, the instructions when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising: acquiring a to-be-adjusted number of target execution units, the target execution unit referring to a unit executing a target program segment in a stream computing system; adjusting a number of the target execution units in the stream computing system based on the to-be-adjusted number; and determining, for a target execution unit in at least one target execution unit after the adjustment, an identifier set corresponding to the target execution unit, an identifier in the identifier set being used to indicate to-be-processed data; persisting, according to the identifier set to which the identifier of the to-be-processed data generated through running of an upstream execution unit of the target execution unit belongs, the to-be-processed data generated through the upstream execution unit of the target execution unit into a database; and processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set, wherein determining, for the target execution unit in at least one target execution unit after the adjustment, the identifier set corresponding to the target execution unit, comprises: adjusting a mapping relationship between each target execution unit and each identifier set according to a preset rule, wherein a total number of identifier sets remains unchanged before and after the adjustment of the number of the target execution units in the stream computing system.

6. The apparatus according to claim 5, wherein after the processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set, the operations further comprise: sending indication information to the upstream execution unit of the target execution unit through the target execution unit, the indication information being used to indicate the to-be-processed data having been processed by the target execution unit.

7. The apparatus according to claim 6, wherein the processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set includes: starting the at least one target execution unit after the adjustment; and receiving and processing, through the started target execution unit, the to-be-processed data that is sent by the upstream execution unit of the target execution unit and has been determined, from the persisted to-be-processed data indicated by the identifier included in the identifier set corresponding to the target execution unit.

8. The apparatus according to claim 5, wherein the processing, through the target execution unit, the to-be-processed data indicated by the identifier in the corresponding identifier set includes: de-duplicating, according to a historical record of receiving the to-be-processed data by the target execution unit in the stream computing system, the to-be-processed data sent to the target execution unit by the upstream execution unit of the target execution unit; and processing, through the target execution unit, the de-duplicated to-be-processed data indicated by the identifier in the corresponding identifier set.

9. A non-transitory computer readable medium, storing a computer program, wherein the computer program, when executed by a processor, causes the processor to perform operations, the operations comprising: acquiring a to-be-adjusted number of target execution units, the target execution unit referring to a unit executing a target program segment in a stream computing system; adjusting a number of the target execution units in the stream computing system based on the to-be-adjusted number; and determining, for a target execution unit in at least one target execution unit after the adjustment, an identifier set corresponding to the target execution unit, an identifier in the identifier set being used to indicate to-be-processed data; persisting, according to the identifier set to which the identifier of the to-be-processed data generated through running of an upstream execution unit of the target execution unit belongs, the to-be-processed data generated through the upstream execution unit of the target execution unit into a database; and processing, through the target execution unit, the to-be-processe
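The persistence, replay, acknowledgement, and de-duplication steps recited in the claims can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the patented implementation: `UpstreamStore` is a hypothetical in-memory stand-in for the database, `TargetUnit` is a hypothetical execution unit, identifiers are placed in sets by an assumed modulo rule, the "historical record" is modelled as a shared set of already-received identifiers, and "processing" is just upper-casing the payload.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

NUM_ID_SETS = 4  # hypothetical: fixed number of identifier sets

Record = Tuple[int, str]  # (identifier, payload)


class UpstreamStore:
    """In-memory stand-in for the database holding upstream output, keyed by identifier set."""

    def __init__(self) -> None:
        self._by_set: Dict[int, Dict[int, str]] = defaultdict(dict)

    def persist(self, identifier: int, payload: str) -> None:
        # Assumed rule: the identifier set is identifier % NUM_ID_SETS.
        self._by_set[identifier % NUM_ID_SETS][identifier] = payload

    def pending(self, id_sets: List[int]) -> List[Record]:
        """Return persisted, not yet acknowledged records for the given identifier sets."""
        out: List[Record] = []
        for s in id_sets:
            out.extend(sorted(self._by_set[s].items()))
        return out

    def ack(self, identifier: int) -> None:
        """Handle the 'indication information': the record has been processed downstream."""
        self._by_set[identifier % NUM_ID_SETS].pop(identifier, None)


class TargetUnit:
    """Hypothetical target execution unit that owns some identifier sets."""

    def __init__(self, id_sets: List[int], seen: Optional[Set[int]] = None) -> None:
        self.id_sets = id_sets
        # Historical record of identifiers already received (assumed shared across units).
        self.seen: Set[int] = set() if seen is None else seen
        self.processed: List[str] = []

    def run(self, store: UpstreamStore) -> None:
        for identifier, payload in store.pending(self.id_sets):
            if identifier not in self.seen:  # de-duplicate against the history
                self.seen.add(identifier)
                self.processed.append(payload.upper())
            store.ack(identifier)  # acknowledge so upstream stops resending


store = UpstreamStore()
for i in range(6):
    store.persist(i, f"msg{i}")

# Assume identifiers 0 and 1 were received before the scale-out but never acknowledged.
history: Set[int] = {0, 1}
unit_a = TargetUnit([0, 2], history)
unit_b = TargetUnit([1, 3], history)
unit_a.run(store)
unit_b.run(store)
print(unit_a.processed)  # ['MSG4', 'MSG2']
print(unit_b.processed)  # ['MSG5', 'MSG3']
```

Because the upstream output is persisted per identifier set, a newly started unit in this sketch only needs to know which sets it owns in order to pick up unacknowledged data, and the history check keeps replayed records from being processed twice.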

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • G06F9/5077 (Primary)

    Logical partitioning of resources; Management or configuration of virtualized resources (specific details on emulation or internal functioning of virtual machines G06F9/455)

  • involving task migration

  • Buffers; Shared memory; Pipes

  • G06F9/4881 (Primary)

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

  • Bootstrapping (security arrangements therefor G06F21/57)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11416283B2 cover?
A method and apparatus for processing stream data are provided. The method may include: acquiring a to-be-adjusted number of target execution units, the target execution unit referring to a unit executing a target program segment in a stream computing system; adjusting a number of the target execution units in the stream computing system based on the to-be-adjusted number; determining, for a ta…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/5077. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 16, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).