This tutorial walks through an example that showcases the options available in a swarm definition. For those less familiar, a Swarm is a directed acyclic graph (DAG); it is defined in a YAML file that follows a specific syntax.
The names correspond to the various resources that you are creating. They need to follow Kubernetes naming rules.
Bee types and specs
Each Bee is required to have a `type`. Types provide the mechanism to re-use Bees across Swarms. Bees can be configured using the `spec` block. The fields in the `spec` block are specific to the type of Bee.
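As a minimal sketch, a Bee entry with a `type` and a `spec` block might look like the following. The overall layout and the Bee name are illustrative assumptions, not taken from a real Swarm definition:

```yaml
# Hypothetical sketch of a single Bee entry in a Swarm YAML file.
# The Bee name ("classifier") and field values are made up for illustration.
classifier:
  type: python   # the Bee type; types let Bees be re-used across Swarms
  spec:          # the fields allowed here depend on the Bee type
    image: registry.example.com/classifier:1.0
    file: classifier.py
    bee: classify
```

Note that the Bee name, like other resource names, must follow Kubernetes naming rules.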
Python bee configuration
This example shows the `python` bee type. It requires the following fields:

- `image` - the Docker image to use.
- `file` - the Python file that will run in the Bee. This file is loaded at runtime, and each function decorated with the `@register_bee` decorator will be imported.
- `bee` - the Bee function that will be called on each swarm event. The name corresponds to the `name` argument passed to the `@register_bee` decorator.
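Putting those fields together, a hedged sketch of a `python` Bee spec could look like this. The image name, file name, and function name are hypothetical examples:

```yaml
type: python
spec:
  image: registry.example.com/my-bees:1.0  # hypothetical Docker image to run
  file: my_bees.py                         # Python file loaded at runtime
  bee: handle_event                        # matches the name passed to @register_bee
```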
This clause defines how the Bee will be triggered and where it will receive data from.

The `gateway` input type means that this Bee will consume requests or submits passed to a Bytewax gateway subject. The `name` corresponds to the subject, and this is the argument that would be passed to the gateway SDK methods (for example, `gateway.submit`) in an application.
The `bee` input type designates that the data will come from another Bee in the same Swarm. The Bee listed as the `name` must use the `swarm.publish` SDK method in order for the data to be received.
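The two input types can be sketched side by side as follows. The Bee names, the subject name, and the exact nesting of the `input` clause are illustrative assumptions:

```yaml
# Hypothetical input clauses for two Bees in the same Swarm.
ingest:
  input:
    type: gateway   # triggered by requests/submits to a gateway subject
    name: orders    # the subject passed to e.g. gateway.submit in an application
score:
  input:
    type: bee       # data comes from another Bee in this Swarm
    name: ingest    # that Bee must call swarm.publish for data to arrive here
```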
The recommended way to configure bees at runtime is by using environment variables. These can be literal environment variable values or they can leverage Bytewax secrets. More on that below.
This line will populate a Bee's environment variable `LITERAL_ENV` with the given literal value.
To pass secret values like a database password or an API token, Bytewax provides secret management. You can manage your secrets in the Bytewax dashboard, via the `waxctl` CLI tool, or via the REST API. Once you create a secret, it can be passed through as an environment variable, as seen in this example.
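A sketch of both styles together might look like the following. The list-of-entries layout and the `secret` key are assumptions modeled on common Kubernetes-style configuration, and the variable and secret names are made up:

```yaml
environment:
  - name: LITERAL_ENV
    value: some-literal-value   # plain literal environment variable
  - name: DB_PASSWORD
    secret: my-db-password      # hypothetical reference to a Bytewax secret
```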
Bytewax provides two ways to scale Bees: replicas and sizes. Replicas let you scale Bees horizontally (increasing the number of Bee processes running), whereas size lets you scale vertically (increasing the CPU and memory available to each Bee). The best practice is to scale horizontally wherever possible, keeping the size just large enough to fit the runtime objects (model, batch, etc.).
Bee replicas let you define how many Bee processes are spawned. Because replicas are spread across multiple servers, they also provide resiliency against the physical crash of a single underlying server. It is recommended to have at least 2 replicas at all times for redundancy; the default is 3.
Bee size provides the means of vertical scaling: how much memory and CPU time is reserved for each Bee replica. This should be set to a safe minimum for the process being run; it is generally better to have many smaller Bees than a few large ones. A fractional CPU core means that one physical CPU core can be shared between multiple processes.
The current available sizes are shown below. If you need something else for your particular swarm, please contact us.
- 500MB memory, 1/8 CPU core
- 1GB memory, 1/4 CPU core
- 4GB memory, 1 CPU core
- 8GB memory, 2 CPU cores
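Combining the two scaling controls, a Bee's scaling configuration might be sketched as follows. The field names and the size label are illustrative assumptions; this document does not give the actual size names:

```yaml
replicas: 3      # the default; at least 2 recommended for redundancy
size: small      # hypothetical size label, e.g. one of the memory/CPU tiers above
```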