Mantis Job Capacity Planning & Sizing


Mantis Jobs consist of one or more independent stages. Since stages are independent of one another, they are also sized independently.

Stages have one or more workers, which are the fundamental unit of parallelism for a stage. Workers execute individual instances of a stage and are each allocated CPU, memory, disk, and network resources. Workers are also isolated from one another, which means they process events independently of each other using their own dedicated resources.

You can horizontally scale your Mantis Jobs by modifying the number of workers of your stages. You can also vertically scale your Mantis Jobs by tuning the number of resources for each worker.

The following sections are guidelines for sizing different types of Mantis Jobs.


You can modify worker resources at any time, even after you have launched the Mantis Job. This is useful when you want to adapt your Mantis Job to new traffic patterns. To enable this, mark your stage with the Stage is Scalable option.

You can also mark the stage with the AutoScale this stage option to have Mantis scale the stage automatically.

You can horizontally scale your Mantis Jobs without restarting the entire job. However, vertically scaling your jobs requires a new job submission for changes to take effect.

Mantis Jobs

In addition to scaling the number of workers of a Mantis Job, you should consider tuning your workers, which have two tunable properties: processing resources and processing parallelism.

For processing resources, you should generally determine if CPU, memory, or network resources will be a bottleneck in your processing. You should increase:

  • the number of CPUs if your job has CPU-intensive transformations such as serialization and deserialization
  • memory if your job plans to hold many objects or large objects in memory
  • network resources if your job needs high throughput for network I/O such as external API calls

For processing parallelism, you must choose between serial or concurrent input processing for each stage.

Serial input processing means your Processing Stage will receive and process events within a single thread. With serial input, you lose parallelism within each worker, but your processing logic stays straightforward and free of race conditions.

// Specifying serial input for a stage.
public class SerialStage implements ScalarComputation<T1, T2> {
    public static ScalarToScalar.Config<T1, T2> config() {
        return new ScalarToScalar.Config<T1, T2>().serialInput();
    }
}
Concurrent input means your Processing Stage will have multiple threads which each receive and process events. With concurrent input enabled, race conditions must be considered because the stage’s threads operate independently from one another and may process events at different speeds.

// Specifying concurrent input for a stage.
public class ConcurrentStage implements ScalarComputation<T1, T2> {
    public static ScalarToScalar.Config<T1, T2> config() {
        return new ScalarToScalar.Config<T1, T2>().concurrentInput();
    }
}

Workers use serial input by default.
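The race conditions mentioned above matter whenever a concurrent-input stage keeps mutable state. As a minimal sketch in plain Java (illustrating the general principle, not a Mantis-specific API), shared state such as a counter should use an atomic type rather than an unguarded field:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ConcurrentCount {
    // Simulates several processing threads updating shared state.
    // A plain long with ++ here would lose updates; AtomicLong does not.
    public static long countConcurrently(int threads, int eventsPerThread)
            throws InterruptedException {
        AtomicLong processed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    processed.incrementAndGet(); // safe under concurrent input
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countConcurrently(4, 10_000)); // prints 40000
    }
}
```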

Kafka Source Jobs

Kafka Source Jobs are Mantis Jobs that share the same properties described above, except that they use concurrent input processing by default. There are two additional job parameters to consider when tuning Kafka Source Jobs: numConsumerInstances and stageConcurrency.

The numConsumerInstances property determines how many Kafka consumers will be created for each worker. For example, if you have numConsumerInstances set to 2 and have 5 workers, then you will have 10 Kafka consumers in total for your Mantis Job consuming from a Kafka topic.

The stageConcurrency property determines the size of the pool of threads that receive events from the Kafka consumers. You can control your processing parallelism with this property. For example, if you have numConsumerInstances set to 2 and stageConcurrency set to 5 on a worker, then two Kafka consumers will read events from a Kafka topic and asynchronously send them to a pool of 5 processing threads.
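This consumer-to-thread-pool handoff can be sketched in plain Java. The sketch below is illustrative only (it is not how Mantis is implemented internally): two "consumer" threads stand in for numConsumerInstances=2 and a five-thread pool stands in for stageConcurrency=5.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConsumerPoolSketch {
    // Returns the number of events handled by the processing pool.
    public static int drain(int consumers, int concurrency, int eventsPerConsumer)
            throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        // The "consumers" stand in for Kafka consumers reading a topic
        // and enqueuing events for downstream processing.
        ExecutorService consumerPool = Executors.newFixedThreadPool(consumers);
        for (int c = 0; c < consumers; c++) {
            consumerPool.submit(() -> {
                for (int i = 0; i < eventsPerConsumer; i++) queue.add(i);
            });
        }
        consumerPool.shutdown();
        consumerPool.awaitTermination(10, TimeUnit.SECONDS);

        // The processing pool plays the role of the stageConcurrency threads.
        AtomicInteger processed = new AtomicInteger();
        ExecutorService processingPool = Executors.newFixedThreadPool(concurrency);
        for (int t = 0; t < concurrency; t++) {
            processingPool.submit(() -> {
                while (queue.poll() != null) processed.incrementAndGet();
            });
        }
        processingPool.shutdown();
        processingPool.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // numConsumerInstances=2, stageConcurrency=5, 100 events per consumer.
        System.out.println(drain(2, 5, 100)); // prints 200
    }
}
```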

There are three considerations for using numConsumerInstances and stageConcurrency to tune your Kafka Source Job.

First, you can pin numConsumerInstances to 1 and add more workers to load balance Kafka consumers across worker instances.

Second, if you find that your workers are under-utilized, you can increase numConsumerInstances for each worker. You can also give each worker more CPU, memory, and network resources and increase numConsumerInstances further so that fewer workers do more work.

Lastly, if you find that your Kafka topic’s lag is increasing, then processing might be a bottleneck. In this case, you can increase stageConcurrency to increase processing throughput.


Adding more workers to scale your Kafka Source Jobs to increase throughput or address lag has a hard upper limit. Ensure that you do not have more Kafka consumers than the number of partitions of the Kafka topic that your job is reading from. Specifically, ensure that:

(numConsumerInstances × numberOfWorkers) ≤ numberOfPartitions

This is because the number of partitions is the upper bound on a Kafka topic's parallelism. If you exceed this number, you will have idle consumers wasting resources, and adding more workers to your Kafka Source Job will have no positive effect.
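The bound above can be checked mechanically before scaling out. A small hypothetical helper (the method and its names are ours for illustration, not part of the Mantis API):

```java
public class KafkaSizing {
    // True when every consumer can own at least one partition, i.e.
    // numConsumerInstances * numberOfWorkers <= numberOfPartitions.
    public static boolean withinPartitionBound(int numConsumerInstances,
                                               int numberOfWorkers,
                                               int numberOfPartitions) {
        return numConsumerInstances * numberOfWorkers <= numberOfPartitions;
    }

    public static void main(String[] args) {
        // 2 consumers per worker x 5 workers = 10 consumers vs. a 12-partition topic.
        System.out.println(withinPartitionBound(2, 5, 12)); // prints true
        // The same 10 consumers vs. an 8-partition topic: 2 consumers would sit idle.
        System.out.println(withinPartitionBound(2, 5, 8));  // prints false
    }
}
```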