Concurrency & Ingress Configuration

This chapter introduces how to configure concurrency, scaling, and ingress behavior in FrameX using Ray.
By tuning these parameters, you can control instance count, request concurrency, and overall throughput.

1) Overview

FrameX supports additional Ray Serve ingress configurations to improve:

Maximum concurrent requests
Request queueing behavior
Instance-level load control

These configurations are applied through ingress_config and can be defined at different levels with clear inheritance rules.

This chapter uses max_ongoing_requests as an example.

2) Base Ingress Configuration

FrameX provides a global base configuration:

base_ingress_config = {"max_ongoing_requests" = 10}

Behavior

Acts as the default ingress configuration
Automatically inherited by:
- Server
- All plugins
Can be overridden at lower levels

3) Server-Level Ingress Configuration

You can override the base configuration at the server level:

[server]
ingress_config = {"max_ongoing_requests" = 60}

Behavior

Applies to server-level ingress
Overrides base_ingress_config

4) Plugin-Level Ingress Configuration

Plugins can define their own ingress configuration.

Example for the proxy plugin:

[plugins.proxy]
ingress_config = {"max_ongoing_requests" = 60}

Behavior

Takes precedence over both:
- server.ingress_config
- base_ingress_config
Only applies to the specific plugin

5) Plugin Development: Custom Ingress Configuration

If a plugin requires a custom max_ongoing_requests or other Ray Serve parameters, follow these steps.

1. Add ingress_config to Plugin Config

from pydantic import BaseModel
from typing import Any


class ExamplePluginConfig(BaseModel):
    ingress_config: dict[str, Any] = {"max_ongoing_requests": 60}

2. Apply ingress_config During Registration

Use the on_register decorator to inject ingress parameters:

@on_register(**settings.ingress_config)
class ExamplePlugin(BasePlugin):
    def __init__(self, **kwargs: Any) -> None: ...

Behavior

settings.ingress_config is passed directly to Ray Serve
Allows fine-grained control per plugin
Fully compatible with Ray Serve autoscaling and concurrency features

6) Configuration Inheritance Rules

Ingress configuration follows a top-down inheritance model:

base_ingress_config
plugins.<plugin_name>.ingress_config

If a plugin does not define ingress_config, it automatically inherits from the nearest parent level.

Complete Example: config.toml

base_ingress_config = {"max_ongoing_requests" = 10}

[server]
ingress_config = {"max_ongoing_requests" = 60}

[plugins.proxy]
ingress_config = {"max_ongoing_requests" = 60}

7) Supported Ray Serve Parameters

In addition to max_ongoing_requests, Ray Serve supports many advanced parameters such as:

Autoscaling behavior
Replica scaling limits
Request queue management

For the complete and up-to-date list, refer to the official Ray Serve documentation:

https://docs.rayai.org.cn/en/latest/serve/advanced-guides/advanced-autoscaling.html

FrameX: The Plug-In Paradigm for Algorithmic Development