Secure Load Balancing
Secure load balancing allows you to deploy multiple agent replicas behind a sentinel and serve them securely with channel encryption and sticky routing. This ensures that encrypted sessions are maintained even when scaling horizontally.
Secure load balancing requires the BSL-licensed naylence-advanced-security package
and additional configuration beyond the standard SDK presets.
Overview
When you scale agents horizontally (multiple replicas serving the same address), you need to ensure that:
- Channel encryption works: Multi-message key exchange reaches the same replica
- Session state is maintained: Stateful conversations stay on one replica
- Security is preserved: No compromise when load balancing
The AFTLoadBalancerStickinessManager solves this by pinning encrypted channels to specific replicas.
Why Stickiness Matters for Channel Encryption
Channel encryption is a multi-message interaction:
1. Client → Sentinel → Replica: "I want to set up an encrypted channel"
2. Replica → Sentinel → Client: "Here are the channel keys"
3. Client → Sentinel → Replica: "Here's my encrypted data"
4. (Subsequent messages use the same channel)
The problem: If message 3 goes to a different replica, that replica won't have the channel keys from step 2.
The solution: Stickiness ensures all messages in a channel flow go to the same replica.
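The failure mode can be illustrated with a toy model. The sketch below is not Naylence code: `Replica`, `setup_channel`, and `handle` are hypothetical names, and a random token stands in for real channel keys.

```python
import secrets


class Replica:
    """Toy replica that remembers the keys of channels it set up itself."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.channel_keys: dict[str, bytes] = {}

    def setup_channel(self, channel_id: str) -> bytes:
        # Steps 1-2: the replica creates channel keys and keeps them locally.
        key = secrets.token_bytes(16)
        self.channel_keys[channel_id] = key
        return key

    def handle(self, channel_id: str) -> str:
        # Step 3: only the replica that holds the keys can process the data.
        if channel_id not in self.channel_keys:
            raise KeyError(f"{self.name} has no keys for channel {channel_id}")
        return f"{self.name} processed message on {channel_id}"


replica1, replica2 = Replica("replica1"), Replica("replica2")
replica1.setup_channel("ch-1")

# Sticky routing: the follow-up message reaches the replica that holds the keys.
print(replica1.handle("ch-1"))

# Non-sticky routing: the follow-up message lands on the other replica and fails.
try:
    replica2.handle("ch-1")
except KeyError as err:
    print("routing error:", err)
```

The second call fails only because the keys live in one replica's memory, which is the situation stickiness is meant to avoid.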
Architecture
┌──────────────────────────────────────────────────────────────────┐
│                    Sentinel (strict-overlay)                     │
│                 AFTLoadBalancerStickinessManager                 │
│                                                                  │
│   ┌─────────────────────────────────────────────────────────┐    │
│   │ Stickiness Logic:                                       │    │
│   │ 1. Channel setup → picks a replica                      │    │
│   │ 2. All subsequent channel messages → same replica       │    │
│   │ 3. If replica fails → re-establish on new replica       │    │
│   └─────────────────────────────────────────────────────────┘    │
│                                                                  │
└──────────┬───────────────────────────────────────┬───────────────┘
           │                                       │
           │ Sticky Channel                        │ Sticky Channel
           ▼                                       ▼
   ┌────────────────┐                      ┌────────────────┐
   │   Math Agent   │                      │   Math Agent   │
   │   Replica 1    │                      │   Replica 2    │
   │ *.fame.fabric  │                      │ *.fame.fabric  │
   └────────────────┘                      └────────────────┘

Configuration
Secure load balancing requires two key configurations:
1. Agent: Wildcard Logical Domain
Agents must request a wildcard logical domain to enable replica fan-out:
# config/agent-config.yml
node:
  type: Node
  requested_logicals:
    - "*.fame.fabric" # Wildcard enables load balancing!
  security:
    type: SecurityProfile
    profile: "${env:FAME_SECURITY_PROFILE}"
  admission:
    type: AdmissionProfile
    profile: "${env:FAME_ADMISSION_PROFILE}"

The wildcard *.fame.fabric tells the fabric that multiple nodes may advertise
services under this domain. Each replica can then serve math@fame.fabric.
2. Sentinel: Stickiness Manager
The sentinel must be configured with the AFTLoadBalancerStickinessManager:
# config/sentinel-config.yml
node:
  type: Sentinel
  id: "${env:FAME_NODE_ID:}"
  public_url: "${env:FAME_PUBLIC_URL:}"
  listeners:
    - type: WebSocketListener
      port: "${env:FAME_SENTINEL_PORT:8000}"
  requested_logicals:
    - fame.fabric
  security:
    type: SecurityProfile
    profile: "${env:FAME_SECURITY_PROFILE:open}"
  admission:
    type: AdmissionProfile
    profile: "${env:FAME_ADMISSION_PROFILE:none}"
  storage:
    type: StorageProfile
    profile: "${env:STORAGE_PROFILE:memory}"
  # Enable advanced stickiness
  stickiness:
    type: AFTLoadBalancerStickinessManager
    security_level: strict

Key configuration:
- stickiness.type: AFTLoadBalancerStickinessManager (BSL feature)
- stickiness.security_level: strict (enforces identity-aware stickiness)
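The configuration files use `${env:NAME}` and `${env:NAME:default}` placeholders. As a rough illustration of how such placeholders could resolve, here is a minimal sketch. The resolution rules (environment value first, then the default, error if neither exists) are an assumption made for this example, and `resolve` is a hypothetical helper, not part of the SDK.

```python
import os
import re

# Matches ${env:NAME} and ${env:NAME:default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def resolve(value: str, environ=None) -> str:
    """Substitute ${env:...} placeholders (assumed semantics, not SDK code)."""
    env = os.environ if environ is None else environ

    def substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        # Assumption: a placeholder without a default is an error when unset.
        raise KeyError(f"environment variable {name} is not set")

    return _PLACEHOLDER.sub(substitute, value)


# Unset variable: falls back to the default written in the config.
print(resolve("${env:FAME_SENTINEL_PORT:8000}", {}))
# Set variable: the environment value wins over the default.
print(resolve("${env:FAME_SECURITY_PROFILE:open}",
              {"FAME_SECURITY_PROFILE": "strict-overlay"}))
```

Under these assumed rules, leaving `FAME_SECURITY_PROFILE` unset on the sentinel would silently fall back to `open`, so it is worth setting the variables explicitly.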
Environment Variables
# Security profile (must be strict-overlay for channel encryption)
FAME_SECURITY_PROFILE=strict-overlay
# Encryption level (channel requires stickiness)
FAME_DEFAULT_ENCRYPTION_LEVEL=channel
# Admission profile (welcome for advanced security)
FAME_ADMISSION_PROFILE=welcome

How It Works
Initial Request Flow
- Client calls math@fame.fabric with channel encryption enabled
- Sentinel sees multiple replicas available (Replica 1 and Replica 2)
- Stickiness manager picks one (e.g., Replica 1) and records the binding
- Channel setup completes with Replica 1
- All subsequent messages for this channel go to Replica 1
Failover
If the pinned replica becomes unavailable:
- Sentinel detects the replica is down
- Next request is routed to a healthy replica (e.g., Replica 2)
- New channel is established with Replica 2
- Stickiness is updated to pin to Replica 2
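The request and failover flows can be sketched as a small binding table. This is a toy model under stated assumptions, not the AFTLoadBalancerStickinessManager implementation: `StickyRouter`, `route`, and `mark_down` are hypothetical names, and round-robin selection is assumed because the text does not say how a replica is picked.

```python
class StickyRouter:
    """Toy stickiness table: channel id -> replica name (not the real manager)."""

    def __init__(self, replicas: list[str]) -> None:
        self.healthy = list(replicas)
        self.bindings: dict[str, str] = {}
        self._next = 0

    def route(self, channel_id: str) -> str:
        pinned = self.bindings.get(channel_id)
        if pinned in self.healthy:
            return pinned  # existing binding is still valid
        if not self.healthy:
            raise RuntimeError("no healthy replicas")
        # New channel, or the pinned replica is gone: pick one round-robin
        # (an assumption) and record the binding. After a failover the
        # channel has to be set up again on the new replica.
        choice = self.healthy[self._next % len(self.healthy)]
        self._next += 1
        self.bindings[channel_id] = choice
        return choice

    def mark_down(self, replica: str) -> None:
        if replica in self.healthy:
            self.healthy.remove(replica)


router = StickyRouter(["replica1", "replica2"])
first = router.route("ch-1")
assert router.route("ch-1") == first  # subsequent messages stay pinned

router.mark_down(first)
second = router.route("ch-1")  # failover: rebinds to the surviving replica
print(first, "->", second)
```

The binding only changes when the pinned replica is no longer healthy, which mirrors the failover steps listed above.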
Verifying Stickiness
You can verify that stickiness is working by watching replica logs:
Terminal 1: Watch Replica 1
docker compose logs -f math-agent-replica1
Terminal 2: Watch Replica 2
docker compose logs -f math-agent-replica2
Terminal 3: Run Client
make run

Expected behavior: Only one replica should show activity for all client requests. Re-running the client may establish a new channel on either replica, but all messages within a session go to the same one.
Testing Failover
# Stop replica 1
docker compose stop math-agent-replica1
# Run client again — should work, now using replica 2
make run
# Restart replica 1
docker compose start math-agent-replica1

Custom Configuration Files
Secure load balancing requires custom YAML configuration files (not the standard SDK presets). These must be mounted into your containers:
Docker Compose Example
services:
  sentinel:
    image: naylence/agent-sdk-adv-python
    volumes:
      - ./config/sentinel-config.yml:/etc/fame/fame-config.yml
    environment:
      - FAME_SECURITY_PROFILE=strict-overlay
      - FAME_ADMISSION_PROFILE=welcome
  math-agent-replica1:
    image: naylence/agent-sdk-adv-python
    volumes:
      - ./config/agent-config.yml:/etc/fame/fame-config.yml
    environment:
      - FAME_SECURITY_PROFILE=strict-overlay
      - FAME_ADMISSION_PROFILE=welcome
  math-agent-replica2:
    image: naylence/agent-sdk-adv-python
    volumes:
      - ./config/agent-config.yml:/etc/fame/fame-config.yml
    environment:
      - FAME_SECURITY_PROFILE=strict-overlay
      - FAME_ADMISSION_PROFILE=welcome

Sealed vs Channel Encryption with Stickiness
| Encryption Mode | Stickiness Required? | Why |
|---|---|---|
| sealed | No | Each message is independently encrypted |
| channel | Yes | Multi-message key setup requires same replica |
| plaintext | No | No encryption, stickiness optional for state |
If you don’t need stickiness, use sealed encryption instead of channel.
Sealed encryption works with any load balancing strategy.
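The table can be expressed as a small lookup, which may be handy in a deployment check. This is an illustrative sketch: `needs_stickiness` is a hypothetical helper, and the mode names are the ones used in the table.

```python
# Stickiness requirement per encryption mode, taken from the table above.
STICKINESS_REQUIRED = {"sealed": False, "channel": True, "plaintext": False}


def needs_stickiness(encryption_level: str, replicas: int) -> bool:
    """Return True when a deployment needs sticky routing for encryption."""
    if encryption_level not in STICKINESS_REQUIRED:
        raise ValueError(f"unknown encryption level: {encryption_level!r}")
    # A single replica cannot misroute, so stickiness only matters beyond one.
    return replicas > 1 and STICKINESS_REQUIRED[encryption_level]


print(needs_stickiness("channel", replicas=2))
print(needs_stickiness("sealed", replicas=2))
```

The check covers encryption only; stateful conversations may still want stickiness regardless of the encryption mode.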
When to Use Secure Load Balancing
Good fit:
- High-throughput agents that need horizontal scaling
- Stateful conversations that must stay on one replica
- Channel encryption with multiple replicas
- Enterprise deployments with zero-trust requirements
Consider alternatives when:
- Single replica is sufficient
- Stateless agents (sealed encryption works without stickiness)
- Development/testing environments
Running the Example
Complete runnable examples are available in both repositories:
# TypeScript
cd naylence-examples-ts/examples/security/stickiness
make start # Start sentinel + 2 replicas
make run # Run the client
# Python
cd naylence-examples-python/examples/security/stickiness
make start # Start sentinel + 2 replicas
make run # Run the client

What the Example Includes
- Caddy: TLS reverse proxy
- OAuth2 Server: Development token issuer
- Welcome Service: Admission with placement
- CA Service: SPIFFE/X.509 certificates
- Sentinel: With AFTLoadBalancerStickinessManager
- Math Agent Replica 1: First replica with *.fame.fabric
- Math Agent Replica 2: Second replica with *.fame.fabric
- Client: Channel encryption with sticky routing
Troubleshooting
Requests Bounce Between Replicas
- Verify agent config has requested_logicals: ["*.fame.fabric"]
- Verify sentinel config has stickiness: { type: AFTLoadBalancerStickinessManager }
- Ensure custom YAML is mounted at /etc/fame/fame-config.yml
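The first two checks can be automated once the YAML files are parsed into dictionaries (for example with a YAML library of your choice). The sketch below is hypothetical helper code, not an SDK tool, and it assumes that both `requested_logicals` and `stickiness` sit under the top-level `node` key.

```python
def check_stickiness_config(agent_cfg: dict, sentinel_cfg: dict) -> list[str]:
    """Return a list of problems found in already-parsed YAML configs."""
    problems = []

    # Assumption: settings are nested under the top-level "node" key.
    logicals = agent_cfg.get("node", {}).get("requested_logicals") or []
    if not any(str(item).startswith("*.") for item in logicals):
        problems.append("agent: requested_logicals has no wildcard entry")

    stickiness = sentinel_cfg.get("node", {}).get("stickiness") or {}
    if stickiness.get("type") != "AFTLoadBalancerStickinessManager":
        problems.append(
            "sentinel: stickiness.type is not AFTLoadBalancerStickinessManager"
        )
    return problems


agent = {"node": {"type": "Node", "requested_logicals": ["*.fame.fabric"]}}
sentinel = {"node": {"type": "Sentinel"}}  # stickiness section missing

for problem in check_stickiness_config(agent, sentinel):
    print(problem)
```

An empty result means both settings are present; it does not prove that the file is actually mounted into the container.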
No Encryption Visible
- Verify FAME_DEFAULT_ENCRYPTION_LEVEL=channel
- Use the advanced Docker image (naylence/agent-sdk-adv-*)
- Run make run-verbose to inspect envelopes
Agents Fail to Attach
- Check startup order (sentinel must be healthy before agents)
- Verify Welcome/CA service endpoints
- Check SSL certificate trust chain
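For the startup-order check, one option is to wait until the sentinel's listener accepts TCP connections before starting an agent. The helper below is a generic sketch using only the Python standard library. `wait_for_port` is a hypothetical name, and an open port shows only that the listener is up, not that the sentinel is fully healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not reachable yet; retry until the deadline
    return False
```

An agent entrypoint could call `wait_for_port("sentinel", 8000)` before attaching. The host name here assumes the Compose service name, and 8000 is the default listener port from the sentinel configuration.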
Next Steps
- Back to Advanced Security → Advanced Security
- Back to overview → Security