
Frequently Asked Questions

Frequently asked questions about Naylence.

Getting started and mental model

What problem does Naylence solve in one sentence?

Short answer: Naylence lets you build multi-agent systems where agents can move, scale, and run across environments without changing client code, while the fabric handles the messaging infrastructure (routing, transports, reliability, and security hooks).


When should I not use Naylence?

Short answer: Don’t use Naylence if you don’t need a fabric.

Common cases where it’s probably overkill:

  • You have a single agent in a single process and a plain HTTP API is enough.
  • You don’t need agents to call each other, stream, or run long-lived sessions.
  • Your system already has a mature messaging layer (and you’re not looking for A2A-compatible agent semantics + topology/routing on top).

Is Naylence a framework, a protocol, or a runtime?

Short answer: It’s primarily a runtime (fabric) plus SDKs.

  • Protocol/semantics: the agent API is A2A-based (tasks/messages/artifacts).
  • Runtime/fabric: Naylence runs the network of nodes that routes envelopes over transports (WebSocket/HTTP, etc.).
  • Framework: it gives you conventions and helpers, but it doesn’t force an LLM/tool framework—you bring your own.

What’s a “fabric” and what’s a “biome”?

Fabric: the whole messaging network formed by nodes (including Sentinels) that can route envelopes between agents.

Biome: a local trust domain anchored by a Sentinel; it comprises everything that attaches “under” that Sentinel.

Mental model (metaphor):

  • A biome is to a fabric what a home network is to the internet.
  • A Sentinel is to a fabric what a home router/modem is to the internet.

In other words: a biome is your local network segment; the fabric is the wider network that biomes connect into.

Important nuance: biomes can be hierarchical (nested). A biome can contain sub-biomes (for example, a top-level Sentinel acting as a backbone, with downstream Sentinels anchoring smaller trust domains).


Is a node the same thing as a process/container?

Short answer: No. A node is a logical endpoint, not a deployment unit.

  • One process can host multiple nodes (common in demos, browser setups, tests).
  • In production, you’ll often run one node per process/container for isolation and scaling—but that’s an operational choice, not a requirement.

Architecture and topology

Why use a fabric at all? Why not connect agents directly?

Short answer: Direct connections work for small demos, but a fabric gives you stable addressing, routing, and policy so your system can grow without rewriting every integration.

With direct agent-to-agent links, every agent (and client) needs to know:

  • where every other agent lives (URLs, credentials, network rules),
  • how to reconnect/fail over,
  • how to handle scaling and multiple replicas,
  • and how to enforce consistent security and tenancy boundaries.

With a fabric, you get:

  • logical addresses (client code doesn’t change when agents move or scale),
  • routing and load balancing (including optional stickiness),
  • admission and trust domains (who can join, and what they can reach),
  • and a single place to evolve transports and reliability behaviors.

Rule of thumb: If you have more than a couple agents, need mobility/scale, or care about consistent security boundaries, use the fabric.


What is the difference between a Sentinel and a regular Node?

Short answer: A Sentinel is just a Node configured to accept inbound connections (via listeners) and route between attached nodes. A “regular” Node usually connects outbound and hosts agents/clients.

Key difference:

  • Sentinel: has one or more listeners (WebSocket/HTTP/etc.) so other nodes can attach to it.

  • Node: typically has no listeners; it attaches to a Sentinel upstream.

Mental model: A Sentinel is the router / rendezvous point of a fabric.


Do I really need Sentinels? Can’t I just run agents directly?

Short answer: You don’t need a separate Sentinel at first. The simplest setup is Agent-on-Sentinel (agent and sentinel in the same process), and it’s a perfectly valid way to start.

Why there’s a split at all: An agent is your business logic. A node/sentinel is the messaging layer (connections, routing, security).

When to separate them: Split into a dedicated Sentinel + agent nodes when you need independent scaling, fault isolation, multiple agents behind one entry point, load balancing, or hierarchical/relay topologies.


How do I choose a topology (single process vs 3-node vs multi-sentinel)?

Rule of thumb: start simple, split when you need isolation or scale.

  • Single process (Agent-on-Sentinel): best for local dev, demos, unit tests, and “get it working fast.”
  • 3-node (Client + Sentinel + Agent node): the smallest deployable setup. Good when you want real networking boundaries and independent restart/scale of the agent.
  • Multi-sentinel: use when you need federation (multiple trust domains/biomes), edge/region placement, blast-radius isolation, routing across networks, or organizational boundaries.

Can I run Naylence fully in the browser? What are the limits?

Short answer: Yes, you can run a fabric in the browser, but it’s best for demos and lightweight clients.

Typical limits:

  • Browsers can suspend tabs (timers/network/compute), which hurts long-lived agents and strict timing.
  • You usually can’t host a true “server-style” inbound endpoint; the browser is mainly an outbound participant.
  • Secrets and persistence are constrained compared to a backend.
  • Throughput/CPU/memory are limited; treat the browser as a client node (or a small multi-node demo), not a production router.

How do nodes find each other? Do I need service discovery?

Short answer: In small fabrics, a node only needs the rendezvous point (the URL of a parent Sentinel) and can attach directly. In larger or security-hardened fabrics, nodes bootstrap via an admission (welcome) service.

Two common patterns:

  • Direct attach (simple setups):
    The node is configured with the parent Sentinel address and connects to it directly.

  • Admission-based attach (recommended at scale / advanced security):
    The node first connects to an admission/welcome service, which:

    • decides the node’s placement in the fabric (which Sentinel / where it should attach),
    • issues an admission token,
    • and returns the connection info the node should use.

Rule of thumb: If you’re using the advanced security model or expect the fabric to grow, prefer the admission service pattern.
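To make the admission-based flow concrete, here is a rough sketch with hypothetical names and shapes (the real welcome-service API and fields differ):

```python
from dataclasses import dataclass

# Hypothetical shapes for illustration only; the real Naylence
# admission API and field names differ.

@dataclass
class AdmissionGrant:
    sentinel_url: str     # placement: which Sentinel this node should attach to
    admission_token: str  # short-lived credential presented when attaching

def request_admission(welcome_url: str, node_name: str) -> AdmissionGrant:
    """Stand-in for the network call to the welcome service, which
    applies placement policy and issues an admission token."""
    # Stubbed response for illustration.
    return AdmissionGrant(
        sentinel_url="wss://sentinel-1.example.com/attach",
        admission_token="<short-lived-token>",
    )

grant = request_admission("https://welcome.example.com", "agent-node-1")
# The node then attaches to grant.sentinel_url, presenting grant.admission_token.
```

The key point is the indirection: the node never hard-codes its parent Sentinel; placement is decided centrally at admission time.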


Can I connect multiple sentinels together? What’s the point?

Short answer: Yes. Multiple sentinels let you compose bigger fabrics out of smaller biomes.

Common reasons:

  • Segmentation: separate trust domains (teams, environments, tenants).
  • Edge/region routing: keep traffic local when possible, bridge when needed.
  • Resilience: reduce blast radius; a problem in one biome doesn’t take down everything.
  • Network constraints: connect biomes across NAT/firewalls with controlled links.

When do I need a peer-to-peer topology?

Short answer: Most commonly, you use peer-to-peer to enable a federated fabric—multiple trust domains (biomes) that need to communicate across boundaries.

Why peer-to-peer helps in federation:

  • It allows direct links between biomes/sentinels once policies/admission allow it.
  • It avoids forcing all cross-domain traffic through a single “global hub.”
  • It keeps trust domains independent while still enabling controlled interop.

Other reasons to choose peer-to-peer:

  • Lower latency / higher throughput between specific nodes
  • Large payloads/streams where relaying would be a bottleneck
  • Reduce load on a central Sentinel (use it as rendezvous + policy, not the data plane)

Rule of thumb: If you have multiple trust domains and want a federated fabric, peer-to-peer is usually the right topology.


What are Naylence plugins? Why is everything a plugin?

Short answer: Plugins are how Naylence stays modular. Most runtime components are DI/pluggable, and related components are packaged together as a plugin.

This is not just an extension mechanism—core Naylence is built this way:

  • the Naylence runtime is a plugin
  • the Naylence agent SDK is implemented as a plugin
  • Naylence advanced security is also delivered as a plugin
  • browser variants follow the same model

Browser caveat: in browser builds, you may need explicit imports from the root packages so bundlers don’t tree-shake plugins that are only referenced indirectly.

Mental model: plugins are “feature packs” that register swappable implementations (connectors, listeners, policies, security, telemetry, etc.) into the fabric.


Does the client need to know where an agent runs?

Short answer: No. Clients should address agents by logical address.

The fabric (via the Sentinel and routing) resolves that address and forwards messages, so your client code doesn’t change when an agent moves, scales out, or is redeployed elsewhere.


Can I host tools/services in the fabric, not just agents?

Short answer: Yes. Naylence can host general-purpose fabric services (RPC-style “tools”) in addition to agents.

  • Fabric services are lightweight and easy to call from anywhere in the fabric (often simpler than MCP for internal tooling).
  • In fact, agents themselves are built on top of these generic fabric services; the agent layer is a higher-level API over the same underlying service mechanism.

Rule of thumb: Use fabric services for internal tools/RPC utilities. Use agents when you want the full A2A task/message model (streaming, artifacts, orchestration patterns). Use MCP for external MCP tools.


Addressing and routing

Why do I need routing at all? If I connect directly, there’s nothing to route.

Short answer: Routing is how you keep who to talk to stable while everything underneath changes.

Direct connect is “no routing” only if all of this stays true forever:

  • the agent has a single fixed endpoint,
  • there’s no horizontal scaling,
  • no failover,
  • no mobility (moving agents between machines/regions),
  • no segmentation (tenants/trust domains),
  • no policy-based placement (where a node is allowed to attach).

As soon as any of those change, you need a way to decide:

  • which replica should handle this request,
  • where the agent currently lives,
  • whether the caller is allowed to reach it,
  • and how to fail over when something disappears.

A fabric’s routing layer makes those decisions once, consistently, so clients keep calling the same logical address while the system scales, moves, and evolves.


What is a “logical address” and how is it resolved?

A logical address is a stable identifier you use to reach an agent without caring where it runs.

It looks like an email address, for example:

  • math-agent@fame.fabric

Where:

  • math-agent is typically the agent name
  • fame.fabric is a trust domain (not a real DNS name)

For simple setups you usually use a single trust domain (often defaulting to fame.fabric). In federated setups you’ll have multiple trust domains, and the domain part becomes an explicit boundary for routing and policy.

How it’s resolved: when you send a task/message to a logical address, the fabric (typically via the Sentinel) resolves that address to a concrete destination (a specific agent instance on a specific node) and forwards the envelope there.
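As a rough illustration of the address shape (a hypothetical helper, not part of the SDK; resolution itself is handled by the fabric):

```python
def split_logical_address(address: str) -> tuple[str, str]:
    """Split 'name@trust-domain' into its two parts.

    Illustrative only; the fabric resolves logical addresses internally.
    The 'fame.fabric' fallback mirrors the common single-domain default.
    """
    name, _, domain = address.partition("@")
    return name, domain or "fame.fabric"

print(split_logical_address("math-agent@fame.fabric"))
# → ('math-agent', 'fame.fabric')
```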


What is a “physical address”? Why does it look so scary?

A physical address is the concrete, routable destination the fabric uses internally once a logical address has been resolved to a specific node/replica.

It has this shape:

<name>@<trust-domain?>/<physical-node-path>

  • <name>: often an internal endpoint/handler name (you’ll see things like __rpc__...).
  • <trust-domain?>: optional when you’re inside a single trust domain. That’s why you’ll sometimes see @/… (meaning “default/current trust domain”).
  • <physical-node-path>: an opaque path that encodes where the node sits in the fabric (typically a chain of node IDs/hops).

Example (opaque by design):

  • __rpc__nXtGirzxpAOAu2ov@/wjNMvor5hs6nSb2G/GeDhbRy14ksOBuH2

When do I use it? Almost never directly. Physical addresses are mainly for:

  • routing under the hood (getting to the exact node/replica),
  • debugging/observability (envelope inspector, logs),
  • reply paths (the fabric knows where to send responses).

Rule of thumb: Use logical addresses in your code; treat physical addresses as internal, ephemeral routing artifacts that can change when placement/reconnects change.


How does load balancing work (round-robin, least-load, sticky sessions)?

Short answer: Naylence load balancing is a strategy that picks one downstream segment/replica from a pool for each routed envelope. Built-in strategies today are random, round-robin, and HRW (Highest Random Weight), plus sticky and composite-fallback variants.

Supported strategies:

  • Round-robin: maintains a counter per pool key and cycles through available segments.
  • Random: picks a random segment from the pool.
  • HRW (Highest Random Weight): picks the segment with the highest hash weight for (segment, salt). By default the salt is envelope.id, so the choice is effectively “random per envelope”. If configured with sticky_attribute, the salt is taken from that envelope field (falling back to envelope.id).

Profiles: development defaults to round-robin (predictable), and there are also random, round_robin, hrw, and sticky-hrw (HRW with sticky_attribute: session_id).

Note: “least-load / least-connections” is not currently part of the built-in strategy set.
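A rough sketch of how round-robin and HRW selection behave (illustrative only; the built-in implementations differ in detail):

```python
import hashlib
from itertools import count

# Illustrative sketches of the strategies described above; the real
# Naylence implementations differ in detail.

_rr_counter = count()

def round_robin(pool: list[str]) -> str:
    """Cycle through the pool in order, one pick per envelope."""
    return pool[next(_rr_counter) % len(pool)]

def hrw(pool: list[str], salt: str) -> str:
    """Highest Random Weight: hash (segment, salt) and pick the max.

    With salt = the envelope id, the choice is effectively random per
    envelope; with salt = a sticky attribute (e.g. session_id), the same
    key always maps to the same segment.
    """
    def weight(segment: str) -> int:
        digest = hashlib.sha256(f"{segment}:{salt}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(pool, key=weight)

replicas = ["replica-a", "replica-b", "replica-c"]
# Same sticky key -> same replica, every time:
assert hrw(replicas, "session-42") == hrw(replicas, "session-42")
```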


Do you support sticky routing (session affinity)?

Short answer: Yes. Naylence supports sticky routing so messages from the same conversation/session can be routed to the same agent replica.

There are two levels:

  • Basic stickiness: deterministic routing based on a stable envelope attribute (for example a session_id), so the same key tends to map to the same replica.
  • Advanced stickiness (Advanced Security): stronger session affinity and policy-aware placement, intended for larger fabrics and stricter security models.

Rule of thumb: Use stickiness when your agent keeps in-memory per-session state. If your agent is stateless, prefer non-sticky load balancing.


Security Model

What is “overlay security” in Naylence?

Short answer: Overlay security is Naylence’s security layer above the underlying transport (WebSocket/HTTP). It provides consistent admission controls and message provenance across the fabric, regardless of how nodes are connected.

In the OSS packages, overlay security focuses on:

  • Controlled admission (the “gate”): only allowed nodes can join the fabric.
  • Signed provenance: envelopes can be signed so recipients can verify origin and detect tampering.

Can overlay security replace TLS?

Short answer: No. Overlay security is not a replacement for TLS, it complements it.

  • TLS secures the transport channel (protects data in transit, authenticates endpoints at the connection layer).
  • Overlay security adds fabric-level guarantees such as controlled admission, message provenance (signatures), and (in advanced security) end-to-end sealed payload encryption and stronger identity/session controls.

Rule of thumb: Use TLS for all connections. Use overlay security to enforce identity, policy, and (optionally) end-to-end confidentiality across the fabric.


Do I need the advanced security package for production?

Short answer: Not necessarily. For many production deployments, the OSS gated security profile (controlled admission + signed provenance) is secure enough.

Use the BSL advanced security package when you need stronger guarantees, such as:

  • SPIFFE-style identities and tighter identity lifecycle controls (via a CA service)
  • Sealed (end-to-end) payload encryption and stronger channel encryption guarantees
  • More secure stickiness/session affinity under stricter policy constraints
  • Harder multi-tenant / zero-trust environments and compliance-driven requirements

How do I secure node admission (who is allowed to connect)?

Short answer: You secure the fabric by controlling node admission. A node must be admitted before it can attach to a Sentinel (directly or via a welcome/admission service), and admission is where you enforce “who is allowed in” and what they’re allowed to do.

Common controls include:

  • requiring an admission token
  • enforcing policy at the Sentinel/admission service (trust domain, environment, tenant, etc.)
  • issuing short-lived credentials only after successful admission

What has a verifiable identity: nodes or agents?

Short answer: Identity is tied to the hosting node.

  • A node has the secure, verifiable identity.
  • Agents hosted on the same node share that identity, even if they have different logical addresses.
  • If you want an agent to have its own strong identity boundary, the recommended pattern is one agent per dedicated node (dedicated-node topology).

Can I do end-to-end encryption? Who can see envelope payloads?

Short answer: Yes, with the advanced security package.

  • In the core model, you typically rely on secure channels between nodes (so data is protected in transit), and the nodes that terminate those channels can read the payload.
  • With advanced security, you can use end-to-end sealed payload encryption, so only the intended recipient can decrypt the envelope payload (intermediaries can still route it).

How do multi-tenant boundaries work?

Short answer: Tenancy boundaries are typically modeled as trust domains and enforced through admission + policy.

Common patterns:

  • Separate trust domains / biomes per tenant (often anchored by separate Sentinels) for clear isolation.
  • Policy-based routing and admission rules so tenants can’t attach to or address resources outside their boundary.

How do I rotate keys / certificates?

Short answer: You usually don’t rotate them manually.

Certificates are short-lived by default (for example, 24 hours), and nodes renew automatically as part of the node admission / re-admission protocol.


What’s the “advanced security package” and what’s in core?

Core: the default security and topology primitives (admission hooks, secure transports, policy points, basic stickiness/routing).

Advanced security (high level):

  • SPIFFE-style identities/certificates
  • a (dev) CA service
  • end-to-end sealed payload encryption and channel encryption
  • more secure sticky sessions and related policy/placement controls

Protocols and transports

Why is WebSocket the primary protocol for connecting nodes?

Short answer: WebSocket is the default because it gives a single persistent, full-duplex connection, so both the node and sentinel can send envelopes, acks, and heartbeats at any time with low latency and no per-message connection overhead.

What about HTTP? HTTP is request/response, so “server push” usually means polling/long-polling. Naylence supports HTTP connectors as a fallback for restricted networks or serverless, but WebSocket is the path with the best real-time behavior.

Rule of thumb: Use WebSocket unless you can’t; use HTTP when you must accept higher latency/overhead.


Do you have a RESTful API for Naylence?

Short answer: Not currently. Naylence’s primary interface is an envelope-based protocol (usually over WebSocket) because it supports streaming, callbacks/progress, and bidirectional messaging without extra glue.

If you need HTTP today: Use the HTTP connector, which runs over two half-duplex HTTP connections (one outbound, one inbound) to approximate a persistent, bidirectional link in environments where WebSocket isn’t ideal.

Roadmap: A thin REST gateway for “thin clients” is planned, translating REST calls into Naylence envelopes.


Can I use a message broker like Kafka or Redis with Naylence?

Short answer: Not out of the box. Naylence doesn’t ship Kafka/Redis transports today, but the system is plugin-based and its networking is fully configurable via connectors and listeners, so it’s straightforward to add broker-backed transports if you need them.

In many deployments, you won’t need an external broker because the built-in transports already include fabric-level behaviors like flow control/backpressure and are designed for real-time, bidirectional messaging.

Rule of thumb: Start with the built-in transports; add a broker-backed connector/listener only when you have a concrete requirement for it.


LLMs

What about LLMs? I don’t see any LLM-specific features in the SDK.

Short answer: That’s by design. Naylence is an agent framework, not an LLM framework, so you bring your own LLM library (OpenAI, Anthropic, LangChain, etc.), and Naylence handles the messaging/runtime side around it.

Naylence handles the messaging infrastructure; you handle the intelligence.

Why no built-in LLM abstractions: LLM APIs change fast and we avoid provider lock-in; your agent might not even use an LLM.

What Naylence gives LLM agents: streaming responses, async/long-running tasks + progress callbacks, stateful sessions + correlation, tool/agent orchestration, and scaling behind a sentinel.

Where to look: See the runnable LLM examples under examples/llm/* in the TS and Python example repos.


How do I call an LLM from an agent running in the browser?

Short answer: Don’t call the LLM provider directly from the browser. Keep provider API keys server-side and have the browser agent call a backend LLM agent/service through the fabric.

Recommended pattern:

  • Browser agent (no secrets) → sends a task to → Backend “LLM gateway” agent
  • The backend agent holds the LLM provider credentials, applies policy/rate limits, and returns the result (optionally streaming) back to the browser.

Rule of thumb: Browsers should never contain long-lived LLM tokens. Treat the browser as an untrusted client and route LLM access through a server-side node.


Standards and interoperability

Do you support A2A (Agent-to-Agent)?

Short answer: Yes! Our core agent API is based on the A2A standard. We implement A2A over our primary transports and also provide simpler APIs for common use cases.

A2A at the Core

The A2A (Agent-to-Agent) protocol defines a standard way for agents to communicate. Naylence’s agent API is built on this foundation:

  • Agent Card: agent metadata and capabilities discovery
  • Tasks: the primary unit of work between agents
  • Messages: structured communication within tasks
  • Artifacts: data produced by agents during task execution
  • Streaming: native support via the Fame protocol

A2A Over Any Transport

A2A doesn’t strictly prescribe a transport layer—it focuses on the message semantics. Naylence provides A2A over both of our primary transports:

  • WebSocket: full A2A support. Best for real-time agents, streaming, and bidirectional messaging.
  • HTTP: full A2A support. Best for serverless setups, simple request-response, and restrictive firewalls.

This means you get A2A compatibility regardless of your deployment constraints.

Simpler APIs When You Need Them

While A2A provides a comprehensive standard, it can be verbose for simple use cases. Naylence also offers streamlined APIs:

```python
from naylence.agent import TaskSendParams, Message, TextPart

# Full A2A-style task submission
params = TaskSendParams(
    id="task-123",
    message=Message(
        role="user",
        parts=[TextPart(text="Hello!")],
    ),
)
task = await remote_agent.start_task(params)

# Simplified API for common patterns
response = await remote_agent.run_task("Hello!")
```

Both API styles use the same underlying A2A semantics. The simplified API is syntactic sugar that constructs proper A2A messages under the hood.


Why not just use A2A directly?

Short answer: You can (and Naylence does). A2A is the interoperability contract; Naylence is the production fabric around it.

A2A standardizes how agents talk (cards, tasks, messages, artifacts). But it intentionally doesn’t provide the “runtime plane” you usually need in production: routing/topology, security & policy, durability, flow control, and operational tooling.

A concrete example: with plain A2A you’ll often end up rebuilding an address/routing layer so client code doesn’t change when you move an agent, scale it out, or promote it to production.

Bottom line: Naylence lets you keep A2A semantics while taking care of the infrastructure so you can focus on agents, not plumbing.


Do you support MCP (Model Context Protocol)?

Short answer: Yes. Naylence agents are tool-agnostic, so you can use the official MCP SDK inside any agent to call MCP tools/servers.

We also have experimental MCP helpers in the runtime, but they’re not required and not the default path today (most integrations should use the MCP SDK directly).

Bottom line: Naylence handles the messaging/lifecycle; you choose how your agent talks to tools, MCP or otherwise.


Reliability

What delivery guarantees does Naylence provide?

Short answer: Naylence provides at-least-once delivery today (retries + acknowledgements), so messages may be delivered more than once. Exactly-once semantics are on the roadmap.

What at-least-once means: A sender retries until it receives an ACK. If a retry happens, the receiver may see duplicates.

What you should do: Design handlers to be idempotent where possible, and use a deduplication key/message id to ignore repeats when side effects matter.
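A minimal idempotent-handler sketch, using the message id as a deduplication key (hypothetical handler shape):

```python
processed: set[str] = set()

def handle(message_id: str, payload: str) -> None:
    """Ignore redelivered messages by remembering ids already handled.

    In production you would bound this set (TTL / LRU) and persist it
    if the side effects must survive restarts.
    """
    if message_id in processed:
        return  # duplicate from an at-least-once retry; safe to drop
    processed.add(message_id)
    print(f"processing {message_id}: {payload}")

handle("msg-1", "charge $10")
handle("msg-1", "charge $10")  # duplicate delivery: no second side effect
```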

Note: Naylence also includes flow control/backpressure to stay stable under load—see the reliability guide for details.

DevEx and debugging

How do I debug message flow? Can I inspect envelopes?

Short answer: Yes. Turn on detailed logging and (optionally) print every envelope.

  • Set FAME_LOG_LEVEL on a node to enable very detailed runtime logging.
  • Set FAME_SHOW_ENVELOPES to print all inbound and outbound envelopes for that node.

This is usually the fastest way to understand routing, retries/ACKs, and what payloads are actually moving through the fabric.
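For example, building an environment with both variables set before launching a node (the accepted log-level values are an assumption; check the logging docs):

```python
import os

# Environment for a node with verbose debugging enabled.
# "debug" is an assumed level value; consult the Naylence docs for the
# exact accepted values of FAME_LOG_LEVEL.
debug_env = {
    **os.environ,
    "FAME_LOG_LEVEL": "debug",
    "FAME_SHOW_ENVELOPES": "1",  # print every inbound/outbound envelope
}

# Launch your node process with this environment, e.g.:
#   subprocess.run(["python", "agent_node.py"], env=debug_env)
```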


Do you have tracing/metrics/logging built in?

Short answer: Yes. Naylence integrates with OpenTelemetry, and you can plug in your own telemetry provider.

  • See examples/monitoring/open-telemetry (Python and TypeScript) for runnable setups.
  • If you already have an observability stack, you can integrate it via the Naylence plugin system.

How do I test agents locally without running a whole fabric?

Short answer: Use the single-process topology (Agent-on-Sentinel) for local development and most tests.

You get realistic envelope flow and routing behavior without external networking dependencies. When you need real network boundaries, move to a 3-node setup.


Deployment and ops

Does this run on Kubernetes? Docker Compose? Cloudflare?

Short answer: Yes—Naylence nodes are just processes, so you can run them anywhere you can run your runtime (Python/Node.js), including Docker Compose and Kubernetes.

Cloudflare: it depends what you mean by “run”:

  • Client / browser nodes: hosting a browser-only fabric/client on Cloudflare Pages absolutely counts as “running on Cloudflare” (it’s running in users’ browsers, served by Cloudflare).
  • Server-side nodes: you can also run WebSocket backends on Cloudflare Workers, and if you need long-lived, stateful WebSocket coordination, Durable Objects are Cloudflare’s intended building block for that.

How do I scale from laptop → production?

Rule of thumb: scale topology in steps:

  1. Single process (Agent-on-Sentinel) for local dev and tests.
  2. 3-node (client + sentinel + agent) to introduce real network boundaries.
  3. Replicate agents and enable load balancing for throughput and resilience.
  4. Add an admission/welcome service when the fabric grows (especially with advanced security) to centralize placement + issuing admission tokens.

What state does a sentinel store (if any)?

Short answer: mostly soft state.

A Sentinel typically holds in-memory/ephemeral state such as:

  • attached node sessions/connections
  • routing/address bindings and related caches
  • stickiness/session-affinity helpers (if enabled)

Your durable application state should live in your agents and their backing stores. (Security services like a CA/admission service may keep their own state depending on your setup.)


How do upgrades work—can I roll without downtime?

Agents: yes—replicas + load balancing is the main “no downtime” tool:

  • bring up new replicas (v2), shift traffic gradually (canary/blue-green), then drain old replicas.

Sentinels / fabric layer: today, the clean approach is redundancy + controlled placement:

  • run multiple Sentinels and use the admission/welcome service to steer new attachments and help with draining during upgrades,
  • or put a standard load balancer/DNS strategy in front of Sentinel listener endpoints (depending on your topology and operational constraints).

So: you’re not missing anything—agent replication is the core zero-downtime story, and fabric-level no-downtime comes from adding redundancy and admission-based placement.

