Observability and Debugging Edge Middleware

A middleware chain that runs correctly on your laptop can fail in production in ways you never see locally. Each stage executes inside an ephemeral V8 isolate at a Point of Presence thousands of kilometers from your terminal, with no attached debugger, no persistent disk to tail, and no shared process memory to inspect. When a request slows down or returns the wrong status, the only artifacts you have are the signals the isolate chose to emit before it was torn down. Observability is therefore not an add-on for edge middleware; it is the only window you have into the system.

This guide is part of Building a Custom Middleware Chain, and it covers the three pillars of edge telemetry: distributed tracing with W3C Trace Context, structured logging with sampling and redaction, and resilience instrumentation through latency budgets, error boundaries, and circuit breakers. Every pattern here respects the constraints of the edge runtime: no async hooks, no thread-local storage, a tight CPU budget, and telemetry export that must never block the response.

Why edge observability is different

On a long-lived Node.js server, observability tooling leans on infrastructure that simply does not exist at the edge. AsyncLocalStorage and async_hooks provide implicit context propagation across the call stack. A background agent thread batches and flushes spans. The process lives for hours, so a slightly slow exporter is invisible. None of that holds inside a V8 isolate.

The isolate is created on demand, may serve a single request, and is frozen or discarded immediately after the response resolves. Three consequences follow directly:

Context must be passed explicitly. Without reliable async context storage, the trace context and request ID travel through your chain as fields on the Context object you already thread between stages — not through ambient globals.
Export must be non-blocking and deferred. If you await an OTLP POST before returning the response, you add that round-trip to every request’s latency. Telemetry export belongs in ctx.waitUntil(), which keeps the isolate alive after the Response is sent so the export completes without the user waiting.
CPU and bundle budgets are hard ceilings. Cloudflare Workers on the free tier cap synchronous CPU at 10 ms and the compressed bundle at 1 MB. A full OpenTelemetry SDK with auto-instrumentation will not fit. You instrument with the lightweight @opentelemetry/api surface and a hand-rolled exporter, not the batteries-included Node distribution.

Trace context enters with the request, each stage records a span, and export is deferred through ctx.waitUntil so the response is never blocked on telemetry.

W3C Trace Context propagation

Distributed tracing only works if every hop agrees on a single trace identifier. The W3C Trace Context specification standardizes this through two headers. traceparent carries the version, a 16-byte trace ID, an 8-byte parent span ID, and trace flags, formatted as 00-<trace-id>-<span-id>-<flags>. tracestate carries vendor-specific key-value pairs that ride alongside without breaking interoperability.

Edge middleware sits at the front door, so it is usually the component that either continues an existing trace or starts a new one. The rule is simple: if a valid traceparent arrives, adopt its trace ID and treat its span ID as your parent. Otherwise, mint a fresh trace ID. Then inject an updated traceparent into every outbound subrequest so the origin and downstream services join the same trace.

interface TraceContext {
  traceId: string;   // 32 hex chars
  spanId: string;    // 16 hex chars
  sampled: boolean;
}

function hex(bytes: number): string {
  const buf = new Uint8Array(bytes);
  crypto.getRandomValues(buf);
  return [...buf].map((b) => b.toString(16).padStart(2, "0")).join("");
}

export function readTraceContext(request: Request): TraceContext {
  const header = request.headers.get("traceparent");
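  const match = header?.match(/^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/);

  // The spec treats all-zero trace IDs and span IDs as invalid, so reject those too.
  if (match && !/^0+$/.test(match[1]) && !/^0+$/.test(match[2])) {
    return { traceId: match[1], spanId: match[2], sampled: (parseInt(match[3], 16) & 1) === 1 };
  }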
  // No valid upstream context: start a new trace.
  return { traceId: hex(16), spanId: hex(8), sampled: true };
}

export function injectTraceContext(headers: Headers, ctx: TraceContext, childSpanId: string): void {
  const flags = ctx.sampled ? "01" : "00";
  headers.set("traceparent", `00-${ctx.traceId}-${childSpanId}-${flags}`);
}

The same traceId becomes the correlation ID for your logs, so a single value ties together spans in your tracing backend and log lines in your log store. Never trust the trace ID for security decisions — it is attacker-controllable — but it is perfect for stitching telemetry.
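The injection step can also carry tracestate forward. The following is a minimal sketch, assuming a hypothetical helper named outboundTraceHeaders and a child span ID minted by the caller (for example with the hex(8) helper above); it builds only the tracing headers for a subrequest rather than copying every inbound header.

```typescript
interface TraceContext {
  traceId: string; // 32 hex chars
  spanId: string;  // 16 hex chars
  sampled: boolean;
}

// Hypothetical helper (not from any library): build the tracing headers for
// one outbound subrequest. The caller supplies the span ID for this hop.
export function outboundTraceHeaders(
  inbound: Headers,
  ctx: TraceContext,
  childSpanId: string,
): Headers {
  const out = new Headers();
  const flags = ctx.sampled ? "01" : "00";
  // Same trace ID, new parent span ID: downstream spans attach to this hop.
  out.set("traceparent", `00-${ctx.traceId}-${childSpanId}-${flags}`);
  // Vendor state rides along unchanged when the caller sent it.
  const state = inbound.get("tracestate");
  if (state !== null) out.set("tracestate", state);
  return out;
}
```

Merging the result into the subrequest's own headers keeps cookies and other inbound headers from being forwarded by accident.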

OpenTelemetry spans per stage

A span represents one unit of work with a start time, end time, status, and attributes. In a middleware chain, the natural span boundary is one stage. Wrap each stage so it opens a span on entry, records the outcome, and closes it on exit — including the error path. Because edge isolates lack async context propagation, pass the active span through the Context object rather than relying on context.with() to flow it implicitly.

import { trace, SpanStatusCode } from "@opentelemetry/api";

export function instrument(name: string, mw: Middleware): Middleware {
  return async (req, ctx, next) => {
    const tracer = trace.getTracer("edge-middleware");
    const span = tracer.startSpan(`mw.${name}`, {
      attributes: { "http.method": req.method, "url.path": new URL(req.url).pathname },
    });
    const begin = performance.now();
    try {
      // Invoke the wrapped stage; it decides whether and when to call next().
      const res = await mw(req, ctx, next);
      span.setAttribute("http.status_code", res.status);
      span.setStatus({ code: res.status >= 500 ? SpanStatusCode.ERROR : SpanStatusCode.OK });
      return res;
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: String(err) });
      throw err;
    } finally {
      span.setAttribute("mw.duration_ms", performance.now() - begin);
      span.end();
    }
  };
}

For a complete walkthrough of wiring the @opentelemetry/api surface, building a minimal OTLP-over-fetch exporter, and flushing it through ctx.waitUntil, see the in-depth guide on instrumenting edge middleware with OpenTelemetry.
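Explicit span passing can be illustrated without any SDK. The sketch below is not the @opentelemetry/api surface; StageContext, SpanRecord, and withSpan are hypothetical names, and the ID generator is injected by the caller. It shows how a nested stage learns its parent span purely from the context object it is handed.

```typescript
// Hypothetical shapes for illustration only.
interface SpanRecord {
  traceId: string;
  spanId: string;
  parentSpanId: string | undefined;
  name: string;
  startMs: number;
  endMs: number;
  ok: boolean;
}

interface StageContext {
  traceId: string;
  activeSpanId: string | undefined; // span of the enclosing stage, if any
  spans: SpanRecord[];              // shared buffer, exported after the response
  newSpanId: () => string;          // injected ID generator
}

// Run `fn` inside a child span. The child context is passed explicitly, so
// nested stages see the right parent without ambient async storage.
export async function withSpan<T>(
  ctx: StageContext,
  name: string,
  fn: (child: StageContext) => Promise<T>,
): Promise<T> {
  const spanId = ctx.newSpanId();
  const child: StageContext = { ...ctx, activeSpanId: spanId };
  const startMs = Date.now();
  let ok = true;
  try {
    return await fn(child);
  } catch (err) {
    ok = false;
    throw err;
  } finally {
    // Record the span on both the success and the error path.
    ctx.spans.push({
      traceId: ctx.traceId,
      spanId,
      parentSpanId: ctx.activeSpanId,
      name,
      startMs,
      endMs: Date.now(),
      ok,
    });
  }
}
```

Because the child context shares the same spans array, every nested span ends up in one buffer that can be handed to the exporter once the chain finishes.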

Structured logging with sampling

Free-text logs are unqueryable at scale. Emit one JSON object per significant event, with a stable set of required fields so your log backend can index and aggregate them. At minimum, every line should carry the trace ID, the stage name, the elapsed duration, the response status, and a severity. Use console.log — every edge platform captures stdout — but write structured JSON, never interpolated strings.

interface LogFields {
  level: "debug" | "info" | "warn" | "error";
  traceId: string;
  stage: string;
  durationMs?: number;
  status?: number;
  msg: string;
}

export function emit(fields: LogFields): void {
  console.log(JSON.stringify({ ts: new Date().toISOString(), ...fields }));
}

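At high request volumes, logging every request overwhelms ingestion pipelines and inflates cost. Sample successful requests at a low rate, but always keep errors. A deterministic check derived from the trace ID keeps the decision consistent across all stages of one request, and the status override ensures failures are never dropped. Because that override depends on the outcome, the decision is made when the log line is emitted rather than at the head of the request.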

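function keepLog(ctx: TraceContext, status: number, rate = 0.05): boolean {
  if (status >= 500) return true; // never drop server errors
  // Use the trailing hex digits: some ID generators put a timestamp in the leading bytes.
  // Dividing by 0x10000 maps the 16-bit value into [0, 1), so rate = 1 keeps everything.
  return parseInt(ctx.traceId.slice(-4), 16) / 0x10000 < rate;
}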

Redaction is mandatory, not optional. Cookies, Authorization headers, and query parameters frequently carry tokens and PII. Strip or hash them before they reach a log line. The dedicated guide on structured logging for edge functions covers required fields, correlation IDs, sampling tiers, and a reusable redaction allowlist in detail.
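As a rough illustration of an allowlist, the sketch below redacts every header and query parameter that is not explicitly listed. The names redactHeaders, redactUrl, SAFE_HEADERS, and SAFE_PARAMS are hypothetical, and the allowlist contents are placeholders to adapt to your own traffic.

```typescript
// Hypothetical allowlists: anything not named here is redacted.
const SAFE_HEADERS = new Set(["accept", "content-type", "user-agent", "traceparent"]);
const SAFE_PARAMS = new Set(["page", "limit"]);

// Return a plain object that is safe to place in a log line.
export function redactHeaders(headers: Headers): Record<string, string> {
  const out: Record<string, string> = {};
  headers.forEach((value, key) => {
    out[key] = SAFE_HEADERS.has(key.toLowerCase()) ? value : "[REDACTED]";
  });
  return out;
}

// Return path plus query string with non-allowlisted parameter values masked.
export function redactUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  for (const key of Array.from(url.searchParams.keys())) {
    if (!SAFE_PARAMS.has(key)) url.searchParams.set(key, "REDACTED");
  }
  return url.pathname + url.search;
}
```

An allowlist fails safe: a header or parameter nobody anticipated is masked by default, whereas a denylist would let it through.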

Latency budgeting with performance.now()

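Every stage consumes part of a finite wall-clock budget. Measure each stage with performance.now() and attach the elapsed time to both the span and the log line. Treat the resolution as platform-dependent: on Cloudflare Workers the clock advances only when the isolate performs I/O, so a stage that does pure CPU work can report 0 ms, while a stage that awaits a subrequest is timed meaningfully. Even so, this turns “the request felt slow” into a per-stage breakdown you can act on.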

const BUDGET_MS = 50;

async function withBudget<T>(label: string, ctx: TraceContext, op: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await op();
  } finally {
    const elapsed = performance.now() - start;
    if (elapsed > BUDGET_MS) {
      emit({ level: "warn", traceId: ctx.traceId, stage: label, durationMs: elapsed, msg: "budget exceeded" });
    }
  }
}

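Pair budgeting with an AbortController timeout on every outbound fetch, so a slow downstream cannot consume the entire wall-clock budget of the request. On Cloudflare Workers, time spent awaiting I/O does not count toward the CPU limit, but it is still latency the user sees.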
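A timeout wrapper along these lines is one way to do it. This is a minimal sketch: fetchWithTimeout and the FetchLike type are hypothetical names, and the injectable fetchImpl parameter exists only so the function can be exercised without a network.

```typescript
// Narrow view of fetch, so a stand-in can be supplied in tests.
type FetchLike = (input: string, init?: { signal?: AbortSignal }) => Promise<Response>;

// Abort the subrequest if it has not settled within `timeoutMs`.
export async function fetchWithTimeout(
  url: string,
  timeoutMs: number,
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } finally {
    // Clear the timer on success and on failure so it never fires late.
    clearTimeout(timer);
  }
}
```

When the timer fires, the pending fetch rejects, and the caller can treat that rejection like any other downstream failure, for example by counting it toward a circuit breaker.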

Error boundaries and circuit breakers

Distributed systems fail partially. A non-critical stage — analytics enrichment, a feature-flag lookup — should never take down the request; wrap it in a boundary that records the exception and continues. A critical stage — authentication, routing — should fail closed with an explicit 5xx rather than leaking a stack trace.
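The two kinds of boundary might look like the sketch below. The names nonCritical, critical, and Reporter are hypothetical, and the default reporter simply writes one structured JSON line; swap in your own logging helper as needed.

```typescript
type Reporter = (stage: string, err: unknown) => void;

// Default reporter: one structured line with the message only, no stack trace.
const report: Reporter = (stage, err) => {
  const msg = err instanceof Error ? err.message : String(err);
  console.log(JSON.stringify({ level: "error", stage, msg }));
};

// Non-critical stage: record the failure and continue with a fallback value.
export async function nonCritical<T>(
  stage: string,
  op: () => Promise<T>,
  fallback: T,
  onError: Reporter = report,
): Promise<T> {
  try {
    return await op();
  } catch (err) {
    onError(stage, err);
    return fallback;
  }
}

// Critical stage: fail closed with a generic 500 and keep details in the logs.
export async function critical(
  stage: string,
  op: () => Promise<Response>,
  onError: Reporter = report,
): Promise<Response> {
  try {
    return await op();
  } catch (err) {
    onError(stage, err);
    return new Response("Internal Server Error", { status: 500 });
  }
}
```

The response body stays generic on purpose: the exception detail goes to the reporter, where it can be correlated by trace ID, and never to the client.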

When a stage depends on a downstream service, repeated failures should stop you from hammering a struggling backend. A circuit breaker tracks recent failures and, once a threshold is crossed, trips open — short-circuiting to a fallback for a cooldown window — before allowing a probing half-open request to test recovery.

type BreakerState = "closed" | "open" | "half-open";

const breaker = { state: "closed" as BreakerState, failures: 0, openedAt: 0 };
const THRESHOLD = 5;
const COOLDOWN_MS = 30_000;

function canAttempt(now: number): boolean {
  if (breaker.state === "open" && now - breaker.openedAt > COOLDOWN_MS) {
    breaker.state = "half-open";
  }
  return breaker.state !== "open";
}

Because each isolate holds its own copy of breaker in module scope, breaker state is per-isolate, not global. For cross-isolate coordination you need a Durable Object. The full state machine, exponential backoff, and the per-isolate caveats are covered in implementing circuit breakers in edge middleware.
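To show how the transitions fit together, here is a compact sketch of the same per-isolate breaker with the success and failure bookkeeping filled in. The Breaker class and its method names are illustrative, the clock is passed in by the caller, and backoff and single-probe limiting are left out for brevity.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Per-isolate breaker: state lives in this object only.
export class Breaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold = 5, cooldownMs = 30_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // After the cooldown, let a probing request through in half-open state.
  canAttempt(now: number): boolean {
    if (this.state === "open" && now - this.openedAt > this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state !== "open";
  }

  // Any success closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  // A failed probe, or reaching the threshold, (re)opens the breaker.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  current(): BreakerState {
    return this.state;
  }
}
```

A stage would call canAttempt(Date.now()) before the subrequest, then recordSuccess() or recordFailure(Date.now()) depending on the outcome, and serve its fallback whenever canAttempt returns false.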

ctx.waitUntil for non-blocking export

The single most important rule of edge observability: telemetry export must never be on the critical path. Returning the Response is what the user waits for; the OTLP POST or log flush should happen after. Every major platform exposes an execution-context method that extends the isolate’s lifetime past the response.

export default {
  async fetch(req: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
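    // Named traceCtx to avoid shadowing the `trace` import from @opentelemetry/api.
    const traceCtx = readTraceContext(req);
    // Assumes runChain accepts a buffer that stages push their finished spans into.
    const spans: SpanData[] = [];

    const res = await runChain(req, traceCtx, spans);

    // Response returns immediately; export resolves in the background.
    ctx.waitUntil(exportTelemetry(env.OTLP_URL, traceCtx, spans));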
    return res;
  },
};

On Cloudflare Workers the method is ctx.waitUntil(promise). On Vercel Edge it is waitUntil, which can be imported from @vercel/functions. On Netlify Edge Functions it is context.waitUntil(promise). If you forget waitUntil and simply fire the promise, the isolate may be frozen before the export completes and the telemetry is silently lost.
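For orientation, a deferred exporter could be sketched as follows. The SpanData layout, the PostFn type, and the hard-coded service name are assumptions made for this example, since the guide does not show them; the payload follows the OTLP/JSON trace shape, and the URL is assumed to already point at the collector's /v1/traces path.

```typescript
interface TraceContext { traceId: string; spanId: string; sampled: boolean; }

// Assumed span record; the real SpanData layout is not shown in this guide.
interface SpanData {
  spanId: string;
  parentSpanId?: string;
  name: string;
  startMs: number; // epoch milliseconds
  endMs: number;
  ok: boolean;
}

type PostFn = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<unknown>;

// OTLP/JSON carries 64-bit timestamps as decimal strings of nanoseconds.
const toNanos = (ms: number): string => `${Math.round(ms)}000000`;

export function toOtlpJson(trace: TraceContext, spans: ReadonlyArray<SpanData>) {
  return {
    resourceSpans: [{
      resource: { attributes: [{ key: "service.name", value: { stringValue: "edge-middleware" } }] },
      scopeSpans: [{
        scope: { name: "edge-middleware" },
        spans: spans.map((s) => ({
          traceId: trace.traceId,
          spanId: s.spanId,
          parentSpanId: s.parentSpanId ?? "",
          name: s.name,
          kind: 2, // SPAN_KIND_SERVER
          startTimeUnixNano: toNanos(s.startMs),
          endTimeUnixNano: toNanos(s.endMs),
          status: { code: s.ok ? 1 : 2 }, // 1 = OK, 2 = ERROR
        })),
      }],
    }],
  };
}

// Best-effort export: skip unsampled or empty traces and never throw.
export async function exportTelemetry(
  url: string,
  trace: TraceContext,
  spans: ReadonlyArray<SpanData>,
  post: PostFn = fetch,
): Promise<void> {
  if (!trace.sampled || spans.length === 0) return;
  try {
    await post(url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(toOtlpJson(trace, spans)),
    });
  } catch {
    // Telemetry loss is acceptable; failing the background task is not useful.
  }
}
```

Swallowing the error is a deliberate choice in this sketch: by the time the export runs, the response has already been sent, so there is nobody left to report the failure to.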

Provider mapping

Each platform exposes telemetry through different primitives. Instrument with portable code, then route the output through the native facility for each provider.

Concern	Cloudflare Workers	Vercel Edge	Netlify Edge Functions
Live log stream	`wrangler tail` / Tail Workers	`vercel logs` / dashboard	`netlify logs:function` / dashboard
Log shipping	Logpush (R2, S3, HTTP)	Log Drains	Log Drains
Native tracing	Workers Trace Events (via Tail Workers)	OpenTelemetry collector / `@vercel/otel`	Forward OTLP via `fetch`
Deferred export hook	`ctx.waitUntil`	`ctx.waitUntil` (`waitUntil` from `@vercel/functions`)	`context.waitUntil`
Runtime	V8 isolate	V8 isolate (Edge Runtime)	Deno
CPU / bundle ceiling	10 ms free / 30 s paid; 1 MB free / 10 MB paid	25 s streaming; 1–4 MB	50 ms soft; 20 MB

Cloudflare’s Tail Workers are especially powerful: a separate Worker receives the trace events and structured logs of your main Worker after each invocation, giving you a place to sample, reshape, and forward telemetry without touching the request path. Vercel’s @vercel/otel package wires the Edge Runtime into an OpenTelemetry collector. Netlify’s Deno runtime has the widest memory headroom (512 MB) but no native tracing product, so you forward OTLP yourself.
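A Tail Worker that forwards only the interesting invocations could be sketched like this. The TailItem interfaces describe an assumed subset of the fields each event carries, LOG_SINK_URL is a hypothetical binding, and in a real Tail Worker the tailWorker object would be the module's default export.

```typescript
// Assumed subset of the per-invocation item delivered to a tail handler.
interface TailLog { level: string; message: unknown[]; timestamp: number; }
interface TailException { name: string; message: string; timestamp: number; }
interface TailItem {
  scriptName: string | null;
  outcome: string; // "ok" for a normal completion
  logs: TailLog[];
  exceptions: TailException[];
}

// Keep invocations that failed, threw, or logged at error level.
export function selectForForwarding(events: TailItem[]): TailItem[] {
  return events.filter(
    (e) =>
      e.outcome !== "ok" ||
      e.exceptions.length > 0 ||
      e.logs.some((l) => l.level === "error"),
  );
}

// Hypothetical binding name for the log sink endpoint.
interface TailEnv { LOG_SINK_URL: string; }

export const tailWorker = {
  async tail(events: TailItem[], env: TailEnv): Promise<void> {
    const keep = selectForForwarding(events);
    if (keep.length === 0) return;
    await fetch(env.LOG_SINK_URL, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(keep),
    });
  },
};
```

Filtering here costs nothing on the request path, because the Tail Worker runs after the producing Worker has finished its invocation.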

Debugging workflow

Move from local reproduction to production tracing in stages:

Local emulation. Run wrangler dev, vercel dev, or netlify dev with structured logging enabled. Replay captured Request objects to confirm span and log emission per stage.
Trace continuity. Send a request with a known traceparent and verify the same trace ID appears on every span and every outbound subrequest header.
Latency attribution. Inspect the per-stage mw.duration_ms attribute to find the stage consuming the most budget.
Alerting. Configure alerts on p95 latency, 5xx rate, and circuit-breaker open events. Sample-keep all errors so alerts always have a trace to drill into.

Common pitfalls

Symptom	Cause	Fix
Telemetry randomly missing	Export promise not registered with `waitUntil`; isolate frozen first	Always wrap exports in `ctx.waitUntil`
Added ~200 ms latency per request	`await`-ing the OTLP POST on the critical path	Defer export; never block the response on it
Broken traces across services	`traceparent` not re-injected into outbound `fetch`	Inject updated `traceparent` on every subrequest
Bundle exceeds 1 MB on Cloudflare	Imported full Node OpenTelemetry SDK	Use `@opentelemetry/api` + a hand-rolled OTLP exporter
Tokens appearing in logs	Logging raw headers/cookies	Apply a redaction allowlist before emit
Circuit breaker never trips globally	Per-isolate module-scope state	Coordinate via a Durable Object for global breaking
`AsyncLocalStorage is not defined`	Relying on Node async context at the edge	Thread context explicitly through the `Context` object

Frequently Asked Questions

Why can't I use the full OpenTelemetry Node SDK at the edge?

The Node distribution depends on async_hooks, a batch span processor that assumes a long-lived process, and packages that pull in Node built-ins, none of which can be relied on in a V8 isolate. It also far exceeds the 1 MB compressed bundle cap on Cloudflare’s free tier. Instrument with the small @opentelemetry/api surface and pair it with a hand-rolled OTLP-over-fetch exporter that you flush through ctx.waitUntil.

How does trace context survive across middleware stages without AsyncLocalStorage?

You pass it explicitly. Edge isolates have no reliable async context storage, so the trace ID and active span travel as fields on the Context object you already thread between stages. On entry you read the traceparent header; on each outbound subrequest you inject an updated traceparent.

Will exporting telemetry slow down my responses?

Only if you await it on the critical path. Register the export promise with ctx.waitUntil so the Response returns immediately and the OTLP POST or log flush completes in the background while the isolate stays alive.

Can a circuit breaker share state across all edge locations?

Not with plain module-scope variables — each isolate has its own copy, so breaking is per-isolate. For a globally consistent breaker you need a single coordination point such as a Durable Object that all isolates consult.

What is the difference between traceparent and tracestate?

traceparent is the standardized core: version, trace ID, parent span ID, and flags. tracestate is an optional companion that carries vendor-specific key-value pairs without breaking interoperability. Always propagate both unchanged except for updating the parent span ID in traceparent.