Instrumenting Edge Middleware with OpenTelemetry

You add the standard OpenTelemetry Node SDK to your edge middleware and deploy. The build fails, or worse, it succeeds and the isolate throws "AsyncLocalStorage is not defined" at runtime, or your compressed bundle blows past the 1 MB cap. The batteries-included distribution assumes a long-lived Node process with timer-driven background export and async_hooks-based context propagation. None of that can be relied on inside a V8 isolate.

This guide is part of Observability and Debugging Edge Middleware. It shows the edge-safe path: instrument with the tiny @opentelemetry/api surface, build a minimal OTLP-over-fetch exporter, and flush it inside ctx.waitUntil so tracing never blocks a response.

Root cause: why the Node SDK does not fit the edge

Three constraints rule out the auto-instrumenting Node distribution on platforms like Cloudflare Workers, where the edge runtime exposes only Web APIs:

  • No async context. The SDK relies on AsyncLocalStorage / async_hooks to flow the active span across awaits. V8 isolates do not provide this reliably, so implicit context.with() propagation cannot be trusted. You thread context explicitly.
  • No background exporter. The BatchSpanProcessor schedules flushes on a timer in a long-lived process. An isolate may be frozen the instant the response resolves, so you must flush deliberately through ctx.waitUntil.
  • Bundle budget. The Node SDK plus auto-instrumentation runs into megabytes, exceeding the 1 MB compressed cap on Cloudflare’s free tier and straining Vercel Edge’s 1–4 MB budget. @opentelemetry/api alone is a few kilobytes.

The fix is to keep the OpenTelemetry API (spans, attributes, status) and replace the SDK machinery with edge-native primitives.

Step 1: Install the API surface only

Add just the API package. Do not add @opentelemetry/sdk-node or any auto-instrumentation.

npm install @opentelemetry/api

Step 2: Define an edge-safe span collector

Instead of a TracerProvider with a batch processor, collect finished spans into an array on the request context. Each span is a plain serializable record.

// otel-edge.ts
export interface EdgeSpan {
  name: string;
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  startNs: number;
  endNs: number;
  status: "ok" | "error";
  attributes: Record<string, string | number | boolean>;
}

export class SpanCollector {
  readonly spans: EdgeSpan[] = [];
  constructor(public readonly traceId: string) {}

  add(span: EdgeSpan): void {
    this.spans.push(span);
  }
}

function hex(bytes: number): string {
  const buf = new Uint8Array(bytes);
  crypto.getRandomValues(buf);
  return [...buf].map((b) => b.toString(16).padStart(2, "0")).join("");
}

export function newSpanId(): string {
  return hex(8);
}

Step 3: Wrap each middleware stage in a span

Open a span on entry, attach request attributes, and close it in finally so both success and error paths are recorded. Pass the collector and the parent span ID through the Context object you already thread between stages.

import { SpanStatusCode } from "@opentelemetry/api";
import { SpanCollector, newSpanId, type EdgeSpan } from "./otel-edge";

type Middleware = (
  req: Request,
  ctx: { collector: SpanCollector; parentSpanId: string },
  next: () => Promise<Response>,
) => Promise<Response>;

export function instrument(name: string, mw: Middleware): Middleware {
  return async (req, ctx, next) => {
    const spanId = newSpanId();
    // OTLP expects Unix-epoch nanoseconds; performance.now() alone is relative to timeOrigin.
    const nowNs = () => (performance.timeOrigin + performance.now()) * 1e6;
    const startNs = nowNs();
    const attributes: EdgeSpan["attributes"] = {
      "http.request.method": req.method,
      "url.path": new URL(req.url).pathname,
    };
    let status: EdgeSpan["status"] = "ok";

    // Nested stages called with childCtx record this span as their parent.
    const childCtx = { ...ctx, parentSpanId: spanId };
    try {
      const res = await mw(req, childCtx, next);
      attributes["http.response.status_code"] = res.status;
      if (res.status >= 500) status = "error";
      return res;
    } catch (err) {
      status = "error";
      attributes["exception.message"] = err instanceof Error ? err.message : String(err);
      throw err;
    } finally {
      ctx.collector.add({
        name: `mw.${name}`,
        traceId: ctx.collector.traceId,
        spanId,
        // An empty parent ID means "root span"; omit it rather than sending "".
        parentSpanId: ctx.parentSpanId || undefined,
        startNs,
        endNs: nowNs(),
        status,
        attributes,
      });
    }
  };
}

Because the active span ID rides on childCtx.parentSpanId, each nested stage records the correct parent without any async-context magic.
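The fetch handler in Step 5 calls a runChain helper that the guide does not define. The following is a minimal sketch of one possible shape, assuming stages follow the (req, ctx, next) signature above; the names createChain and Stage and the terminal handler parameter are hypothetical. Note that this sketch hands every stage the same ctx, so instrumented stages dispatched through next() are recorded as siblings under the incoming parent. Spans only nest when a stage calls another stage directly with the ctx it received.

```typescript
// Hypothetical chain runner; createChain, Stage, and terminal are illustrative names.
type Next = () => Promise<Response>;
type Stage<C> = (req: Request, ctx: C, next: Next) => Promise<Response>;

export function createChain<C>(
  stages: Stage<C>[],
  terminal: (req: Request, ctx: C) => Promise<Response>,
): (req: Request, ctx: C) => Promise<Response> {
  return function runChain(req: Request, ctx: C): Promise<Response> {
    const dispatch = (i: number): Promise<Response> => {
      // Past the last stage, fall through to the terminal handler.
      if (i === stages.length) return terminal(req, ctx);
      let called = false;
      return stages[i](req, ctx, () => {
        // Guard against a stage invoking next() more than once.
        if (called) return Promise.reject(new Error("next() called twice"));
        called = true;
        return dispatch(i + 1);
      });
    };
    return dispatch(0);
  };
}
```

With this shape, a call such as createChain([instrument("auth", auth), instrument("geo", geo)], origin) would return a function matching the runChain(req, ctx) call used later; auth, geo, and origin are placeholders for your own stages.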

Step 4: Build a minimal OTLP-over-fetch exporter

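OTLP/HTTP accepts JSON-encoded payloads as well as binary protobuf. Translate your collected spans into the OTLP trace payload and POST it with fetch. Map the span status to the OTLP status code (1 = OK, 2 = ERROR) and send the nanosecond timestamps as decimal strings, because 64-bit integers are string-encoded in OTLP JSON.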

// otlp-exporter.ts
import type { EdgeSpan } from "./otel-edge";

export async function exportSpans(endpoint: string, headers: Record<string, string>, spans: EdgeSpan[]): Promise<void> {
  if (spans.length === 0) return;

  const otlpSpans = spans.map((s) => ({
    traceId: s.traceId,
    spanId: s.spanId,
    parentSpanId: s.parentSpanId,
    name: s.name,
    kind: 2, // SERVER
    startTimeUnixNano: String(Math.round(s.startNs)),
    endTimeUnixNano: String(Math.round(s.endNs)),
    status: { code: s.status === "error" ? 2 : 1 },
    attributes: Object.entries(s.attributes).map(([key, value]) => ({
      key,
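      value: typeof value === "number"
        ? Number.isInteger(value)
          ? { intValue: String(value) } // OTLP JSON encodes 64-bit integers as strings
          : { doubleValue: value } // non-integers must not be sent as intValue
        : typeof value === "boolean"
        ? { boolValue: value }
        : { stringValue: String(value) },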
    })),
  }));

  const body = JSON.stringify({
    resourceSpans: [{
      resource: { attributes: [{ key: "service.name", value: { stringValue: "edge-middleware" } }] },
      scopeSpans: [{ scope: { name: "edge-middleware" }, spans: otlpSpans }],
    }],
  });

  await fetch(`${endpoint}/v1/traces`, {
    method: "POST",
    headers: { "content-type": "application/json", ...headers },
    body,
  });
}
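The exporter above awaits fetch with no timeout and no status check, so a slow collector can hold the deferred export open and a network failure surfaces as a rejected promise. One way to harden it is sketched below, assuming the runtime provides AbortSignal.timeout. The name postWithTimeout, the 2000 ms default, and the injectable fetchImpl parameter are assumptions for illustration, not part of any OpenTelemetry package.

```typescript
// Hypothetical hardening wrapper for the OTLP POST.
// Returns true on a 2xx response, false on any failure; it never throws.
export async function postWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs = 2000, // assumed default; tune to your collector's latency
  fetchImpl: typeof fetch = fetch, // injectable so tests can stub the network
): Promise<boolean> {
  try {
    const res = await fetchImpl(url, {
      ...init,
      signal: AbortSignal.timeout(timeoutMs), // abort if the collector stalls
    });
    if (!res.ok) {
      console.warn(`OTLP export rejected: HTTP ${res.status}`);
      return false;
    }
    return true;
  } catch (err) {
    // Network errors and timeouts land here; telemetry loss should not crash the worker.
    console.warn(`OTLP export failed: ${String(err)}`);
    return false;
  }
}
```

Swallowing the error is a deliberate trade-off in this sketch: a dropped batch of spans is usually preferable to a noisy failure on the request path, but you may want to count failures so silent loss stays visible.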

Step 5: Flush inside ctx.waitUntil

Run the chain, return the response, and hand the export promise to ctx.waitUntil. The isolate stays alive until the POST resolves, but the user never waits for it.

import { SpanCollector } from "./otel-edge";
import { exportSpans } from "./otlp-exporter";

interface Env {
  OTLP_ENDPOINT: string;
  OTLP_API_KEY: string;
}

export default {
  async fetch(req: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const traceparent = req.headers.get("traceparent");
    const traceId = traceparent?.split("-")[1] ?? crypto.randomUUID().replace(/-/g, "");
    const collector = new SpanCollector(traceId);

    const res = await runChain(req, { collector, parentSpanId: traceparent?.split("-")[2] ?? "" });

    ctx.waitUntil(
      exportSpans(env.OTLP_ENDPOINT, { "x-api-key": env.OTLP_API_KEY }, collector.spans),
    );
    return res;
  },
};
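The handler above trusts whatever arrives in the traceparent header. A malformed or all-zero value would produce spans that a backend may reject or misattribute. The following is a sketch of a stricter parser based on the W3C Trace Context field layout (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags); the names parseTraceparent and TraceParent are hypothetical.

```typescript
// Hypothetical validator for the W3C traceparent header.
export interface TraceParent {
  traceId: string;
  parentSpanId: string;
  sampled: boolean;
}

// version-traceid-parentid-flags, lowercase hex; later versions may append fields.
const TRACEPARENT = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})(-.*)?$/;

export function parseTraceparent(header: string | null): TraceParent | undefined {
  if (!header) return undefined;
  const m = TRACEPARENT.exec(header.trim());
  if (!m) return undefined;
  const [, version, traceId, parentSpanId, flags, rest] = m;
  if (version === "ff") return undefined; // reserved, invalid version
  if (version === "00" && rest !== undefined) return undefined; // v00 has exactly four fields
  // All-zero IDs are invalid per the spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return undefined;
  return { traceId, parentSpanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

When the parser returns undefined, mint a fresh trace ID and leave the parent span ID empty, as the handler already does for a missing header.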

Configuration

On Cloudflare, declare the endpoint as a plain variable under vars in wrangler.jsonc, and store the API key as a secret with wrangler secret put OTLP_API_KEY so it never lands in the config file:

{
  "name": "edge-middleware",
  "main": "src/index.ts",
  "compatibility_date": "2026-06-01",
  "vars": { "OTLP_ENDPOINT": "https://otel-collector.example.com" }
}

On Vercel Edge, set OTLP_ENDPOINT and OTLP_API_KEY as Environment Variables and import waitUntil from @vercel/functions. On Netlify Edge Functions, the Deno runtime exposes context.waitUntil; the exporter code is identical.

Local vs production divergence

Aspect | Local (wrangler dev) | Production
ctx.waitUntil | Resolves but may flush before isolate teardown | Genuine post-response background flush
OTLP endpoint | Often a local collector or stdout | Hosted collector behind auth
Trace continuity | You inject traceparent by hand | Upstream proxy supplies it
Sampling | Keep everything for inspection | Head-based sampling on
Clock resolution | performance.now() full precision | May be coarsened to mitigate timing attacks

Note that some edge runtimes deliberately coarsen performance.now() in production to blunt timing side-channels, so treat span durations as approximate, not microbenchmark-grade.
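The table lists head-based sampling for production, but the guide does not show a sampler. One common approach, sketched here under assumptions, is to honor the upstream decision when there is one and otherwise derive a deterministic decision from the trace ID, so every stage of a trace agrees. The function name shouldSample and its parameters are hypothetical.

```typescript
// Hypothetical head sampler: decide once per trace, before any span is exported.
export function shouldSample(
  traceId: string, // 32 lowercase hex characters
  ratio: number, // fraction of traces to keep, between 0 and 1
  upstreamSampled?: boolean, // sampled flag from an incoming traceparent, if any
): boolean {
  // Honor the caller's decision so the trace is not cut in half at the edge.
  if (upstreamSampled !== undefined) return upstreamSampled;
  if (ratio >= 1) return true;
  if (ratio <= 0) return false;
  // Map the low 32 bits of the trace ID onto [0, 1) and compare with the ratio.
  const bucket = parseInt(traceId.slice(-8), 16) / 2 ** 32;
  return bucket < ratio;
}
```

In the fetch handler you would call this once and skip the waitUntil export entirely when it returns false. The approach assumes trace IDs are uniformly random in their low bits, which holds for IDs minted with crypto.getRandomValues or crypto.randomUUID.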

Step 6: Validate with Vitest

Use a fake ExecutionContext whose waitUntil collects promises you can await, and assert that one span per stage was produced with the right parent linkage.

import { describe, it, expect } from "vitest";
import { SpanCollector } from "../src/otel-edge";
import { instrument } from "../src/instrument";

function fakeCtx() {
  const promises: Promise<unknown>[] = [];
  return { waitUntil: (p: Promise<unknown>) => promises.push(p), promises };
}

describe("instrument", () => {
  it("records one span per stage with parent linkage", async () => {
    const collector = new SpanCollector("a".repeat(32));
    const inner = instrument("inner", async () => new Response("ok"));
    const outer = instrument("outer", async (req, ctx, next) => inner(req, ctx, next));

    await outer(new Request("https://x.test/p"), { collector, parentSpanId: "" }, async () => new Response());

    expect(collector.spans).toHaveLength(2);
    const [innerSpan, outerSpan] = collector.spans;
    expect(innerSpan.parentSpanId).toBe(outerSpan.spanId);
    expect(outerSpan.attributes["url.path"]).toBe("/p");
  });

  it("marks the span as error when the stage throws", async () => {
    const collector = new SpanCollector("b".repeat(32));
    const boom = instrument("boom", async () => { throw new Error("fail"); });
    await expect(
      boom(new Request("https://x.test/"), { collector, parentSpanId: "" }, async () => new Response()),
    ).rejects.toThrow("fail");
    expect(collector.spans[0].status).toBe("error");
  });
});

Pitfalls

  • Importing the Node SDK. @opentelemetry/sdk-node and auto-instrumentation pull in Node built-ins and break the build. Use @opentelemetry/api only.
  • Awaiting the export. await exportSpans(...) adds the collector round-trip to every response. Always defer with ctx.waitUntil.
  • Relying on async context. context.with() will not flow the active span reliably at the edge. Pass the parent span ID explicitly through your Context object.
  • Unbounded span arrays. A chain that fans out can accumulate thousands of spans and blow the memory budget. Cap collector size and sample.
  • Dropping trace continuity. If you mint a fresh trace ID even when a valid traceparent arrives, your edge spans detach from the upstream trace. Adopt the incoming trace ID when present.
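The unbounded-array pitfall can be addressed with a small change to the collector. The sketch below is a bounded variant that drops spans past a limit and counts what it dropped; the class name BoundedCollector, the default of 256, and the dropped counter are assumptions, not values from this guide's platform limits.

```typescript
// Hypothetical bounded variant of SpanCollector.
export class BoundedCollector<S> {
  readonly spans: S[] = [];
  readonly traceId: string;
  readonly maxSpans: number;
  dropped = 0; // number of spans discarded after the cap was reached

  constructor(traceId: string, maxSpans = 256) {
    this.traceId = traceId;
    this.maxSpans = maxSpans;
  }

  add(span: S): void {
    if (this.spans.length >= this.maxSpans) {
      this.dropped++; // count the overflow instead of growing without bound
      return;
    }
    this.spans.push(span);
  }
}
```

Exporting the dropped count as an attribute on the root span is one way to keep truncated traces recognizable in the backend.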

Production deployment checklist

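  • Only @opentelemetry/api is installed; no Node SDK or auto-instrumentation packages are in the bundle.
  • Incoming traceparent is adopted when present, so edge spans join the upstream trace.
  • Export runs in ctx.waitUntil and is never awaited on the response path.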

Frequently Asked Questions

Can I use @vercel/otel or a hosted edge OTel package instead?

Yes — on Vercel, @vercel/otel wires the Edge Runtime into an OpenTelemetry collector and is the path of least resistance there. The hand-rolled approach in this guide is provider-agnostic and keeps the bundle minimal, which matters most on Cloudflare’s 1 MB free tier. Pick the hosted package when you are Vercel-only and the bundle budget allows it.

Why pass the parent span ID explicitly instead of using context.with()?

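context.with() only propagates the active span when a context manager is registered, and the Node SDK's context manager is built on AsyncLocalStorage, which is unreliable in V8 isolates. With @opentelemetry/api alone, no context manager is registered, so context.active() always returns the root context. Threading the parent span ID on your own Context object guarantees correct parent-child linkage regardless of how many awaits sit between stages.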

How big can the span collector grow before it is a problem?

Each span is a small object, but a chain that fans out across many subrequests can accumulate thousands and pressure the 128 MB memory ceiling. Cap the collector at a sane bound (for example a few hundred spans) and apply sampling so high-volume routes do not export everything.

Does OTLP require gRPC at the edge?

No. Edge runtimes cannot open raw gRPC sockets, but OTLP/HTTP accepts a JSON payload over a normal fetch POST to the /v1/traces endpoint, which is exactly what this exporter sends.