Torus: How a Node.js Layer 7 Edge Gateway Avoids V8 Heap Pitfalls and Silent Socket Leaks
Torus is a Node.js Layer 7 edge gateway that uses stream.pipeline and zero-heap Buffer handling to prevent socket leaks and sustain high-throughput proxying.
Why Torus forced a rethink of standard request handling
When the Torus team began implementing a multi-core Layer 7 API gateway in Node.js, the initial request-handling code looked familiar: accumulate incoming data chunks into a JavaScript string or buffer, then forward the assembled payload to the backend. That approach is common in web apps, but at proxy scale it behaves disastrously. Torus’s early tests under small loads passed with no drama, yet when concurrent connections and large uploads arrived, the Node process buckled—CPU pegged, event loop latency spiked, and memory climbed until garbage collection pauses or outright crashes occurred. The root cause was simple but easy to miss: pulling raw TCP payloads into V8-managed memory forces the garbage collector to intervene, and GC pauses are fatal for a single-threaded event loop tasked with moving bytes between sockets.
Addressing this problem required shifting from “read into the heap” to “move bytes outside the heap.” That meant exposing the network stream as a flow of native Buffers that live in C++/OS memory, and wiring them through to the backend without copying into V8. Doing so keeps the event loop responsive under sustained load, reduces GC pressure, and lets the gateway scale to handle much larger payloads.
How Node.js Buffers and piping reduce heap pressure
Node.js provides primitives specifically intended for streaming large data without dragging it into the JavaScript heap. Buffer objects allocate memory outside V8 in native space; the node:stream module passes Buffer chunks directly to I/O operations. At a conceptual level, Torus stopped treating payloads as static variables and started treating them as flowing water: a small, transient buffer is filled, flushed to the destination, and then reused.
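The heap/native split is easy to observe directly: process.memoryUsage() reports V8-managed memory (heapUsed) separately from native allocations such as Buffer backing stores (external). A minimal sketch:

```javascript
// Buffer contents live in native memory, not the V8 heap.
// process.memoryUsage() reports the two pools separately:
// "heapUsed" is V8-managed; "external" covers native allocations
// such as Buffer backing stores.
const before = process.memoryUsage();

// Allocate 64 MiB as a Buffer: it lands under "external".
const chunk = Buffer.alloc(64 * 1024 * 1024);

const after = process.memoryUsage();
const mib = (n) => (n / 1048576).toFixed(1);
console.log('heapUsed delta (MiB):', mib(after.heapUsed - before.heapUsed));
console.log('external delta (MiB):', mib(after.external - before.external));
// The external delta is ~64 MiB, while heapUsed barely moves.
```

Because the bytes never enter the V8 heap, the garbage collector has nothing to trace, copy, or pause for, no matter how large the payload.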
The native .pipe() API is attractive because it wires a readable stream to a writable stream in one line and honors backpressure: a fast client cannot overwhelm a slow backend, because the streams machinery pauses the source whenever the destination’s internal buffer fills and resumes it on drain. For many workloads this approach is sufficient and far superior to assembling entire payloads in JavaScript before forwarding. In Torus’s case, switching to streaming pipes dramatically reduced CPU usage and flattened memory consumption even for very large transfers.
The hidden danger: why .pipe() can produce half-open sockets
The improvement seemed complete until integration tests began to hang. Jest reported that the process wouldn’t exit, even though all assertions had passed and teardown hooks ran. At the OS level, the Node process remained alive because an active I/O handle—specifically a socket—persisted in the event loop. The surprising culprit was the very primitive that had helped Torus avoid the heap: .pipe().
Node’s .pipe() is focused on data transfer, not lifecycle orchestration. When a client disconnects abruptly, .pipe() can leave the destination socket open. In practice this creates half-open sockets: the client has gone away, but the backend connection remains, awaiting bytes that will never come. Short runs in development hide the problem, but at scale this behavior quietly consumes file descriptors (FDs) until the OS limit is reached and the process crashes with EMFILE. It’s an especially pernicious class of bug because logs and tests may show successful requests while the process slowly degrades underneath.
How stream.pipeline enforces correct lifecycle semantics
Node’s stream.pipeline was introduced to address exactly this class of problems. Unlike .pipe(), pipeline treats a chain of streams as a single state machine. When any part of the pipeline emits an error, closes, or fails unexpectedly, pipeline tears down the entire chain and provides a single error callback or promise rejection. This guarantees that neither end of a bidirectional flow is left dangling.
For bidirectional TCP proxying—where data flows both client→backend and backend→client—Torus replaced ad-hoc pipe wiring with two coordinated pipelines that are raced together. If either direction fails, pipeline ensures both sockets are destroyed and the proxy’s resources are released immediately. In practice this removed the silent socket leak that had been preventing test processes from terminating and eliminated long-lived, dead FDs in production runs.
Concurrency, backpressure, and multi-core routing in Torus
Torus isn’t just a single-process proxy; it is designed to utilize multiple CPU cores for throughput and isolation. Streaming without copying is necessary but not sufficient for high concurrency: the gateway also needs to manage backpressure and distribute connections across worker processes or threads.
By keeping chunks as OS-managed Buffers and wiring them with pipeline, Torus preserves native backpressure signals. Those signals are then surfaced to the process-level scheduler (cluster workers, worker threads, or an external load balancer) so the system reacts to slow backends by slowing reads from clients. The combination of non-heap Buffers and correct backpressure semantics lets Torus sustain many concurrent large transfers without ballooning the V8 heap or triggering GC stalls.
On multi-core systems, Torus uses a routing pool to balance backend connections and prevent head-of-line blocking. The routing layer treats streams as opaque conduits of bytes: connections are assigned, sockets are proxied bidirectionally, and the monitoring layer observes socket lifecycle events rather than content. This separation of concerns—routing, lifecycle, observability—reduces complexity and makes it easier to reason about resource utilization at scale.
Testing, observability, and preventing FD exhaustion
The Torus development experience highlights an important operational lesson: integration suites must exercise teardown and failure scenarios—not just happy paths. The earlier tests passed functional assertions but missed the process-level leak because client disconnect cases weren’t simulated aggressively. To surface these issues, tests should:
- Simulate abrupt TCP disconnects (ECONNRESET) to validate pipeline cleanup.
- Exercise slow backend behavior to confirm backpressure propagates.
- Monitor FD usage under load to detect leaks early.
- Assert that test runners and CI workers terminate cleanly after teardown.
Observability adds another layer of protection. Torus instruments socket lifecycle events (open, close, error, timeout) and exposes FD metrics, active-socket counts, and per-worker memory/GC stats. These signals enable alerting before OS limits are hit and provide actionable telemetry for debugging.
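A minimal version of this lifecycle instrumentation might look as follows, where instrument would be called with each accepted net.Socket; the counter and log fields are illustrative, not Torus’s actual metric names:

```javascript
// Minimal socket lifecycle instrumentation: track the live-socket count
// and emit a structured log line on every close and error.
let activeSockets = 0;

function instrument(socket) {
  activeSockets++;
  const startedAt = Date.now();

  socket.on('error', (err) => {
    console.log('socket error', { code: err.code });
  });

  socket.once('close', (hadError) => {
    activeSockets--;
    console.log('socket closed', {
      hadError,
      durationMs: Date.now() - startedAt,
      activeSockets,
    });
  });
}
```

Exporting activeSockets as a gauge is what makes a slow FD leak visible long before the EMFILE crash.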
Where Torus fits in the ecosystem and common deployment models
Torus occupies a niche for teams that need programmable, lightweight Layer 7 edge routing implemented in JavaScript but requiring production-grade reliability and throughput. It’s not a one-size-fits-all replacement for established proxies like NGINX, Envoy, or HAProxy; rather, Torus is suited to environments where ease of integrating custom JavaScript logic, rapid iteration, or tight Node.js ecosystem integrations matter.
Typical use cases include:
- API gateways that perform lightweight routing, authentication hooks, or request transformations.
- Edge proxies for microservices architectures where developers want programmable hooks without an external control plane.
- Developer-facing local proxies and testing harnesses that need to match production-level socket behavior.
- Integrations with observability and automation pipelines where Node-based adapters simplify tooling.
In production, Torus is commonly paired with security tooling (WAFs, TLS termination), API management layers (rate limiting, quotas), and automation platforms (CI/CD pipelines that deploy routing rules). It also plays well with developer tools and observability stacks: logs, tracing, and metrics collection integrate naturally with Node.js ecosystems.
Developer implications: writing robust networked Node.js code
Torus’s evolution surfaces several best practices for engineers building networked systems in Node.js:
- Treat large payloads as streams and keep them out of the V8 heap whenever possible.
- Prefer APIs that enforce lifecycle semantics over convenience shorthands when building infrastructure: pipeline > pipe for production proxies.
- Design tests to simulate adverse network conditions (disconnects, resets, slow peers) and assert that the process exits cleanly.
- Monitor low-level OS resources (FDs, sockets, ephemeral ports) in addition to application metrics.
- Separate routing logic from payload processing to minimize the surface area that interacts with large data.
In addition to these engineering principles, building reliable infrastructure requires attention to deployments: rolling restarts, graceful shutdown protocols, readiness probes, and resource limits should be configured to account for socket teardown semantics. For Node-based gateways, graceful shutdown must wait for pipeline completion or forcibly destroy sockets after a reasonable timeout to prevent indefinite hangs.
Security, automation, and ecosystem integration
A robust edge gateway needs more than fast byte plumbing. Torus integrates with security and automation ecosystems to provide a complete operational solution. Typical integrations include:
- TLS termination and certificate automation (ACME clients or managed certificate services).
- Web application firewalls and DDoS mitigation services that apply policies at Layer 7.
- Authentication and identity providers for API access control, including OAuth flows and JWT validation.
- CI/CD automation that deploys routing rules and policy updates as code.
- Observability integrations—APM, tracing, and logging—that provide end-to-end visibility into request paths.
There’s also an emerging intersection with AI tools and automation platforms. For example, teams building intelligent traffic routing or anomaly detection can feed Torus metrics into machine learning pipelines to create policy-driven rerouting or adaptive rate limiting. Likewise, CRM and marketing platforms that rely on webhooks benefit from resilient edge proxies that can buffer, retry, and transform events without risking FD leaks or GC interruptions.
Operational patterns: graceful shutdowns and backpressure-aware scaling
To operate a proxy reliably, you must coordinate graceful termination at multiple levels. Torus adopts these operational patterns:
- On shutdown, stop accepting new connections immediately and let active pipelines drain within a configurable timeout.
- Use pipeline promises to detect when all active streams are settled; after the drain period, perform forced socket destruction to avoid lingering handles.
- Employ load shedding when system-wide FD or memory pressure rises, using graceful reject responses rather than allowing uncontrolled resource depletion.
- Autoscale worker processes based on low-level indicators like active sockets per worker and per-worker memory/GC behavior rather than only high-level request rates.
These practices reduce the risk of EMFILE-style crashes and make scaling decisions predictable. They also dovetail with container orchestration patterns: readiness and liveness probes should reflect pipeline draining behavior so orchestrators don’t prematurely kill processes still finishing active flows.
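The drain-then-force pattern described above can be sketched as follows, assuming each live connection registers a record in an activeFlows set containing its socket and a promise that settles when its pipelines finish; the names and the default timeout are illustrative, not Torus internals:

```javascript
// Graceful shutdown: stop accepting, drain within a bounded window,
// then force-destroy whatever is still open.
const activeFlows = new Set(); // { socket, done: Promise } per connection

async function gracefulShutdown(server, drainTimeoutMs = 10_000) {
  // 1. Stop accepting new connections immediately.
  server.close();

  // 2. Give active pipelines a bounded window to settle.
  const drained = Promise.allSettled([...activeFlows].map((f) => f.done));
  let timer;
  const expired = new Promise((resolve) => {
    timer = setTimeout(resolve, drainTimeoutMs);
  });
  await Promise.race([drained, expired]);
  clearTimeout(timer);

  // 3. Force-destroy remaining sockets so no lingering handle keeps
  //    the process alive past the drain window.
  for (const flow of activeFlows) flow.socket.destroy();
  activeFlows.clear();
}
```

Wiring a readiness probe to report unhealthy as soon as step 1 runs gives the orchestrator time to stop routing traffic before step 3 fires.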
Broader implications for infrastructure and developer tooling
Torus’s experience illustrates a broader truth: high-level language ecosystems can host infrastructure, but developers must understand the platform’s runtime model deeply. The convenience of .pipe() or similar shorthands in tutorials can obscure lifecycle semantics that matter at scale. As more infrastructure is implemented in managed runtimes—Node.js, Python, JavaScript-based control planes—operators and platform engineers need to balance developer productivity against the hidden costs of runtime-managed heaps, GC pauses, and incomplete lifecycle propagation.
This has implications for vendor and open-source choices: teams evaluating gateways should weigh not just feature sets but how those tools handle resource boundaries, backpressure, and lifecycle events. Tooling for testing network failure modes, FD leakage, and GC behavior becomes part of the standard ops toolbox. In the long run, the community will benefit from clearer documentation, safer defaults (like pipeline), and more robust test suites that simulate the kinds of failures proxies encounter in production.
Torus is an example of how platform-level awareness—understanding the V8 heap, buffer allocation semantics, and OS file descriptors—translates directly into operational reliability. Developers building on Node.js should expect that production-level solutions require more than idiomatic code snippets; they require system-level thinking.
Torus’s story also nudges platform teams toward richer integration points: automated policies for connection teardown, standardized backpressure metrics for autoscaling, and better runtime primitives that make safe proxying the default. When the platform and libraries provide clearer contracts around lifecycle and resource ownership, application developers can focus on business logic without risking silent resource exhaustion.
Looking ahead, the path Torus followed points to incremental improvements in Node.js networking ergonomics and ecosystem tooling. Expect better libraries and higher-level frameworks that encapsulate safe streaming patterns, greater adoption of pipeline-style APIs in examples and tutorials, and more sophisticated telemetry standards for socket-level metrics. For organizations, the lesson is to instrument aggressively, test failure modes in CI, and prefer lifecycle-aware mechanisms when building infrastructure that must run continuously under unpredictable network conditions.