Node.js Graceful Shutdown: Handle SIGTERM, Drain Requests and Close Connections

Node.js graceful shutdown: how to stop accepting traffic, drain requests, and clean up resources without 502s

Practical guide to implementing graceful shutdown in Node.js: handle SIGTERM/SIGINT, drain connections, return 503 health checks, and clean up DBs and queues.

Node.js services frequently run in orchestrated environments where processes are stopped and restarted routinely. A naive process termination—receiving SIGTERM and exiting immediately—turns in-flight requests into 502s, leaves database connections open, corrupts message-processing state, and creates noisy outages for users and downstream systems. This article explains a practical, production-ready approach to graceful shutdown in Node.js: how to stop accepting new traffic, drain in-flight work with sensible timeouts, and close external resources such as database pools, Redis clients, and message brokers before exiting.

Why graceful shutdown matters for Node.js services

Graceful shutdown is the controlled sequence that a service follows when it is asked to stop. For Node.js applications, that sequence matters because the runtime runs a single-threaded event loop with asynchronous I/O. If you terminate a process immediately, in-flight HTTP requests, queued jobs, or open database transactions can be interrupted midway. The result is bad client UX (502/499 errors), partial writes, stuck locks, and inconsistent metrics. In containerized and cloud environments—Kubernetes, AWS ECS, or a load-balanced fleet—termination happens often (rolling updates, scaling events, drain operations), so implementing a repeatable shutdown pattern protects availability and data integrity.

The basic shutdown pattern for Node.js HTTP servers

A standard shutdown sequence has these steps: stop accepting new connections, inform load balancers and health checks to drain traffic, wait for in-progress work to finish (with a deadline), close connections to external systems, flush logs, and finally exit the process.

Below is an illustrative Node.js pattern. It keeps types and APIs generic so you can adapt it to Express, Fastify, or plain http servers, and it shows the core flow without relying on framework-specific helpers.

js
// server-shutdown.js
const http = require('http');

// Placeholder handler; swap in your Express/Fastify app or own request listener.
const app = (req, res) => res.end('ok');

// Placeholder cleanup functions; replace with real calls that return promises.
const closeDbPool = async () => {};
const closeRedis = async () => {};
const closeMessageClient = async () => {};
const flushLogs = async () => {};

let shuttingDown = false;
const server = http.createServer(app);

// track open sockets (optional but recommended)
const activeConnections = new Set();
server.on('connection', socket => {
  activeConnections.add(socket);
  socket.on('close', () => activeConnections.delete(socket));
});

async function closeExternalResources() {
  // close DB pools, Redis clients, message broker consumers, flush logs, etc.
  await Promise.all([closeDbPool(), closeRedis(), closeMessageClient(), flushLogs()]);
}

function forceExit(code = 1) {
  // destroy lingering sockets to avoid hanging forever
  activeConnections.forEach(s => {
    try { s.destroy(); } catch (e) { /* ignore: socket is already gone */ }
  });
  process.exit(code);
}

async function gracefulShutdown(signal) {
  if (shuttingDown) return;
  shuttingDown = true;
  console.info(`Received ${signal}, starting graceful shutdown`);

  // start a hard timeout to force exit if draining or cleanup hangs
  const hardTimeout = setTimeout(() => {
    console.error('Shutdown timed out, forcing exit');
    forceExit(1);
  }, 30_000); // 30s, tune per app

  try {
    // stop accepting new TCP connections and give existing requests time to finish
    await new Promise(resolve => {
      server.close(err => {
        if (err) console.error('Error closing server', err);
        resolve();
      });
    });
    // only now release resources that in-flight requests may still have needed
    await closeExternalResources();
    clearTimeout(hardTimeout);
    console.info('Cleanup complete, exiting');
    process.exit(0);
  } catch (err) {
    console.error('Error during shutdown', err);
    clearTimeout(hardTimeout);
    forceExit(1);
  }
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));

server.listen(process.env.PORT || 3000);

Key points in this pattern:

server.close() stops accepting new connections but allows existing requests to complete.
Track socket objects so you can forcibly destroy idle keep-alive connections after your graceful window.
Use a hard timeout to avoid indefinite hangs—process managers (Docker, systemd, Kubernetes) will escalate to SIGKILL if your process doesn’t exit within their configured grace period.
Close external resources (databases, caches, queues) in parallel when safe; await their completion before exiting.

Health check behavior and load balancer coordination

Returning 503 from health endpoints during shutdown is an effective signal to upstream load balancers and service registries that they should stop routing new requests to this instance. Add a lightweight health route that responds differently when the application is shutting down:

When not shutting down: return 200 OK and basic dependency checks as needed.
When shutting down: return 503 immediately to cause load balancers to stop sending new traffic.

Example logic (middleware or dedicated route):

If shuttingDown flag is true, respond 503 with body { status: "shutting_down" }.
Otherwise, perform fast checks (database reachable, minimal latency) and return 200.
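
The logic above could be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: healthResponse, createHealthHandler, isShuttingDown, and checkDependencies are hypothetical names, and the "unhealthy" body for a failed dependency check is an assumption that goes beyond the two cases listed.

```js
// Pure decision function: maps shutdown state and dependency health to a response.
function healthResponse(shuttingDown, dependenciesOk) {
  if (shuttingDown) {
    return { statusCode: 503, body: { status: 'shutting_down' } };
  }
  return dependenciesOk
    ? { statusCode: 200, body: { status: 'ok' } }
    : { statusCode: 503, body: { status: 'unhealthy' } }; // assumed extra case
}

// Builds a plain http-style handler. `isShuttingDown` and `checkDependencies`
// are hypothetical callbacks supplied by the application.
function createHealthHandler(isShuttingDown, checkDependencies) {
  return async function healthHandler(req, res) {
    const down = isShuttingDown();
    let ok = false;
    if (!down) {
      // Only run dependency checks while the instance is still serving.
      try { ok = await checkDependencies(); } catch { ok = false; }
    }
    const { statusCode, body } = healthResponse(down, ok);
    res.writeHead(statusCode, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  };
}
```

Keeping the decision in a pure function makes the 503 behavior easy to unit test without starting a server; the handler can then be mounted on whatever route your health checks already use.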

In Kubernetes, readiness probes are the right mechanism to remove pods from service. Toggling your readiness probe to fail before initiating shutdown makes the pod stop getting new traffic. Note that Kubernetes will still send SIGTERM to the container; readiness should fail quickly so the controller drains endpoints and clients stop sending new requests before you begin long cleanup work.

For external load balancers (AWS ALB/NLB, GCP), ensure they honor health check responses and connection draining policies. Some platforms also support connection draining at the load balancer level; combine that with your application returning 503 to ensure a smooth transition.

Cleaning up external resources: databases, caches, and message systems

A graceful shutdown is more than closing the HTTP listener. Common cleanup tasks include:

Close database connection pools: allow active queries to finish and then call pool.end()/close() to release sockets.
Disconnect from Redis or Memcached: ensure any pending transactions or pub/sub subscriptions are gracefully closed.
Stop message consumer loops (RabbitMQ, Kafka, SQS): stop accepting new messages and finish processing in-flight messages before acknowledging or committing offsets.
Flush buffered logs: ensure all log entries are written to disk or sent to a remote sink.
Deregister from service discovery (Consul, Eureka): remove the instance so clients don’t attempt to contact it after it is down.

Order matters. For example, stop accepting new messages from a queue before you close database connections required to finish processing them. If you close the DB pool too early, message handlers may fail while trying to persist results.
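
One way to make that ordering explicit is to run cleanup as a named sequence of steps. The sketch below assumes a hypothetical runShutdownSteps helper, and the client objects in the commented usage (consumer, db, redis, logger) and their method names are placeholders, not real library APIs. Continuing after a failed step is a design choice; some services may prefer to abort instead.

```js
// Run cleanup steps strictly in order. A failing step is logged and the
// remaining steps still run, so one broken client does not block the rest.
async function runShutdownSteps(steps, log = console) {
  const failures = [];
  for (const [name, fn] of steps) {
    try {
      await fn();
      log.info(`shutdown step done: ${name}`);
    } catch (err) {
      failures.push(name);
      log.error(`shutdown step failed: ${name}`, err);
    }
  }
  return failures; // names of the steps that threw
}

// Example ordering with hypothetical clients: consumers stop first so their
// handlers can still reach the database, which closes afterwards.
// const failures = await runShutdownSteps([
//   ['stop consumers', () => consumer.stop()],
//   ['close db pool', () => db.end()],
//   ['close redis', () => redis.quit()],
//   ['flush logs', () => logger.flush()],
// ]);
```

Steps that are independent of each other can still be grouped into a single entry that wraps Promise.all, which keeps the parallelism where it is safe and the ordering where it matters.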

Handling in-flight requests and connection quirks

server.close() waits until all open connections are closed, but several things can keep a connection alive unexpectedly:

HTTP keep-alive sockets: clients that reuse the connection may hold it open even with no active request.
Long-polling, WebSocket, or streaming responses: these are intentionally long-lived and must be closed explicitly.
Slow clients that stop reading, causing the server to wait at socket level.

To manage these cases:

Track active sockets and requests. Mark sockets as idle or active and destroy idle ones after your shutdown timeout.
For WebSocket or long-lived streams, implement application-level shutdown logic to send a close frame or end the stream.
Enforce per-request timeouts to avoid runaway handlers during normal operation and shutdown.

A common pattern is: set the shuttingDown flag, stop accepting new connections, start a grace period during which you allow requests to complete, then forcibly destroy remaining sockets when the grace period ends.
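
A rough sketch of the idle/active bookkeeping might look like this. It assumes a plain http.Server (or anything that emits the same 'connection' and 'request' events); trackConnections and its returned helpers are hypothetical names, and WebSocket or streaming connections would still need their own application-level close logic.

```js
// Attach to an http.Server; returns helpers for use during shutdown.
function trackConnections(server) {
  const sockets = new Map(); // socket -> number of responses still in progress

  const register = socket => {
    if (sockets.has(socket)) return;
    sockets.set(socket, 0);
    socket.once('close', () => sockets.delete(socket));
  };

  server.on('connection', register);

  server.on('request', (req, res) => {
    const socket = req.socket;
    register(socket);
    sockets.set(socket, sockets.get(socket) + 1);
    let settled = false;
    const done = () => {
      if (settled) return; // 'finish' and 'close' can both fire for one response
      settled = true;
      if (sockets.has(socket)) sockets.set(socket, sockets.get(socket) - 1);
    };
    res.once('finish', done);
    res.once('close', done);
  });

  return {
    counts() {
      let active = 0;
      for (const pending of sockets.values()) if (pending > 0) active += 1;
      return { active, idle: sockets.size - active };
    },
    // Destroy only sockets with no request in progress (idle keep-alive).
    destroyIdle() {
      for (const [socket, pending] of sockets) if (pending === 0) socket.destroy();
    },
    // Destroy everything; intended for the end of the grace period.
    destroyAll() {
      for (const socket of sockets.keys()) socket.destroy();
    },
  };
}
```

During shutdown you could call destroyIdle() right after server.close() and destroyAll() when the grace period expires. On Node.js 18.2 and later, the built-in server.closeIdleConnections() and server.closeAllConnections() cover similar ground without manual tracking.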

Signals, process managers, and platform-specific details

Different environments can send different signals or have different expectations:

Docker: docker stop sends SIGTERM and then SIGKILL after the stop timeout (default 10s). Increase the timeout for containers that need longer to shut down.
systemd: systemd can be configured with TimeoutStopSec and KillSignal; coordinate your service file so processes have enough time to clean up.
Kubernetes: upon deletion or rolling update, kubelet sends SIGTERM and waits terminationGracePeriodSeconds (default 30s) before SIGKILL. Kubernetes also manages endpoint removal; ensure your readiness probe fails early or use preStop hooks to coordinate.
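Windows: POSIX signals are only partially emulated. Node.js lets you register a SIGTERM listener, but Windows does not deliver that signal, and sending SIGTERM or SIGINT with process.kill() terminates the target unconditionally; SIGINT from Ctrl+C in a terminal does work. Use platform-appropriate graceful handling or process manager features.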

Also mind process managers like PM2 or nodemon: they may wrap signals and require specific configuration to let your app receive signals directly.

Testing shutdown behavior and observability

Testing graceful shutdown is essential. Suggested approaches:

Local testing: send SIGTERM to the process (kill -TERM <pid>) while running a load generator that keeps several long requests in flight. Observe that new requests get a 503 or fail fast while in-flight requests complete.
Container testing: run the container with a shorter application timeout and use docker stop to ensure the container gets SIGTERM and then SIGKILL.
Kubernetes: kubectl delete pod or use kubectl rollout restart to observe real orchestration behavior. Adjust readiness probes and terminationGracePeriodSeconds as needed.
Simulate resource failures: ensure that if a dependent database is slow to close, your application logs the issue and eventually forces exit, so orchestrators can continue with replacement pods.
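
The slow-dependency scenario in the last item can be exercised with a per-step deadline. The helper below is a sketch under assumptions: withTimeout is a hypothetical name, and it only stops waiting for the slow task; it does not cancel the underlying work.

```js
// Resolve with the task's result, or reject if it takes longer than `ms`.
function withTimeout(task, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is always cleaned up.
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}

// Simulated usage: a fake database close that takes far too long.
// const slowClose = new Promise(resolve => setTimeout(resolve, 60_000));
// await withTimeout(slowClose, 5_000, 'db close'); // rejects after 5s
```

In a test, replacing a real cleanup call with a deliberately slow promise like the one in the comment lets you confirm that the rejection is logged and that the process still reaches its forced exit.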

Add observability hooks:

Emit shutdown lifecycle events to logs and monitoring systems (metrics for shutdown started/completed).
Track the number of active requests and sockets via Prometheus metrics or your telemetry system.
Expose a readiness/health check endpoint and log when it switches to unhealthy.
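
As a small illustration of lifecycle events, the sketch below records each shutdown phase with its elapsed time and writes it as a JSON log line. createShutdownRecorder, the phase names, and the activeRequests field are hypothetical; a real service would likely forward the same data to its metrics system as well.

```js
// Records shutdown phases with elapsed time since the recorder was created.
// `log` and `now` are injectable so the recorder can be tested deterministically.
function createShutdownRecorder(log = console, now = Date.now) {
  const startedAt = now();
  const events = [];
  return {
    events,
    mark(phase, extra = {}) {
      const event = { event: 'shutdown', phase, elapsedMs: now() - startedAt, ...extra };
      events.push(event);
      log.info(JSON.stringify(event)); // one structured log line per phase
      return event;
    },
  };
}

// Possible usage inside a shutdown handler:
// const recorder = createShutdownRecorder();
// recorder.mark('readiness_failed');
// recorder.mark('drain_complete', { activeRequests: 0 });
// recorder.mark('resources_closed');
```

Having elapsed times per phase in the logs makes it easier to see, after the fact, which step consumed the grace period.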

Patterns for different deployment models

The details vary depending on where you run Node.js:

Kubernetes: toggle readiness probe to fail before starting cleanup, use terminationGracePeriodSeconds to allow for draining, and consider a preStop hook for synchronous deregistration from service discovery.
AWS ECS / EC2 instances behind an ALB: ensure target group deregistration and connection-draining settings coincide with your app’s shutdown window.
Serverless: many serverless runtimes (AWS Lambda) have short-lived containers and different lifecycle semantics—design idempotent handlers and avoid relying on long shutdown windows.
Traditional VMs with systemd: set TimeoutStopSec and use ExecStop to coordinate cleanup steps if needed.

Choose values for your grace period and hard timeout based on typical request latency, queue processing time, and recovery expectations. For user-facing HTTP endpoints, a shorter, well-defined window (tens of seconds) usually works. For background processors that must finish long-running jobs, you may need orchestrator-level controls (drain before terminating, migrate work) or architecture changes (move long jobs to durable queues with at-least-once semantics).
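
To make the arithmetic concrete, the sketch below splits an orchestrator grace period into a drain window and a hard deadline. The function name shutdownBudget and the default safety margin (2s) and cleanup reserve (5s) are illustrative assumptions, not recommendations; derive real values from your own latency data.

```js
// Split the orchestrator's grace period into a drain window and a hard deadline.
// hardTimeoutMs: when the app forces its own exit, just before SIGKILL would arrive.
// drainMs: how long in-flight requests may run before cleanup has to start.
function shutdownBudget(gracePeriodMs, { safetyMarginMs = 2000, cleanupMs = 5000 } = {}) {
  const hardTimeoutMs = Math.max(0, gracePeriodMs - safetyMarginMs);
  const drainMs = Math.max(0, hardTimeoutMs - cleanupMs);
  return { drainMs, hardTimeoutMs };
}

// Kubernetes default grace period of 30s:
// shutdownBudget(30000) → { drainMs: 23000, hardTimeoutMs: 28000 }
// Docker default stop timeout of 10s:
// shutdownBudget(10000) → { drainMs: 3000, hardTimeoutMs: 8000 }
```

With the 10s Docker default, only about 3s remain for draining under these assumptions, which suggests either raising the stop timeout or trimming the cleanup reserve for services with slower requests.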

Developer and business implications

Implementing graceful shutdown affects teams and product outcomes in multiple ways:

Reliability: fewer user-visible errors during deploys and scale events reduces support load and improves SLAs.
Operational predictability: processes that exit cleanly produce clearer metrics and fewer partial failures in downstream systems.
Developer ergonomics: standard shutdown hooks become part of the codebase lifecycle and simplify debugging and local testing.
Deployment strategy: faster, safer rollouts become possible when services can drain gracefully; this can reduce the need for conservative deployment windows or maintenance windows.

From a business perspective, graceful termination lowers the risk of data corruption, accidental double-processing, and customer-visible downtime—consequences that compound over repeated, automated deploys.

Common pitfalls and how to avoid them

Forgetting to fail readiness: If the app doesn’t tell orchestrators it’s no longer ready, the load balancer will keep sending traffic and the app will continue to accept requests even during shutdown.
Closing external resources too early: If you close a DB pool before finishing message processing, handlers will fail.
Relying on server.close() alone: long-lived keep-alive sockets can keep the process alive indefinitely; track and close sockets explicitly.
Hard timeouts set too low: if the grace period is shorter than typical requests, you’ll still generate errors.
Not handling errors during shutdown: log and handle exceptions during cleanup so the process exits cleanly with an informative status.

Implementation checklist for production readiness

Add a shuttingDown flag and expose a health/readiness endpoint that returns 503 when set.
Call server.close() to stop accepting new TCP connections.
Track sockets and requests to forcibly remove lingering connections after the grace period.
Stop message consumers and close database/Redis connections after they finish work.
Flush log buffers and wait for completion if your logging sink is asynchronous.
Use a hard timeout slightly shorter than the orchestrator’s kill timeout.
Test in staging with real traffic patterns and replicas of production dependencies.
Instrument shutdown steps with logs and metrics for postmortems.

Graceful shutdown is an operational discipline—combine code-level hooks with orchestration settings (readiness probes, terminationGracePeriodSeconds, docker stop timeouts) to make rollouts and autoscaling safe and predictable.

Looking ahead, runtime and framework ecosystems are moving toward more opinionated lifecycle hooks and better tooling to manage termination semantics. Expect frameworks and service meshes to offer richer built-in drain and deregistration primitives, and for observability tooling to provide more automated checks for shutdown-related regressions. In the meantime, applying the patterns described here—coordinated readiness toggling, tracking and terminating lingering sockets, orderly resource cleanup, and robust testing—will make Node.js services far more resilient during routine maintenance and unexpected restarts.