Nitro memory management

Although Nitro is a Go application, it can use significantly more memory than Go's runtime reports. This is because Nitro relies on multiple allocators: the Go garbage-collected heap, CGO (Go's mechanism for calling C code) allocations via calloc, and direct mmap system calls, each with its own accounting. Understanding where memory lives and which configuration knobs control it is essential for sizing containers, setting GOMEMLIMIT, and avoiding out-of-memory (OOM) kills. Nitro also includes built-in runtime memory protection settings. These can throttle RPC requests or pause block validation when free memory runs low, providing a safety net against OOM kills.

Memory allocators in Nitro

Nitro's resident memory (RSS) is dominated by four allocator categories (native thread stacks, covered later, add a smaller amount on top):

| Allocator | What uses it | Visible in Go memstats? | Controlled by |
| --- | --- | --- | --- |
| Go heap | State trie (dirty), transaction processing, goroutine stacks, general application data | Yes | GOMEMLIMIT, trie-dirty-cache |
| calloc | Pebble block cache, Pebble memtables, Stylus WASM cache | No | database-cache, stylus-lru-cache-capacity |
| mmap | fastcache (trie-clean and snapshot caches) | No | trie-clean-cache, snapshot-cache |
| glibc malloc arenas | Per-thread arena overhead for CGO allocations | No | MALLOC_ARENA_MAX |

Only the Go heap is subject to Go's garbage collector and GOMEMLIMIT. The CGO and mmap allocations are invisible to Go's runtime. They don't appear in runtime.MemStats or standard Go memory profiles, but they still consume container memory and count toward your memory limit.

Go heap

The Go runtime manages its own heap for all pure-Go allocations. Key consumers include:

  • Dirty trie cache (trie-dirty-cache): Modified state trie nodes held in memory before being flushed to disk. Defaults to 1024 MB and is one of the largest bounded caches on the Go heap.
  • Contract code cache: An LRU cache of contract bytecode, hardcoded at 256 MB and not configurable.
  • Activated WASM cache: Compiled Stylus WASM modules cached on the Go heap, hardcoded at 64 MB.
  • fastcache index maps: Although fastcache stores its data via mmap, each instance maintains a Go-side index (bucket maps of uint64 to uint64). With two large fastcache instances (trie-clean and snapshot), this index metadata can consume hundreds of MB on the Go heap.
  • Snapshot diff layers: Up to 128 diff layers can accumulate, each holding Go maps of modified accounts and storage slots.
  • Goroutine stacks, block/receipt caches, and GC overhead: Goroutine stacks, recently accessed blocks/receipts, and Go's own GC metadata collectively add further pressure.

Go reports its total memory usage via runtime.MemStats.Sys, which includes the heap, stack space, and GC metadata. This is the portion of memory that GOMEMLIMIT governs (more precisely, Sys minus HeapReleased, the memory the runtime has already returned to the OS).

CGO allocations (Pebble and Stylus)

Nitro's on-disk database, Pebble, allocates its block cache and memtables through CGO calloc() calls (see pebble/internal/manual/manual.go in the source). These allocations go through the C memory allocator and are out of scope for Go's memory tracking.

Pebble block cache is the largest CGO consumer. It caches frequently read database blocks in memory to avoid disk I/O. Its size is set directly by the database-cache configuration parameter.

Pebble memtables buffer recent writes before they are flushed to disk. Nitro configures four memtables, each sized at database-cache / 8, for a combined maximum of database-cache / 2. For the default database-cache of 2048 MB, this means up to 1024 MB of memtable space (four memtables of 256 MB each).
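The memtable arithmetic above can be checked with a short sketch. The function name and the Python form are illustrative assumptions; Nitro performs the equivalent sizing in Go, and only the ratios (four memtables, each database-cache / 8) come from the text.

```python
def pebble_cgo_budget_mb(database_cache_mb: int, memtable_count: int = 4) -> dict:
    """Estimate Pebble's CGO memory (in MB) from the database-cache setting."""
    memtable_mb = database_cache_mb // 8            # each memtable is database-cache / 8
    memtables_total_mb = memtable_mb * memtable_count
    return {
        "block_cache_mb": database_cache_mb,        # block cache equals database-cache
        "memtable_mb": memtable_mb,
        "memtables_total_mb": memtables_total_mb,   # worst case: all memtables full
        "pebble_total_mb": database_cache_mb + memtables_total_mb,
    }


print(pebble_cgo_budget_mb(2048))
```

For the default of 2048 MB this gives 256 MB per memtable, 1024 MB of memtables, and 3072 MB for Pebble as a whole.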

Stylus WASM cache stores compiled WebAssembly modules for Stylus smart contracts. Rust allocates this cache (invoked through CGO), and stylus-lru-cache-capacity bounds its size.

Raw mmap allocations (fastcache)

Two caches use fastcache, a library that allocates memory via direct mmap system calls, bypassing both Go's allocator and CGO:

  • Trie-clean cache (trie-clean-cache): Caches unchanged state trie nodes. Default: 600 MB.
  • Snapshot cache (snapshot-cache): Caches state snapshot data for fast reads. Default: 400 MB.

Because fastcache uses raw mmap, this memory doesn't appear in Go's memstats or standard profiling tools. You can only see it by inspecting /proc/<pid>/smaps at the OS level. Each fastcache instance allocates memory in 64 MB chunks, making these regions identifiable when analyzing process memory maps.
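As a rough aid for the smaps inspection described above, the sketch below lists large unnamed mappings and their resident size. The function name and the 64 MB default threshold are assumptions for illustration. Treat the output as a hint rather than proof: glibc arenas also reserve 64 MB regions, and the kernel may merge adjacent anonymous mappings into a single larger entry.

```python
import re
from typing import List, Tuple

# Mapping header, e.g. "7f1c00000000-7f1c04000000 rw-p 00000000 00:00 0 [heap]"
_HEADER = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\d+\s*(.*)$")


def large_anonymous_regions(smaps_text: str,
                            min_size_kb: int = 64 * 1024) -> List[Tuple[int, int]]:
    """Return (size_kb, rss_kb) for unnamed mappings of at least min_size_kb."""
    regions = []
    current = None  # [size_kb, rss_kb] of the mapping being read, if it qualifies
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            start, end, name = header.groups()
            size_kb = (int(end, 16) - int(start, 16)) // 1024
            # Keep only mappings with no pathname (anonymous) that are large enough.
            if not name.strip() and size_kb >= min_size_kb:
                current = [size_kb, 0]
                regions.append(current)
            else:
                current = None
        elif current is not None and line.startswith("Rss:"):
            current[1] = int(line.split()[1])  # value is reported in kB
    return [(size, rss) for size, rss in regions]


# Usage on a Linux host (replace 1234 with the Nitro process ID):
#   with open("/proc/1234/smaps") as f:
#       for size_kb, rss_kb in large_anonymous_regions(f.read()):
#           print(f"size={size_kb // 1024} MB rss={rss_kb // 1024} MB")
```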

glibc malloc arenas

When Nitro makes CGO calls (for Pebble, Stylus, etc.), the resulting C-side allocations go through the system's default C memory allocator: glibc malloc. Unlike Go's garbage-collected heap, malloc manages memory by requesting large regions from the OS and subdividing them to satisfy individual allocation requests. Freed memory is returned to the allocator's internal free lists rather than immediately back to the OS, so the process's RSS can remain elevated even after allocations are freed.

To handle concurrent allocations efficiently, glibc malloc uses arenas, which are independent memory pools, each with its own lock. When a thread allocates memory, it picks an arena, reducing contention compared to a single global lock. By default, glibc creates up to 8 × CPU_count arenas, each reserving a 64 MB region. The worst-case overhead for arenas is:

Arena overhead = 8 × CPU_count × 64 MB

In containerized environments, glibc detects the underlying host CPU count (not the container's CPU requests), which often results in far more arenas than needed. As the process runs and more threads make CGO calls, glibc creates and retains new arenas, causing RSS to drift upward over days or weeks even though no individual allocation is leaking.

This can be controlled with the MALLOC_ARENA_MAX environment variable:

MALLOC_ARENA_MAX=2

Setting MALLOC_ARENA_MAX=2 caps glibc to two arenas, reducing worst-case arena overhead from gigabytes to ~128 MB. In testing, this eliminated the slow memory growth with no measurable performance impact on RPC throughput.
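To get a feel for the numbers, the worst-case formula above can be evaluated for a few host sizes. This is a minimal sketch: the function name is hypothetical, and the constants are the 8 arenas per CPU and 64 MB per arena stated in this section.

```python
from typing import Optional

ARENA_SIZE_MB = 64     # per-arena reservation, as stated above
ARENAS_PER_CPU = 8     # glibc default multiplier, as stated above


def worst_case_arena_overhead_mb(host_cpu_count: int,
                                 malloc_arena_max: Optional[int] = None) -> int:
    """Worst-case arena reservation in MB, with or without MALLOC_ARENA_MAX."""
    if malloc_arena_max is not None:
        arenas = malloc_arena_max                   # explicit cap replaces the default
    else:
        arenas = ARENAS_PER_CPU * host_cpu_count    # default scales with host CPUs
    return arenas * ARENA_SIZE_MB


for cpus in (4, 16, 64):
    default_mb = worst_case_arena_overhead_mb(cpus)
    capped_mb = worst_case_arena_overhead_mb(cpus, malloc_arena_max=2)
    print(f"{cpus} CPUs: default {default_mb} MB, MALLOC_ARENA_MAX=2 {capped_mb} MB")
```

On a 64-CPU host the default worst case is 32,768 MB, while the capped value stays at 128 MB regardless of CPU count.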

caution

Without MALLOC_ARENA_MAX, a Nitro node on a large host can accumulate gigabytes of arena overhead that appears as a "memory leak" because RSS grows steadily while Go reports stable usage. This is the most common cause of unexplained memory growth in long-running Nitro nodes.

Thread stacks

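Nitro spawns native threads for CGO operations (Pebble, compression libraries) and Stylus execution. Each thread's stack lives outside the Go heap. The sizing example in this document budgets about 300 MB for thread stacks, at roughly 2 MB per thread, though the actual figure varies by workload.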

Cache configuration reference

All cache sizes are configured under execution.caching:

| Parameter | Default | Allocator | Description |
| --- | --- | --- | --- |
| database-cache | 2048 MB | CGO (calloc) | Pebble block cache size. Also determines memtable sizes. |
| trie-dirty-cache | 1024 MB | Go heap | Modified trie nodes awaiting flush to disk. |
| trie-clean-cache | 600 MB | mmap (fastcache) | Unchanged trie nodes cached for read performance. |
| snapshot-cache | 400 MB | mmap (fastcache) | State snapshot data for fast lookups. |
| stylus-lru-cache-capacity | 256 MB | Rust (via CGO) | Compiled Stylus WASM modules. |

tip

All of these caches are bounded by configuration and won't grow beyond their configured limits. This means total non-Go memory is predictable and can be calculated from your configuration.

Calculating GOMEMLIMIT

GOMEMLIMIT is an environment variable that sets a soft memory limit for the Go runtime. When set, Go's garbage collector (GC) runs more aggressively as heap usage approaches the limit, helping to keep total Go memory usage below the target. Without it, the GC relies solely on the GOGC environment variable (which defaults to 100, meaning the GC triggers when the heap doubles in size since the last collection) and has no awareness of an absolute memory ceiling.
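The interaction between GOGC and GOMEMLIMIT can be pictured with a deliberately simplified model. The function below is an assumption for illustration, not the Go runtime's actual pacer, which also accounts for GC roots and non-heap runtime memory.

```python
from typing import Optional


def next_gc_heap_target_mb(live_heap_mb: float, gogc: int = 100,
                           go_mem_limit_mb: Optional[float] = None) -> float:
    """Simplified heap size (MB) at which the next GC cycle would start."""
    # GOGC=100 lets the heap grow by 100% of the live heap before collecting.
    target = live_heap_mb * (1 + gogc / 100)
    # With a memory limit set, the GC starts earlier instead of exceeding it.
    if go_mem_limit_mb is not None:
        target = min(target, go_mem_limit_mb)
    return target


print(next_gc_heap_target_mb(4000))                        # GOGC only
print(next_gc_heap_target_mb(4000, go_mem_limit_mb=6000))  # capped by the limit
```

With a 4,000 MB live heap, GOGC alone would let the heap reach 8,000 MB before collecting. In this model a 6,000 MB limit pulls the trigger point down to 6,000 MB.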

For GOMEMLIMIT to work correctly in a containerized environment, you must reserve enough headroom for all the non-Go memory that competes for the container's memory limit.

Non-Go memory budget

Sum all memory that lives outside the Go heap:

Non-Go Memory =
database-cache # Pebble block cache (CGO)
+ (database-cache / 2) # Pebble memtables, max (CGO)
+ trie-clean-cache # fastcache (mmap)
+ snapshot-cache # fastcache (mmap)
+ stylus-lru-cache-capacity # Stylus WASM (Rust)
+ malloc arena overhead # glibc arenas
+ ~300 MB # Thread stacks (varies by workload)

With MALLOC_ARENA_MAX=2, arena overhead is ~128 MB. Without it, arena overhead can grow to several gigabytes depending on host CPU count. See glibc malloc arenas above.

Formula

GOMEMLIMIT = Container_Memory_Limit - Non_Go_Memory - Safety_Margin

You should use a safety margin of 300–500 MB to account for allocator overhead, transient allocations, and kernel page cache.

Example: 16 GB container with defaults

| Component | Size | Source |
| --- | --- | --- |
| Pebble block cache | 2,048 MB | database-cache (CGO) |
| Pebble memtables (max) | 1,024 MB | database-cache / 2 (CGO) |
| Trie-clean cache | 600 MB | trie-clean-cache (fastcache) |
| Snapshot cache | 400 MB | snapshot-cache (fastcache) |
| Stylus WASM cache | 256 MB | stylus-lru-cache-capacity (Rust) |
| Malloc arenas | 128 MB | MALLOC_ARENA_MAX=2 |
| Thread stacks | 300 MB* | ~2 MB per thread |
| Total non-Go | 4,756 MB | |

*Thread stack usage depends on the number of active threads, which varies by workload.

GOMEMLIMIT = 16,384 MB - 4,756 MB - 400 MB (safety margin) = 11,228 MB ≈ 11 GB

caution

If GOMEMLIMIT is set too high (not accounting for non-Go memory), the Go garbage collector defers collection, expecting more room than actually exists. The OS then OOM-kills the process when total RSS (Go heap plus all non-Go allocations) exceeds the container limit.
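The budget and formula from this section can be turned into a small calculator. This is a sketch under stated assumptions: the class and function names are hypothetical, the defaults mirror the cache reference table, and MB is treated as MiB (16 GB = 16,384 MB), matching the worked example. GOMEMLIMIT itself accepts a byte count with an optional suffix such as MiB or GiB.

```python
from dataclasses import dataclass


@dataclass
class CacheConfig:
    """Cache sizes in MB; defaults mirror the cache configuration reference."""
    database_cache_mb: int = 2048
    trie_clean_cache_mb: int = 600
    snapshot_cache_mb: int = 400
    stylus_lru_cache_mb: int = 256


def non_go_memory_mb(cfg: CacheConfig, arena_overhead_mb: int = 128,
                     thread_stacks_mb: int = 300) -> int:
    """Sum of all memory that lives outside the Go heap."""
    return (
        cfg.database_cache_mb            # Pebble block cache (CGO)
        + cfg.database_cache_mb // 2     # Pebble memtables, worst case (CGO)
        + cfg.trie_clean_cache_mb        # fastcache (mmap)
        + cfg.snapshot_cache_mb          # fastcache (mmap)
        + cfg.stylus_lru_cache_mb        # Stylus WASM cache (Rust)
        + arena_overhead_mb              # glibc arenas, 128 MB with MALLOC_ARENA_MAX=2
        + thread_stacks_mb               # native thread stacks, workload dependent
    )


def gomemlimit_mb(container_limit_mb: int, cfg: CacheConfig,
                  safety_margin_mb: int = 400, arena_overhead_mb: int = 128,
                  thread_stacks_mb: int = 300) -> int:
    """Container limit minus non-Go memory minus the safety margin."""
    limit = (container_limit_mb
             - non_go_memory_mb(cfg, arena_overhead_mb, thread_stacks_mb)
             - safety_margin_mb)
    if limit <= 0:
        raise ValueError("caches and overhead leave no room for the Go heap")
    return limit


limit = gomemlimit_mb(16 * 1024, CacheConfig())
print(f"GOMEMLIMIT={limit}MiB")  # prints GOMEMLIMIT=11228MiB
```

Rerun the calculation whenever a cache size or the container limit changes. Passing a larger arena_overhead_mb shows how quickly the Go budget shrinks when MALLOC_ARENA_MAX is left unset.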

Runtime memory protection

Even with well-sized caches and tuned GOMEMLIMIT, nodes can face OOM due to workload spikes or unexpected memory pressure. Nitro provides two memory protection mechanisms. These use Linux cgroups to monitor real-time container memory usage and intervene before the kernel's OOM killer takes action.

Both mechanisms read the container's cgroup memory files (v1 or v2). They compute memory usage while excluding page cache, which the kernel can reclaim, and compare the result against the container's memory limit minus a configurable free-memory threshold:

effective_usage = cgroup_memory_usage - (active_file_cache + inactive_file_cache)
threshold = cgroup_memory_limit - configured_free_limit
exceeded = effective_usage >= threshold

Subtracting active and inactive file cache from usage avoids false positives. The check excludes memory that the kernel can reclaim, which does not count as actual consumption.
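The check above can be expressed as a short sketch. This is not Nitro's implementation (which is written in Go); the function names are hypothetical. On cgroup v2 the inputs would typically come from the memory.current, memory.max, and memory.stat files, whose `key value` line format the parser below assumes.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse 'key value' lines, as found in a cgroup memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def memory_limit_exceeded(usage_bytes: int, limit_bytes: int,
                          stat: Dict[str, int], free_limit_bytes: int) -> bool:
    """Apply the effective_usage >= threshold rule described above."""
    # File-backed page cache is reclaimable, so it is not counted as usage.
    file_cache = stat.get("active_file", 0) + stat.get("inactive_file", 0)
    effective_usage = usage_bytes - file_cache
    threshold = limit_bytes - free_limit_bytes
    return effective_usage >= threshold


GIB = 1024 ** 3
stat = parse_memory_stat(f"active_file {GIB}\ninactive_file {GIB}\n")
# 15 GiB used, of which 2 GiB is page cache, against a 16 GiB limit and 1 GiB free limit.
print(memory_limit_exceeded(15 * GIB, 16 * GIB, stat, 1 * GIB))
```

In this example the raw usage of 15 GiB would hit the 15 GiB threshold, but subtracting 2 GiB of file cache leaves 13 GiB of effective usage, so the limit is not considered exceeded.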

note

Both features require Linux cgroups (v1 or v2). They only work inside containers or other environments with cgroup memory limits. On bare-metal hosts without cgroup limits, these features are unavailable.

RPC throttling

This setting protects the node by rejecting incoming RPC requests with an HTTP 429 (Too Many Requests) status code when free memory drops below the configured threshold. It acts as a back-pressure mechanism, preventing new RPC work from pushing the node over its memory limit.

| Parameter | Default | Description |
| --- | --- | --- |
| node.resource-mgmt.mem-free-limit | "" (disabled) | Minimum free memory required to accept RPC requests. Accepts values with suffixes: B, K/KB, M/MB, G/GB, T/TB. |

When enabled, Nitro wraps the HTTP server with a middleware that checks free memory before every RPC call. If the limit is exceeded, the request is rejected immediately without being processed.

Example configuration:

--node.resource-mgmt.mem-free-limit=1GB

Nitro emits metrics under the arb/rpc/limitcheck/ and arb/memory/ namespaces. These metrics track accepted and rejected requests, as well as current memory usage. At startup, Nitro logs whether cgroup-based throttling is enabled, and it logs errors if memory checks fail at runtime.
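Because throttled requests are rejected with HTTP 429 rather than queued, RPC clients may want to retry with a delay instead of failing outright. The sketch below shows one possible client-side pattern. The `send` callable, the function name, and the backoff parameters are all assumptions for illustration and are not part of Nitro.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, bytes]],
                      max_attempts: int = 5,
                      base_delay_s: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, bytes]:
    """Call send() until it returns a status other than 429 or attempts run out.

    send() performs one RPC request and returns (http_status, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.5 s, 1 s, 2 s, ... gives the node time to recover.
            sleep(base_delay_s * (2 ** attempt))
    return status, body  # still throttled after the final attempt
```

Keeping the retry count small matters here: aggressive retries against a node under memory pressure only add to the load that triggered the throttling.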

Block validator throttling

This setting protects the node by pausing block validation when free memory drops below the configured threshold. Block validation is memory-intensive, so pausing it under memory pressure prevents the validation workload from triggering an OOM kill.

| Parameter | Default | Description |
| --- | --- | --- |
| node.block-validator.memory-free-limit | "default" (1 GB) | Minimum free memory required to continue block validation. Set to "" (empty string) to disable. Accepts the same suffixes as the RPC setting. |

This setting is enabled by default with a 1 GB threshold. When free memory drops below this amount, the block validator pauses recording new blocks and halts sending pending validations. Validation resumes once memory usage goes back below the threshold.

Example configuration:

To increase the threshold to 2 GB:

--node.block-validator.memory-free-limit=2GB

To disable the protection:

--node.block-validator.memory-free-limit=""

Nitro exposes the arb/validator/memory/limit_exceeded gauge metric (1 when paused, 0 otherwise) and logs error-level messages when validation is paused or when memory checks fail. Alert on this metric to detect when validation is paused due to memory pressure. Sustained pauses may indicate that the node needs more memory or that cache sizes should be reduced.

Tuning recommendations

  1. Set MALLOC_ARENA_MAX=2: This is the single most impactful change for containerized nodes. Without it, glibc can waste gigabytes on arena overhead, causing RSS to drift upward over days. Set this environment variable on every Nitro container.

  2. Start from the formula: Calculate GOMEMLIMIT using the formula above with your actual cache configuration values. Do not set it to the container memory limit.

  3. Monitor RSS, not just Go heap: Set container memory alerts based on actual RSS (container_memory_rss in Prometheus / cAdvisor), not Go-reported memory.

  4. All caches are bounded: Unlike memory leaks, all non-Go memory in Nitro is bounded by configuration. With MALLOC_ARENA_MAX set, if RSS is stable and predictable, the node is behaving correctly. The memory is simply allocated outside Go's visibility.

  5. Enable RPC throttling on public-facing nodes: Set node.resource-mgmt.mem-free-limit (e.g., 1GB) on nodes that serve external RPC traffic. This prevents request surges from causing the node to go OOM. Monitor arb/rpc/limitcheck/failure to track how often throttling activates.

  6. Monitor block validator memory pauses: The block validator's memory protection is on by default. Alert on arb/validator/memory/limit_exceeded == 1 to detect when validation is paused due to memory pressure.