
Conversation

@alex60217101990
Contributor

Addresses #8240 (part 1 of 3, as suggested by @anderseknert)

What

Replaces closure allocations in evalTree.enumerate() with method values to reduce memory overhead during policy evaluation over large datasets.

Applied to Array, Object, Set, and virtual document enumeration.

For Set specifically, also changed from doc.Iter(callback) to doc.Slice() to eliminate the iterator closure as well.
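
A minimal sketch of the shape of the change (the type and function names here are illustrative stand-ins, not the actual evalTree code):

package main

import "fmt"

type eval struct{ hits int }

// handle is the method whose method value replaces the inline closure.
func (e *eval) handle(v int) error {
    e.hits++
    return nil
}

// Before: a fresh closure is allocated for the callback on every call.
func enumerateBefore(e *eval, doc []int) error {
    cb := func(v int) error { return e.handle(v) } // closure capturing e
    for _, v := range doc {
        if err := cb(v); err != nil {
            return err
        }
    }
    return nil
}

// After: a method value. It captures only the receiver, so the compiler
// can often keep it off the heap entirely.
func enumerateAfter(e *eval, doc []int) error {
    cb := e.handle // method value, no func literal per call
    for _, v := range doc {
        if err := cb(v); err != nil {
            return err
        }
    }
    return nil
}

func main() {
    e := &eval{}
    _ = enumerateBefore(e, []int{1, 2, 3})
    _ = enumerateAfter(e, []int{4, 5, 6})
    fmt.Println(e.hits) // 6
}

The Set change is the same idea one level down: ranging over doc.Slice() directly removes the callback that doc.Iter(...) would otherwise invoke through a func value.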

Why

Policies with comprehensions over 10K+ objects allocate closures on every iteration. For example:

premium_users := {user.id |
    some user in data.users
    user.tier == "premium"
}

With 10K users, this creates 10K+ closures in the enumerate hot path.

Benchmark

Created BenchmarkEnumerateComprehensions that exercises set/array comprehensions over 10K nested objects with multiple levels of field access.
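
For orientation, a sketch of the shape such a benchmark can take (the import paths and the simplified flat data layout here are assumptions, not the committed file; the committed benchmark uses nested objects):

package topdown_test

import (
    "context"
    "fmt"
    "testing"

    "github.com/open-policy-agent/opa/v1/rego"
    "github.com/open-policy-agent/opa/v1/storage/inmem"
)

func BenchmarkEnumerateComprehensions(b *testing.B) {
    for _, size := range []int{1000, 5000, 10000} {
        b.Run(fmt.Sprintf("size_%d", size), func(b *testing.B) {
            // Build an in-memory dataset of the requested size.
            users := make([]any, size)
            for i := range users {
                tier := "basic"
                if i%2 == 0 {
                    tier = "premium"
                }
                users[i] = map[string]any{
                    "id":   fmt.Sprintf("u%d", i),
                    "tier": tier,
                }
            }
            store := inmem.NewFromObject(map[string]any{"users": users})

            r := rego.New(
                rego.Query("data.test.premium_users"),
                rego.Module("test.rego", `package test
premium_users := {user.id |
    some user in data.users
    user.tier == "premium"
}`),
                rego.Store(store),
            )
            pq, err := r.PrepareForEval(context.Background())
            if err != nil {
                b.Fatal(err)
            }

            b.ReportAllocs()
            b.ResetTimer()
            for i := 0; i < b.N; i++ {
                if _, err := pq.Eval(context.Background()); err != nil {
                    b.Fatal(err)
                }
            }
        })
    }
}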

Test environment:

  • Intel i7-9750H @ 2.60GHz
  • Go 1.25.5
  • 10,000 objects with 3-5 nesting levels

Results (benchstat):

                                        │   main   │       optimized        │
                                        │   B/op   │   B/op     vs base     │
EnumerateComprehensions/size_10000-12     148.7Mi    143.4Mi   -3.59% (p=0.008)

Memory profiling (go tool pprof -alloc_space) shows the real impact:

evalTree.enumerate allocations:
  main:      160 MB (3.83%)
  optimized:  20 MB (0.52%)
  reduction: -87.5%

Direct enumerate allocations drop by 87.5% (160 MB → 20 MB), eliminating thousands of short-lived closure allocations that were creating GC pressure.
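
For reference, a profile of this shape can be produced along these lines (exact package path assumed):

go test -bench=EnumerateComprehensions -benchmem -memprofile=mem.out ./topdown
go tool pprof -alloc_space -top mem.out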

When this helps

Policies with comprehensions over large in-memory datasets:

  • Authorization checks (users, roles, permissions)
  • Configuration validation with filtering
  • Compliance scanning with set operations

The benefit scales with dataset size and comprehension count.

Notes

  • Method values capture only their receiver, so the compiler can typically avoid the per-iteration heap allocations that fresh closure literals incur (see the check below)
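
One quick way to verify this locally is the compiler's escape analysis output (package path assumed):

go build -gcflags='-m' ./topdown 2>&1 | grep 'func literal escapes'

Closure literals that must live on the heap show up as "func literal escapes to heap"; sites converted to method values should drop out of that list.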

While this isn't the most impactful of the three optimizations in terms of total memory (~3-4% B/op), profiling shows that the real benefit is in allocation behavior. The 87.5% reduction in enumerate allocations (160 MB → 20 MB per query) eliminates thousands of short-lived closures that were causing allocation churn and GC pressure. Under high query load, this adds up: at 1000 QPS, saving ~140 MB per query works out to roughly 140 GB/sec less allocation bandwidth, which means shorter GC pauses and more predictable latency.


Related to the other two PRs mentioned in #8240 (lazyObj optimizations and binding allocations).

@netlify

netlify bot commented Jan 22, 2026

Deploy Preview for openpolicyagent ready!

Name Link
🔨 Latest commit 619f9d8
🔍 Latest deploy log https://app.netlify.com/projects/openpolicyagent/deploys/69728292f82f21000857e150
😎 Deploy Preview https://deploy-preview-8242--openpolicyagent.netlify.app

@anderseknert
Member

Change looks good to me! 👏

Could you rename the benchmark file to align with the naming convention we use for benchmarks (like eval_bench_test.go)? Then it's OK to merge from my side 👍

@anderseknert
Member

Also, I'm curious to know what the impact on ns/op is here. I'd expect some given the substantial impact on B/op. Did you check that?

@alex60217101990
Contributor Author

Before optimization (with closures)

BenchmarkEnumerateComprehensions/size_1000-12      42,252,464 ns/op
BenchmarkEnumerateComprehensions/size_5000-12     224,217,937 ns/op
BenchmarkEnumerateComprehensions/size_10000-12    442,864,124 ns/op
BenchmarkEnumerateRandomAccess-12                 310,545,508 ns/op

After optimization (method values + pointer)

BenchmarkEnumerateComprehensions/size_1000-12      39,614,574 ns/op  (-6.2%)
BenchmarkEnumerateComprehensions/size_5000-12     209,829,146 ns/op  (-6.4%)
BenchmarkEnumerateComprehensions/size_10000-12    420,204,808 ns/op  (-5.1%)
BenchmarkEnumerateRandomAccess-12                 310,780,544 ns/op  (+0.08%)

The optimization not only reduces memory allocations (B/op -3.6%) but also improves execution time by 5-6% on the comprehension benchmarks; the random-access benchmark is essentially unchanged, as expected. The small change in allocs/op (+0.0002%) is within measurement noise.

Also renamed the benchmark file to follow project conventions.

@anderseknert
Member

Thanks, Alex!


Replace closure allocations in evalTree.enumerate with method values
for Set iteration and virtual document traversal. Set enumeration now
uses Slice() instead of Iter(callback), and virtual doc enumeration
uses enumerateNext helper instead of inline closures.

Signed-off-by: alex60217101990 <[email protected]>
Add BenchmarkEnumerateComprehensions to measure memory impact of
closure optimizations in evalTree.enumerate with set/array
comprehensions over large datasets.

Generates 10K nested objects with realistic structure and exercises
multiple comprehension patterns.

Signed-off-by: alex60217101990 <[email protected]>
Reduce evalTree copying by using pointer in enumerateNext structure
and create single instance instead of two. Fields are ordered for
optimal memory alignment (interface 16 bytes, then pointers 8 bytes).

Rename enumerate_benchmark_test.go to enumerate_bench_test.go to
follow project naming conventions.

Signed-off-by: alex60217101990 <[email protected]>
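
As an aside on the field-ordering note in that last commit, a generic illustration of how field order affects struct size in Go (not OPA's actual enumerateNext struct):

package main

import (
    "fmt"
    "unsafe"
)

// Poorly ordered: small fields interleaved with pointer-aligned ones
// force the compiler to insert padding.
type loose struct {
    flag bool  // 1 byte + 7 bytes padding
    p    *int  // 8 bytes
    n    int32 // 4 bytes + 4 bytes trailing padding
}

// Ordered large-to-small: the same fields with no internal padding.
type packed struct {
    p    *int  // 8 bytes
    n    int32 // 4 bytes
    flag bool  // 1 byte + 3 bytes trailing padding
}

func main() {
    fmt.Println(unsafe.Sizeof(loose{}))  // 24 on 64-bit platforms
    fmt.Println(unsafe.Sizeof(packed{})) // 16 on 64-bit platforms
}
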
@anderseknert force-pushed the topdown-eliminate-closures branch from 32e6066 to 619f9d8 (January 22, 2026 20:03)
@anderseknert enabled auto-merge (squash) (January 22, 2026 20:05)
@anderseknert merged commit f7d43c2 into open-policy-agent:main (January 22, 2026)
31 checks passed