Table of Contents

Performance Tuning

Connection pool sizing

The default MaxConnectionsPerServer = 100 suits most workloads. Tune based on your observed polar.requests.inflight gauge:

  • If inflight regularly exceeds 80% of MaxConnectionsPerServer, increase the limit
  • If inflight is consistently low, reduce to reclaim OS socket resources
{
  "PolarSharp": {
    "Connection": {
      "MaxConnectionsPerServer": 200
    }
  }
}

DNS rotation (PooledConnectionLifetime)

Cloud load balancers rotate backend IPs continuously. Without PooledConnectionLifetime, HTTP/1.1 connections stick to a stale IP for hours.

The default 15-minute lifetime forces TCP reconnection at that interval, ensuring requests reach the current backend:

{
  "PolarSharp": {
    "Connection": {
      "PooledConnectionLifetimeMinutes": 15
    }
  }
}

Lower this value (e.g., 5 min) if Polar's LB rotates more aggressively. Do not set below 1 minute — frequent reconnections waste TLS handshake CPU.

HTTP/2 multiplexing

EnableHttp2: true (default) allows one TCP connection to carry hundreds of concurrent HttpClient requests as independent streams. For burst traffic, this reduces connection pool usage by 10× compared to HTTP/1.1.

EnableMultipleHttp2Connections: true (default) allows additional TCP connections if the single HTTP/2 connection is saturated — combines multiplexing and parallelism.

HTTP/3 (QUIC) — opt-in

HTTP/3 over QUIC eliminates TCP head-of-line blocking and reduces connection setup time. Enable experimentally:

{
  "PolarSharp": {
    "Connection": {
      "EnableHttp3": true
    }
  }
}

Measure before committing — QUIC benefits depend on network conditions and Polar's server-side support.

Hedging for latency-sensitive reads

P99 latency to Polar can be 10× P50 (tail latency). Hedging sends a duplicate GET request after a short delay and uses whichever response arrives first, cutting P99 by 60–80% at the cost of ~5% extra request volume:

{
  "PolarSharp": {
    "Resilience": {
      "HedgeAfterMs": 200,
      "HedgeMaxAttempts": 2
    }
  }
}

Hedging is applied only to GET and HEAD requests — never to mutating verbs.

Per-tenant bulkhead

Without PolarSharp.MultiTenant, all requests share one connection pool and one circuit breaker. With per-tenant isolation, each tenant gets its own resources:

  • One tenant's circuit-breaker state never affects another tenant's requests
  • One tenant's high throughput doesn't starve others' connection slots

polar.requests.inflight gauge

Monitor this UpDownCounter<long> (tagged by polar.tenant_id and polar.resource) to detect saturation before it becomes a problem:

polar.requests.inflight{polar.tenant_id="acme", polar.resource="orders"} = 87

If this approaches MaxConnectionsPerServer, requests will start queuing behind the pool and latency will increase. Alert at 80% of the configured limit.

Channel capacity tuning

Toast notification channel (IPolarToastChannel) uses BoundedChannelFullMode.DropOldest. Tune ChannelCapacity based on your UI's read speed:

  • Blazor Server with persistent circuit: 50–100 (reads in real time)
  • SignalR broadcast with occasional disconnections: 200–500
  • SSE with frequent client reconnections: 500

Monitor polar.channel.depth{name="toast"} and alert above 80% of capacity.

Startup JIT warmup

The first Polar API call in a JIT-compiled app is slow because hot paths haven't been compiled yet. Enable optional warmup:

{
  "PolarSharp": {
    "WarmupOnStartup": true
  }
}

Adds ~100ms to startup; first user-facing request feels as fast as steady state. No-op in Native AOT builds.

Native AOT

Native AOT (PublishAot=true) eliminates JIT startup latency entirely. PolarSharp is fully AOT-compatible — all code paths are reflection-free. CI verifies this with zero-warning AOT publish on every PR.

BenchmarkDotNet results

Run benchmarks locally:

dotnet run --project tests/PolarSharp.Benchmarks -c Release

Key targets:

Benchmark Target
Per-call overhead vs raw HttpClient < 5 ms (P50)
Webhook HMAC verification (50 KB) < 2 ms (P99)
Multi-tenant client lookup (cache hit) < 100 ns (P99)
100 tenants × 100 parallel calls Linear scaling