LaurenceJJones
Contributor

Very experimental. Tested under heavy load; the current code works, but it could be DRYer.

What this adds

  • Adaptive autoscaling for parser, buckets, and outputs stages (feature-flagged)
  • Buffered hot-path channels and small buffer per-bucket
  • Backoff in bucket pour to avoid busy-spin
  • Demoted noisy “stuck for X sending” log

How to enable

  • Enable autoscaling via env or feature.yaml:
    • Env:
      • CROWDSEC_FEATURE_PARSERS_AUTOSCALE=true
      • CROWDSEC_FEATURE_BUCKETS_AUTOSCALE=true
      • CROWDSEC_FEATURE_OUTPUTS_AUTOSCALE=true
    • feature.yaml:
      • parsers_autoscale
      • buckets_autoscale
      • outputs_autoscale

How to test

  • Start the agent normally, then generate a burst of logs.
  • Observe scale events in logs:
    • “autoscale(parser): scale up to N …”
    • “autoscale(buckets): scale up to N …”
    • “autoscale(outputs): scale up to N …”
  • Let the load subside; parser and bucket workers should scale down gradually (never below their initial count). Output workers stop immediately on a downscale request.
  • Verify no warning spam about “stuck for … sending event …”.
  • Check that overflow events still appear and alerts get pushed.
  • Optionally monitor CPU: under sustained load you should see reduced busy-waiting compared to before.
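
The scale-up and scale-down behaviour described in the steps above can be sketched as a small worker pool with a floor. This is a simplified illustration, not the StagePool code from this PR: the `pool` type, its methods, and the `maxWorkers` ceiling are all assumptions, and the idle-based trigger that decides when to downscale is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical, simplified stand-in for the PR's StagePool.
type pool struct {
	mu         sync.Mutex
	wg         sync.WaitGroup
	minWorkers int             // floor: the initial worker count
	maxWorkers int             // ceiling (an assumption for this sketch)
	stops      []chan struct{} // one stop channel per running worker
	work       func(stop <-chan struct{})
}

// newPool starts minWorkers workers, each running work until stopped.
func newPool(minWorkers, maxWorkers int, work func(stop <-chan struct{})) *pool {
	p := &pool{minWorkers: minWorkers, maxWorkers: maxWorkers, work: work}
	for i := 0; i < minWorkers; i++ {
		p.ScaleUp()
	}
	return p
}

// ScaleUp starts one more worker unless the ceiling is reached.
func (p *pool) ScaleUp() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.stops) >= p.maxWorkers {
		return false
	}
	stop := make(chan struct{})
	p.stops = append(p.stops, stop)
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		p.work(stop)
	}()
	return true
}

// ScaleDown stops the most recently started worker, never going below the floor.
func (p *pool) ScaleDown() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.stops) <= p.minWorkers {
		return false
	}
	last := len(p.stops) - 1
	close(p.stops[last])
	p.stops = p.stops[:last]
	return true
}

// Size returns the number of workers that have not been asked to stop.
func (p *pool) Size() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return len(p.stops)
}

// Close stops every worker and waits for them to return.
func (p *pool) Close() {
	p.mu.Lock()
	for _, s := range p.stops {
		close(s)
	}
	p.stops = nil
	p.mu.Unlock()
	p.wg.Wait()
}

func main() {
	// Each worker simply blocks until it is told to stop.
	p := newPool(2, 4, func(stop <-chan struct{}) { <-stop })
	defer p.Close()

	p.ScaleUp()
	fmt.Println("after scale up:", p.Size())
	p.ScaleDown()
	fmt.Println("after scale down:", p.Size())
	fmt.Println("scale down below floor allowed:", p.ScaleDown())
}
```

Closing a per-worker stop channel models the "immediate stop" case; a gradual, idle-based downscale would call `ScaleDown` only after a worker has seen no input for some interval.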

Changes (high-level)

  • Buffered channels:
    • inputLineChan, inputEventChan sized to downstream workers.
    • Overflow channel (LoadBuckets) sized to output workers.
    • Per-bucket In channel now buffered (4).
  • Autoscaling scaffold:
    • New StagePool with scale-up/down logic and min floor.
    • Parser: controlled by parsers_autoscale, idle-based downscale.
    • Buckets: controlled by buckets_autoscale, idle-based downscale.
    • Outputs: controlled by outputs_autoscale, immediate stop on downscale.
  • Pour path:
    • Exponential backoff on full bucket.In to avoid CPU spin.
    • Suppressed/demoted “stuck for X sending …” log.
  • Feature flags:
    • Added parsers_autoscale, buckets_autoscale, outputs_autoscale.


@LaurenceJJones: There is no 'kind' label on this PR. You need a 'kind' label to generate the release automatically.

  • /kind feature
  • /kind enhancement
  • /kind refactoring
  • /kind fix
  • /kind chore
  • /kind dependencies

I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository.


@LaurenceJJones: There are no area labels on this PR. You can add as many areas as you see fit.

  • /area agent
  • /area local-api
  • /area cscli
  • /area appsec
  • /area security
  • /area configuration

LaurenceJJones marked this pull request as draft on August 14, 2025, 16:39

codecov bot commented Aug 14, 2025

Codecov Report

❌ Patch coverage is 37.66667% with 187 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.40%. Comparing base (be436d8) to head (7f7ddb2).

Files with missing lines          Patch %   Lines
cmd/crowdsec/stage_pool.go        26.88%    65 Missing and 3 partials ⚠️
cmd/crowdsec/parse.go             16.21%    28 Missing and 3 partials ⚠️
cmd/crowdsec/pour.go              15.15%    25 Missing and 3 partials ⚠️
cmd/crowdsec/crowdsec.go          66.66%    16 Missing and 11 partials ⚠️
cmd/crowdsec/serve.go             40.00%    10 Missing and 2 partials ⚠️
pkg/fflag/crowdsec.go             25.00%    6 Missing and 3 partials ⚠️
pkg/leakybucket/manager_run.go    35.71%    9 Missing ⚠️
cmd/crowdsec/output.go            25.00%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3809      +/-   ##
==========================================
- Coverage   61.61%   61.40%   -0.22%     
==========================================
  Files         405      406       +1     
  Lines       41569    41832     +263     
==========================================
+ Hits        25614    25685      +71     
- Misses      13847    14013     +166     
- Partials     2108     2134      +26     
Flag            Coverage Δ
bats            45.43% <36.66%> (-0.08%) ⬇️
unit-linux      34.27% <3.66%> (-0.22%) ⬇️
unit-windows    24.19% <3.66%> (-0.17%) ⬇️


@LaurenceJJones
Contributor Author

/kind enhancement
/area agent


Labels

area/agent kind/enhancement New feature or request
