-
Notifications
You must be signed in to change notification settings - Fork 540
pkg/acquisition/modules/victorialogs: properly handle retry logic for "mode: tail" #3924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
If the original HTTP get for the tail endpoint is successful, but then the connection is lost, no retry is done. I updated the Tail method to also retry in this case.
Upon review, the added code was pretty different in the approach used to keep retrying compared to the approach for the QueryRange method. I updated the method to create a new doTail that has the same style as doQueryRange and updated Tail to use it. This has the following effects: - doTail will keep trying after losing a connection - the retry interval will grow (with an upper limit) and shrink (with a lower limit) as connections are made and broken - the time in the request is updated to avoid overlapping with previous data that was returned (missing in the first fix)
Bringing in changes from master.
Keeping up with changes from origin
The use of the ticker was unnecessary. I updated doTail to use a backoff interval with time.After.
Keeping the fork up to date with the origin repository.
The tail query endpoint does not support start, but start_offset. I updated the doTail method to use this parameter, and calculate the required value each time the query is attempted based on the desired start time for the results returned from the query.
…"since" paramater Simplify code of backoff handling in order to reduce duplication. Also do not sleep when making the first request in order to avoid artificial delay for startup. While at it, implement proper handling of "since" parameter. Previously, it was reset to 0 and ignored in "tail" mode since VL API did not support tailing results from the past. Implemented full support to use "since" value when performing initial tailing and keeping track of last seen log item in order to not miss log lines when retrying the request.
@zekker6: There are no 'kind' label on this PR. You need a 'kind' label to generate the release automatically.
DetailsI am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository. |
@zekker6: There are no area labels on this PR. You can add as many areas as you see fit.
DetailsI am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository. |
/kind fix |
Update parameters description in sync with crowdsecurity/crowdsec#3924
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3924 +/- ##
=======================================
Coverage 61.63% 61.63%
=======================================
Files 407 407
Lines 41864 41887 +23
=======================================
+ Hits 25801 25816 +15
- Misses 13936 13940 +4
- Partials 2127 2131 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- read out channel in order to avoid blocking on client - do not treat context.Cancelled as a fatal error as client propagates it
What this PR does
Fixes: #3653
PR implements proper retry mechanism for VictoriaLogs acquisition when using
mode: tail
. Previously it was unable to recover automatically and required service restart to continue data processing.Additional context
This PR is based on work of @thebondo from #3654.
The difference between base PR is some final polishing to simplify error handling logic and fix support of
since
parameter as it is now possible with newer API versions - 1bb425cDraft of docs update to reflect changes of this PR: crowdsecurity/crowdsec-docs#895