Async Fetch Refactor #880
ikreymer
commented
Aug 25, 2025
- separate out reading stream response while browser is waiting (not really async) from actual async loading, this is not handled via fetchResponseBody()
- unify async fetch: first try browser networking for a regular GET, then fall back to a regular fetch()
- load headers and body separately in async fetch, allowing the request to be cancelled after headers
- refactor direct fetch of non-HTML pages: load headers, then handle loading the body and adding the page asynchronously, allowing the worker to continue loading browser-based pages (should allow more parallelization in the future)
- unify WARC writing in preparation for dedup: unified serializeWARC() called for all paths, WARC digest computed, additional checks for payload added for streaming loading
- single AsyncFetcher tries both browser + direct fetch
- separate loadHeaders() and loadBody()
- direct fetch page: try loadHeaders(), queue loadDirectPage() to be done async and finish page
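
The headers/body split described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `TwoPhaseFetcher`, `FetchFn`, and `loadIfNonHtml` are hypothetical names, and the rule "cancel when the response is HTML so the browser can load it instead" is an assumption made for the example. Only `loadHeaders()` and `loadBody()` are names taken from the description.

```typescript
// Hypothetical sketch: loadHeaders()/loadBody() echo the PR description,
// but the bodies here are illustrative, not the project's implementation.

// Injectable fetch-like function so the sketch can run without a network.
type FetchFn = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<Response>;

class TwoPhaseFetcher {
  private url: string;
  private fetchFn: FetchFn;
  private controller = new AbortController();
  private resp: Response | null = null;

  constructor(url: string, fetchFn: FetchFn = (u, init) => fetch(u, init)) {
    this.url = url;
    this.fetchFn = fetchFn;
  }

  // Phase 1: fetch() resolves once status and headers have arrived;
  // the body has not been read yet.
  async loadHeaders(): Promise<{ status: number; contentType: string }> {
    this.resp = await this.fetchFn(this.url, {
      signal: this.controller.signal,
    });
    return {
      status: this.resp.status,
      contentType: this.resp.headers.get("content-type") ?? "",
    };
  }

  // Phase 2: read the whole body. Only valid after loadHeaders().
  async loadBody(): Promise<Uint8Array> {
    if (!this.resp) {
      throw new Error("loadHeaders() must be called before loadBody()");
    }
    return new Uint8Array(await this.resp.arrayBuffer());
  }

  // Abort the request; with a real fetch() this also stops the body download.
  cancel(): void {
    this.controller.abort();
  }
}

// Assumed policy for illustration: keep non-HTML responses, cancel HTML ones.
async function loadIfNonHtml(
  fetcher: TwoPhaseFetcher,
): Promise<Uint8Array | null> {
  const { status, contentType } = await fetcher.loadHeaders();
  if (status !== 200 || contentType.startsWith("text/html")) {
    fetcher.cancel();
    return null;
  }
  return fetcher.loadBody();
}
```

Because the decision is made after the headers arrive, a caller can queue `loadBody()` to run later and move on to other work, which is the kind of decoupling the description mentions for direct-fetch pages.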
// not yet finished
if (data.asyncLoading) {
  return;
}
I assume we'd end up here if the page worker timeout is hit (or the worker crashes) before the page has finished loading async. Is there any tidying up we want to do in that case rather than just returning?
Working very well in testing, haven't noticed any regressions. Thanks for the test updates as well.