Fixing /admin/host/resume ObjectDisposedException #11264

brettsam · 2025-08-25T19:41:03Z

The bug here is that during a request, we register a DI child scope container with the request, then dispose it mid-request. When anything in ASP.NET later tries to get a service, it throws an ObjectDisposedException. It's a race because it's possible that the request returns before the old container is disposed. Despite the exception, everything appears to work because all the hard work has already been done by the time it is thrown: the JobHost is restarted and everything continues on happily, even though we return a 500 to the caller.

I've just pushed up a preliminary PR, but it's not truly complete. It'll work for what we need, but it's quite tricky to get right given how DI works.
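To make the failure mode concrete, here is a minimal sketch, assuming the default Microsoft.Extensions.DependencyInjection container. `RequestService` is a hypothetical type, and the explicit `Dispose()` call stands in for whatever disposes the scope in the host; this is not the host's actual code path.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical service standing in for anything ASP.NET resolves late in a request.
public sealed class RequestService { }

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddScoped<RequestService>();

        using ServiceProvider root = services.BuildServiceProvider();

        // The scope attached to the in-flight request.
        IServiceScope requestScope = root.CreateScope();

        // Assumption: something else disposes the scope while the request is still running.
        requestScope.Dispose();

        try
        {
            // A later resolution on the same request now fails.
            requestScope.ServiceProvider.GetRequiredService<RequestService>();
        }
        catch (ObjectDisposedException ex)
        {
            Console.WriteLine($"Resolution failed: {ex.GetType().Name}");
        }
    }
}
```

In the real host the disposal happens on another thread, so whether the request sees the exception depends on timing.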

Some background:

For in-proc we use DryIoc and completely replace the DI engine to manage child scopes.

We added this code to prevent disposal of the DI container while a child scope was still active:

azure-functions-host/src/WebJobs.Script.WebHost/DependencyInjection/ScopedResolver.cs

Lines 33 to 37 in b079776

    Task.WhenAny(childScopeTasks, Task.Delay(5000))
        .ContinueWith(t =>
        {
            Container.Dispose();
        }, TaskContinuationOptions.ExecuteSynchronously);
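The snippet above can be read as "dispose the container once every child scope has finished, or after five seconds, whichever comes first". A self-contained sketch of that idea follows. `FakeContainer`, `TrackedRoot`, `TrackChildScope`, and `DisposeWhenIdleAsync` are hypothetical names for illustration and are not the DryIoc or host types.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a DI container.
public sealed class FakeContainer : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public sealed class TrackedRoot
{
    private readonly List<Task> _childScopeTasks = new();
    private readonly object _lock = new();

    public FakeContainer Container { get; } = new();

    // The caller completes the returned source when its child scope is disposed.
    public TaskCompletionSource TrackChildScope()
    {
        var tcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
        lock (_lock)
        {
            _childScopeTasks.Add(tcs.Task);
        }
        return tcs;
    }

    // Disposes the container when all tracked scopes finish or the timeout elapses.
    public Task DisposeWhenIdleAsync(TimeSpan timeout)
    {
        Task allChildren;
        lock (_lock)
        {
            allChildren = Task.WhenAll(_childScopeTasks);
        }
        return Task.WhenAny(allChildren, Task.Delay(timeout))
            .ContinueWith(_ => Container.Dispose(), TaskContinuationOptions.ExecuteSynchronously);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var root = new TrackedRoot();
        TaskCompletionSource child = root.TrackChildScope();

        Task disposal = root.DisposeWhenIdleAsync(TimeSpan.FromSeconds(5));
        Console.WriteLine($"Disposed while child scope is active: {root.Container.IsDisposed}");

        child.SetResult(); // the child scope finishes
        await disposal;
        Console.WriteLine($"Disposed after child scope finished: {root.Container.IsDisposed}");
    }
}
```

The timeout is a trade-off: it bounds how long a stuck request can keep the old container alive, at the cost of still disposing underneath a request that runs longer than the delay.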

When we moved to out-of-proc, we removed DryIoc in favor of the default Microsoft.Extensions.DependencyInjection and did not add this "child scope tracking" behavior.
There's not a great way to use the decorator pattern here. There are several ways to bypass the decorator (e.g. with constructor injection) and use the original IServiceProvider/IServiceScopeFactory, because they're hard-coded and cannot be overridden:
- this issue links to other discussions: [DI] Is there a way to override IServiceScopeFactory in the native ServiceProvider? dotnet/runtime#38240
- here's the hard-coding: https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceProvider.cs#L63-L66

What I've got should be good enough for the very specific scenario we need it for, but there still seem to be gaps that may bite us later.

I want to discuss with @fabiocav later whether we should use this same approach, abandon it and just skip registering services for /admin calls, or take some other approach.

Will leave it in Draft for now.

Pull request checklist

IMPORTANT: Currently, changes must be backported to the in-proc branch to be included in Core Tools and non-Flex deployments.

Backporting to the in-proc branch is not required
- Otherwise: Link to backporting PR
My changes do not require documentation changes
- Otherwise: Documentation issue linked to PR
My changes should not be added to the release notes for the next release
- Otherwise: I've added my notes to release_notes.md
My changes do not need to be backported to a previous version
- Otherwise: Backport tracked by issue/PR #issue_or_pr
My changes do not require diagnostic events changes
- Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
I have added all required tests (Unit tests, E2E tests)

jviau · 2025-08-26T15:43:04Z

test/WebJobs.Script.Tests.Integration/WebHostEndToEnd/DrainModeResumeEndToEndTests.cs

+            {
+                // This forces the hosts to be stopped and disposed before a new one starts.
+                // There was a bug hiding here originally, so we'll run all these tests this way.
+                { ConfigurationSectionNames.SequentialJobHostRestart, "true" }


I believe this behavior has some major differences: it skips some logic related to worker process cleanup. Any concern with missing that logic in this test?

brettsam · Aug 26, 2025

This is the only place that I see it called:

azure-functions-host/src/WebJobs.Script.WebHost/WebJobsScriptHostService.cs

Lines 633 to 637 in 1bb6bd8

    if (ShouldEnforceSequentialRestart())
    {
        stopTask = Orphan(previousHost, cancellationToken);
        await stopTask;
        startTask = UnsynchronizedStartHostAsync(activeOperation);

This awaits the call to Orphan() so that we're guaranteed the previous host is disposed before we start another one. Otherwise we fire-and-forget that call.

The bug here is a race, and the test always passes for me locally (and apparently in CI) because the new host is started and the request returns while we're still waiting to dispose the orphaned one.
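As a rough illustration of why the sequential setting exposes the race, the sketch below contrasts the two orderings. `OrphanAsync`, `StartHostAsync`, and `RestartAsync` are hypothetical stand-ins with simulated delays, not the host's real methods.

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    // Hypothetical stand-in for Orphan(): stops and disposes the previous host.
    private static async Task OrphanAsync(string hostName)
    {
        await Task.Delay(200); // simulate a slow shutdown
        Console.WriteLine($"{hostName} disposed");
    }

    // Hypothetical stand-in for starting the replacement host.
    private static Task StartHostAsync(string hostName)
    {
        Console.WriteLine($"{hostName} started");
        return Task.CompletedTask;
    }

    private static async Task RestartAsync(bool sequential)
    {
        Console.WriteLine($"-- sequential: {sequential}");
        Task stopTask = OrphanAsync("old host");

        if (sequential)
        {
            // The old host is guaranteed to be disposed before the new one starts.
            await stopTask;
        }

        await StartHostAsync("new host");

        // Demo only: the fire-and-forget path would not await this at all.
        await stopTask;
    }

    public static async Task Main()
    {
        await RestartAsync(sequential: true);
        await RestartAsync(sequential: false);
    }
}
```

In the sequential run the disposal always lands before the start, so anything still holding the old container fails deterministically. In the fire-and-forget run the start usually wins, which is consistent with the test passing without the setting.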

brettsam · 2025-08-26T19:10:02Z

Added a bunch more context to the description; leaving this in Draft now until I can discuss with @fabiocav. It's possible this was a purposeful omission and he had another plan (or maybe not :-))

brettsam added 3 commits August 25, 2025 12:38

- making sure test fails in CI (8626452)
- fixing build (4bd037f)
- ensuring error message is right in CI (6786165)

jviau reviewed Aug 26, 2025


- preliminary fix; not ready yet. (79b4aff)
- header (76e746f)
