Skip to content

Allow the global allocator to use thread-local storage and std::thread::current() #144465

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

orlp
Copy link
Contributor

@orlp orlp commented Jul 25, 2025

Currently the thread-local storage implementation uses the Global allocator if it needs to allocate memory in some places. This effectively means the global allocator can not use thread-local variables. This is a shame as an allocator is precisely one of the locations where you'd really want to use thread-locals. We also see that this lead to hacks such as #116402, where we detect re-entrance and abort.

So I've made the places where I could find allocation happening in the TLS implementation use the System allocator instead. I also applied this change to the storage allocated for a Thread handle so that it may be used care-free in the global allocator as well, for e.g. registering it to a central place or parking primitives.

r? @joboet

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 25, 2025
@orlp orlp changed the title Use System allocator for thread-local storage Use System allocator for thread-local storage and std::thread::Thread Jul 25, 2025
@Mark-Simulacrum Mark-Simulacrum added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Jul 26, 2025
@Mark-Simulacrum
Copy link
Member

I think this merits some libs-api discussion (unless that's already happened? I didn't quickly find it). Landing this IMO implies at least an implicit guarantee (even if not necessarily stable without actual docs) that we don't introduce global allocator usage in thread local code. I think we should have some discussion and either commit to that or not. And we should discuss where we draw the line (e.g., is other runtime-like code in std supposed to do this? For example, environment variables or other bits that need allocation early, potentially even before main starts -- is that OK to call into the allocator for?)

OTOH, if we can make a weaker guarantee here while still serving the purpose, that may also be good to look at? For example, can we guarantee that a pattern like dhat's for ignoring re-entrant alloc/dealloc calls is safe? i.e., that we don't need allocations for non-Drop thread locals? Or if we can only do that for a limited set, perhaps std could define a public thread local for allocators to use?

@orlp
Copy link
Contributor Author

orlp commented Jul 26, 2025

@Mark-Simulacrum

I think this merits some libs-api discussion (unless that's already happened? I didn't quickly find it). Landing this IMO implies at least an implicit guarantee (even if not necessarily stable without actual docs) that we don't introduce global allocator usage in thread local code.

Yes, this should probably be properly discussed.

And we should discuss where we draw the line (e.g., is other runtime-like code in std supposed to do this? For example, environment variables

You make a good point here about environment variables. It is very common to allow configuration and debugging of allocators through them. I'd like to at least guarantee that std::env::var_os does not result in allocation through the global allocator, I will look into that to see how feasible that is.

or other bits that need allocation early, potentially even before main starts -- is that OK to call into the allocator for?)

I think it's good to brainstorm a bit what would be needed for a global allocator, although I don't think an allocator needs all that much from std. Environment variables, thread locals, getting a thread handle comes to mind, and I guess spawning a thread is also rather important for creating background cleanup. I will also investigate that.

OTOH, if we can make a weaker guarantee here while still serving the purpose, that may also be good to look at?

I don't think there's much difference in constraint for the stdlib between guaranteeing this for all thread_local instances as opposed to only non-Drop thread locals, especially on platforms where every single thread-local already does an allocation regardless.

For example, can we guarantee that a pattern like dhat's for ignoring re-entrant alloc/dealloc calls is safe?

Not without putting undue burden on the allocator. dhat doesn't actually change how allocation works, so regardless of re-entrance it calls System.alloc(layout) or System.dealloc(ptr, layout). If it was actually changing allocation method depending on re-entrance it would need to keep track of which pointers were allocated normally, and which were allocated re-entrant, and call the appropriate dealloc function for each. This makes all deallocation significantly slower regardless of re-entrance, in addition to adding extra complexity.

@orlp
Copy link
Contributor Author

orlp commented Jul 26, 2025

I did some investigation into some of the problems of making std::thread::spawn global-allocator-free:

  • You can't give the thread a name because the interface for setting a name inherently takes a String. Perhaps an extra method could be added static_name which takes a &'static str, and the internal field changed to a Cow.

  • The RUST_MIN_STACK environment variable gets parsed to an integer if not explicitly specified. So this means var_os needs to be global-allocator-free. This is something I'd want regardless. This is impossible, so any call would need to explicitly specify the stack size.

  • no_spawn_hooks almost surely has to be set.

  • The Arc containing the Packet needs to be made to use System.

  • The Box containing the main function needs to be made to use System.

The above seems feasible to me, although I also had another thought: I don't know if std::thread::spawn needs to be global-allocator-free. The only use-case I can think of for spawning thread in the global allocator is for spawning cleanup threads, and I think it's perfectly fine if this happens after first init:

if !alloc_is_init() {
    init_alloc();
    // Now the allocator works and may be re-entrant.
    spawn_background_threads();
}

So I'm perfectly happy if we don't make std::thread::spawn global-allocator-free.


While investigating the above I also found the following additional blocker for making std::thread::current() global-allocator-free I didn't realize before: ThreadId::new on platforms which don't have 64-bit atomics uses a Mutex, so Mutex would have to be global-allocation-free (likely unfeasible), or this implementation changed. Which lead me to ask:

It's possible to do this with a fairly simple spinlock, but are there platforms we support that have threads and Mutex but not an atomic we can use for a spinlock?

To answer my own question: std requires the existence of AtomicBool.

@orlp
Copy link
Contributor Author

orlp commented Jul 26, 2025

I've added a commit to use a spinlock for ThreadId if 64-bit atomics are unavailable, and with that I believe std::thread::current() should be safe to call in a global allocator.

I also investigated environment variables and... it's hopeless without a new API. The current API returns owned Strings or OsStrings both of which allocate with the global allocator. Currently if the global allocator wants to read environment variables it'll have to use libc::getenv (and thus bypass Rust's lock). I don't think that's the end of the world.

@orlp orlp changed the title Use System allocator for thread-local storage and std::thread::Thread Allow the global allocator to use thread-local storage and std::thread::current() Jul 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-libs-api-nominated Nominated for discussion during a libs-api team meeting. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants