Skip to content

Implement hash_map macro #144070

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

stifskere
Copy link

Implementation of #144032

Implements the hash_map macro under std/src/macros.rs.

@rustbot
Copy link
Collaborator

rustbot commented Jul 17, 2025

r? @thomcc

rustbot has assigned @thomcc.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 17, 2025
@rustbot

This comment has been minimized.

@rustbot rustbot added has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 17, 2025
@stifskere stifskere force-pushed the feat/macros/hash_map branch from 3d020f6 to b5692af Compare July 17, 2025 12:57
@rustbot rustbot removed has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 17, 2025
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@NyxCode
Copy link

NyxCode commented Jul 17, 2025

Wouldn't using one of HashMap::with_capacity(), HashMap::reserve(), FromIterator<(K, V)>, Extend<(K, V)> or From<[(K, V); N]> be better, avoiding unnecessary allocations?

@stifskere
Copy link
Author

stifskere commented Jul 17, 2025

Wouldn't using one of HashMap::with_capacity(), HashMap::reserve(), FromIterator<(K, V)>, Extend<(K, V)> or From<[(K, V); N]> be better, avoiding unnecessary allocations?

I was thinking about HashMap::with_capacity() but I couldn't see how I could convert a repetition into a number.

From... would essentially be the same, just reallocation, so for clarity I just expanded into insert.

I could see that 0 $(+ 1)* could be evaluated constantly by the compiler, thus we could get a size, but I'm not sure if it's the best way.
It could not be the best way, because that expansion doesn't at all include a reference to the repetition declaration, so it can't be. There is no logical way of getting the size with a macro_rules unless with more complexity we make a recursive macro, which would limit the hash_map size to 254 without extra compiler parameters.

I will add a question about the size in the tracking issue.

@NyxCode
Copy link

NyxCode commented Jul 17, 2025

All the options i mentioned should be equivalent to with_capacity as far as I can tell, though I don't know if the optimizer can look through all options equally well (e.g. wrt. stack usage).
Both these should work, for example:

let len = 1 $(+ {stringify!($k); 1})*;
let mut map = HashMap::with_capacity(len);
$( map.insert($k, $v); )*
HashMap::from([$(($k, $v)),*])

@stifskere
Copy link
Author

stifskere commented Jul 18, 2025

All the options i mentioned should be equivalent to with_capacity as far as I can tell, though I don't know if the optimizer can look through all options equally well (e.g. wrt. stack usage). Both these should work, for example:

let len = 1 $(+ {stringify!($k); 1})*;
let mut map = HashMap::with_capacity(len);
$( map.insert($k, $v); )*
HashMap::from([$(($k, $v)),*])

I think I'd rather go for intrinsics, except since HashMap is a wrapper, I don't think there are many. I feel that if someone decides to expand the macro via IDE or cargo expand, they will find a lot of strings and be like "???". Probably open an issue later.

Otherwise, we could generate a constant count probably, the compiler should optimize this, and it should not have runtime cost, but a bit extra of constant cost, which may mean that if there are a lot of invocations the compiler might go a bit slower.

macro_rules! hash_map {
    () => {{
        ::std::collections::HashMap::new()
    }};

    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        let mut map = ::std::collections::HashMap::with_capacity(
            const { [$({ let _ = &$key; () }),*].len() }
        );

        $( map.insert($key, $value); )*
        map
    }}
}
image

@Noratrieb
Copy link
Member

I don't think intrinsics make sense here, this is a library type. I think the repeat insert method makes the most sense as it doesn't have the potential stack usage problem of from. Repeat insert is also how extend is implemented under the hood: https://github.com/rust-lang/hashbrown/blob/670213fb32d208759d0331996894e05604de0c18/src/map.rs#L4562.

with_capacity definitely makes sense though. You can count the elements like this: https://lukaswirth.dev/tlborm/decl-macros/building-blocks/counting.html

@Noratrieb
Copy link
Member

Oh actually you might be able to use the unstable macro_metavar_expr_count for the counting too, I forgot about that :D

@stifskere
Copy link
Author

stifskere commented Jul 18, 2025

I don't think intrinsics make sense here, this is a library type. I think the repeat insert method makes the most sense as it doesn't have the potential stack usage problem of from. Repeat insert is also how extend is implemented under the hood: https://github.com/rust-lang/hashbrown/blob/670213fb32d208759d0331996894e05604de0c18/src/map.rs#L4562.

with_capacity definitely makes sense though. You can count the elements like this: https://lukaswirth.dev/tlborm/decl-macros/building-blocks/counting.html

I would go for the slice length approach, but resolved constantly, which is technically what I'm doing just that instead of making a variable using replace_expr!

So probably

macro_rules! hash_map {
    () => {{
        ::std::collections::HashMap::new()
    }};

    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        let mut map = ::std::collections::HashMap::with_capacity(
            const { [$($crate::hash_map!(@internal count $key)),*].len() }
        );

        $( map.insert($key, $value); )*
        map
    }};

    (@internal count $i:tt) => {()}
}

This would work best if there is some sort of thing inside the standard library so that the last branch is only used internally in an explicit way, which would also filter out the usage inside the second branch.

@Noratrieb
Copy link
Member

Oh actually you might be able to use the unstable macro_metavar_expr_count for the counting too, I forgot about that :D

You should try this, which I only remembered after posting the first message

@stifskere
Copy link
Author

Oh actually you might be able to use the unstable macro_metavar_expr_count for the counting too, I forgot about that :D

You should try this, which I only remembered after posting the first message

The problem with this is that it would delay stabilization a lot, so I'm not sure, I don't think adding unstable to unstable helps

@Noratrieb
Copy link
Member

You're allowed to use unstable features internally with #[allow_internal_unstable()] and listing out the feature names there, on the macro.

@tmke8
Copy link

tmke8 commented Jul 19, 2025

Would it be possible to use HashMap::default() instead of HashMap::new() so that this macro can be used with non-default hashing algorithms as well? (Though this might cause type inference problems...)

EDIT: or rather

HashMap::with_capacity_and_hasher(capacity, Default::default())

for specifying capacity.

@stifskere
Copy link
Author

You're allowed to use unstable features internally with #[allow_internal_unstable()] and listing out the feature names there, on the macro.

Ok, I will check that now

@Noratrieb
Copy link
Member

In this case you're just missing another allow_internal_unstable on your repetition_utils, since allow_internal_unstable isn't transitive:

diff --git a/library/std/src/macros.rs b/library/std/src/macros.rs
index 7055168e0a6..455d0ca9e24 100644
--- a/library/std/src/macros.rs
+++ b/library/std/src/macros.rs
@@ -383,6 +383,7 @@ macro_rules! dbg {
 #[doc(hidden)]
 #[macro_export]
 #[unstable(feature = "hash_map_internals", issue = "none")]
+#[allow_internal_unstable(hash_map_internals)]
 macro_rules! repetition_utils {
     (@count $($tokens:tt),*) => {{
         [$($crate::repetition_utils!(@replace $tokens => ())),*].len()

This makes it work for me.

@rustbot

This comment has been minimized.

@stifskere
Copy link
Author

In this case you're just missing another allow_internal_unstable on your repetition_utils, since allow_internal_unstable isn't transitive:

diff --git a/library/std/src/macros.rs b/library/std/src/macros.rs
index 7055168e0a6..455d0ca9e24 100644
--- a/library/std/src/macros.rs
+++ b/library/std/src/macros.rs
@@ -383,6 +383,7 @@ macro_rules! dbg {
 #[doc(hidden)]
 #[macro_export]
 #[unstable(feature = "hash_map_internals", issue = "none")]
+#[allow_internal_unstable(hash_map_internals)]
 macro_rules! repetition_utils {
     (@count $($tokens:tt),*) => {{
         [$($crate::repetition_utils!(@replace $tokens => ())),*].len()

This makes it work for me.

Done, I added it, let's see if tests run correctly now.

By the way, what's the way you test whether tests fail and whether stability is correct locally? Because I would like to do the same instead of relying on running the workflows. I have read some about the ./x bootstrap, I need to figure out how to configure it locally.

@Noratrieb
Copy link
Member

I can recommend reading this to see how to build and run the compiler with a standard library: https://rustc-dev-guide.rust-lang.org/building/quickstart.html
I did ./x build --stage 1 library to build the compiler and have it linked up as mentioned in the guide to then be able to do rustc +dev mytest.rs to test building the file (where I invoke the macro).

@Noratrieb
Copy link
Member

(somehow github doesn't show it nicely despite me leaving it as a review comment but i added a comment here in case it got lost in the ui: #144070 (comment))

@stifskere
Copy link
Author

(somehow github doesn't show it nicely despite me leaving it as a review comment but i added a comment here in case it got lost in the ui: #144070 (comment))

I believe it didn't show up because it was solved, I didn't see the comment at all, If you could re-review the actual documentation, I changed it a bit.

@Noratrieb
Copy link
Member

I see yeah, I left the comment a few minutes ago so it was with the latest state of your changes

@rustbot

This comment has been minimized.

@rustbot rustbot added has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 1, 2025
@rustbot
Copy link
Collaborator

rustbot commented Aug 1, 2025

Some changes occurred in src/tools/cargo

cc @ehuss

@rustbot

This comment has been minimized.

@stifskere stifskere force-pushed the feat/macros/hash_map branch from effb783 to bb50dc0 Compare August 1, 2025 22:26
@rustbot

This comment has been minimized.

@rustbot rustbot removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. has-merge-commits PR has merge commits, merge with caution. labels Aug 1, 2025
@stifskere stifskere force-pushed the feat/macros/hash_map branch 3 times, most recently from 89f4c4b to c8d57d1 Compare August 1, 2025 22:43
@rust-log-analyzer

This comment has been minimized.

@stifskere stifskere force-pushed the feat/macros/hash_map branch from c8d57d1 to 066023e Compare August 1, 2025 23:33
@Noratrieb
Copy link
Member

Thanks!
@bors r+

@bors
Copy link
Collaborator

bors commented Aug 2, 2025

📌 Commit 066023e has been approved by Noratrieb

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 2, 2025
Copy link
Contributor

@danielhenrymantilla danielhenrymantilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

🤔 I wonder how the implementation would compare to doing:

<
    $crate::collections::HashMap<_, _, $crate::hash::RandomState>
    as
    $crate::convert::From<[(_, _); _]>
>::from([
    $(
        ($key, $value)
    ),+ // empty case still handled as `new()`.
])

Intuitively, doing a batch of .insert()s does require the map to go back to a fully usable-from-the-outside state, whereas a direct batched constructor such as From<[(K, V); N]> does not, so depending on the impl, there might be more room for optimisation with this more direct approach 🤔

$crate::collections::HashMap::new()
}};

( $( $key:expr => $value:expr ),* $(,)? ) => {{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very tiny nit 🙂:

Suggested change
( $( $key:expr => $value:expr ),* $(,)? ) => {{
( $( $key:expr => $value:expr ),+ $(,)? ) => {{

would make the intent clearer, and also reject hash_map!(,)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, I suggest addressing this in a follow-up PR

let mut map = $crate::collections::HashMap::with_capacity(
const { $crate::repetition_utils!(@count $($key),*) }
);
$( map.insert($key, $value); )*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
$( map.insert($key, $value); )*
$( map.insert($key, $value); )+

@Noratrieb
Copy link
Member

I have checked the FromIterator implementations, they are not any more efficient than repeat inserts (but the array impl will blow the stack :D)

@danielhenrymantilla
Copy link
Contributor

Thanks for looking into these 🙏

@stifskere
Copy link
Author

I have checked the FromIterator implementations, they are not any more efficient than repeat inserts (but the array impl will blow the stack :D)

I was now checking about this you said, which feels weird, I thought that without recursiveness this wouldn't have a problem, so should I add another PR to the tracking issue? Wait for a comment period?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants