Skip to content

feat(libstore): curl-based s3 #13752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

lovesegfault
Copy link
Member

Motivation

#13084

Context

Early days, just making a draft for visibility and CI


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@xokdvium
Copy link
Contributor

Would it be conceivable to just outright replace aws-sdk-cpp with this? I don't think we'd want to support it if we land this eventually.

@lovesegfault
Copy link
Member Author

I was unsure about just removing the existing implementation outright, but I can do that if folks are OK with it?

@xokdvium
Copy link
Contributor

but I can do that if folks are OK with it?

Can't speak for everyone, but I feel ripping out aws-sdk-cpp would be very welcome if the new thing is more lightweight and reuses the curl infrastructure with feature parity. That would definitely be better than supporting both implementations.

@lovesegfault
Copy link
Member Author

@xokdvium I'm going to get this in a fully working state, and after I'm happy with it I'll tack on a final commit removing the old impl, so if there's tension we can discuss that point separately, or I can make it a separate PR, etc.

I'm with you that if this works well, we're better off just tossing the old impl, though that would mean folks building with old curl would lose the ability to talk to s3, but idk if we care to support people building non-standard configurations like that.

@github-actions github-actions bot added the store Issues and pull requests concerning the Nix store label Aug 14, 2025
@lovesegfault lovesegfault force-pushed the curl-based-s3-v2 branch 2 times, most recently from 2030275 to 86873ed Compare August 14, 2025 23:40
@Mic92
Copy link
Member

Mic92 commented Aug 15, 2025

I want to definitely test this with our hydra staging instance to ensure it still can upload stuff to s3 buckets before merging.

@lovesegfault
Copy link
Member Author

AFAICT this is ready for (very) preliminary testing; at least I was able to pull and push to an S3 store and auth just worked, like magic :)

@lovesegfault
Copy link
Member Author

Nevermind, I messed something up, buckets outside of us-east-1 don't work :P

@lovesegfault lovesegfault force-pushed the curl-based-s3-v2 branch 3 times, most recently from aefcffd to 3433232 Compare August 15, 2025 20:33
@lovesegfault lovesegfault force-pushed the curl-based-s3-v2 branch 3 times, most recently from 2dd60ad to 80e3327 Compare August 17, 2025 04:17
@lovesegfault lovesegfault marked this pull request as ready for review August 17, 2025 04:37
@lovesegfault lovesegfault requested review from Mic92 and xokdvium August 17, 2025 04:37
Copy link
Contributor

@xokdvium xokdvium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some preliminary comments

rehank0678

This comment was marked as spam.

rehank0678

This comment was marked as spam.

@Ericson2314
Copy link
Member

#13803 This might also help clean some thing up.

lovesegfault added a commit to lovesegfault/nix that referenced this pull request Aug 21, 2025
This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations.

- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- S3 URLs are converted to HTTPS URLs with proper authentication headers
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support

Fixes: NixOS#13084, NixOS#12671, NixOS#11748, NixOS#12403, NixOS#5947
PR: NixOS#13752
lovesegfault added a commit to lovesegfault/nix that referenced this pull request Aug 21, 2025
This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations, addressing multiple issues and improving
the overall architecture.

## Key Changes

### 1. Replaced AWS SDK with aws-crt-cpp + curl
- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)

### 2. Consolidated S3 support into HttpBinaryCacheStore
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- Eliminates layer violation (S3 logic no longer in generic FileTransfer)
- S3 URLs are converted to HTTPS URLs with proper authentication headers

### 3. Improved S3 URL handling
- Added ParsedS3URL for structured S3 URL parsing
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Better support for custom endpoints and regions via query parameters

### 4. Fixed thread safety test
- threadSafety_MultipleProviders now actually tests concurrent access
- Spawns multiple threads accessing shared credential providers
- Tests both concurrent credential fetching and provider creation

## Benefits

- **Reduced memory usage**: Eliminates memory buffering issues that caused
  segfaults with large files (>3.5GB)
- **Fixed upload reliability**: Resolves AWS SDK chunking errors
  (InvalidChunkSizeError)
- **Resolved OpenSSL conflicts**: No more S2N engine override issues in
  sandboxed builds
- **Lighter dependencies**: ~500 lines less code, simpler build process
- **Better architecture**: Clean separation between storage and transport layers

## Breaking Changes

- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support

## Migration

No action required for most users. S3 URLs continue to work with the
same syntax. Users directly using S3BinaryCacheStore class should
migrate to standard HTTP binary cache stores with S3 endpoints.

Fixes: NixOS#13084, NixOS#12671, NixOS#11748, NixOS#12403, NixOS#5947
PR: NixOS#13752

Co-authored-by: John Ericson <git@johnericson.me>
lovesegfault added a commit to lovesegfault/nix that referenced this pull request Aug 21, 2025
This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations.

- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support

Fixes: NixOS#13084, NixOS#12671, NixOS#11748, NixOS#12403, NixOS#5947
PR: NixOS#13752
@lovesegfault lovesegfault force-pushed the curl-based-s3-v2 branch 2 times, most recently from ea041aa to bfb7b79 Compare August 21, 2025 05:55
@lovesegfault
Copy link
Member Author

It's so stupid that if your PR references itself it causes this horrible rebase spam. Sorry for the noise folks

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2025-08-20-nix-team-meeting-minutes-242-241/68245/1

@lovesegfault
Copy link
Member Author

I'm looking at whether aws-crt-cpp does caching, because we definitely don't want to fetch creds for each request. Should the cache TTL be:

  1. An option in nix.conf
  2. A store param s3://fooo?credential-cache-ttl=10
  3. An envvar NIX_AWS_CREDENTIALS_CACHE_TTL

Thoughts @Ericson2314 @xokdvium?

@arianvp
Copy link
Member

arianvp commented Aug 21, 2025

I do vaguely remember caching was also already sort of broken-ish previously due to us not initializing the AWS SDK globally once but for each nix-daemon connection. It would be great if we can fix that though.

Most AWS SDKs dont have the TTL tweakable. AWS temporary Credentials return an Expiration field and usually the caches automatically set TTL based on that. I expect aws-crt-cpp will probably have the same behaviour. (hopefully).

So ideally this isn't a tweakable parameter at all. The code has enough information from AWS STS API response to make TTL decisions

@arianvp
Copy link
Member

arianvp commented Aug 21, 2025

Caching is indeed already implemented by aws-crt:

https://github.com/awslabs/aws-c-auth/blob/cd9d6afcd42035d49bb2d0d3bef24b9faed57773/source/credentials_provider_cached.c

which is used in the default credential chain:

https://github.com/awslabs/aws-c-auth/blob/5224ea9bf0046eed8cd7035d384a0339f2dcf33a/source/credentials_provider_default_chain.c#L341-L344

which is used here:

https://github.com/awslabs/aws-crt-cpp/blob/main/source/auth/Credentials.cpp#L283

I think this is just "already working"

Just gotta make sure that the CredentialsProvider we set up is shared across requests and ideally also across nix-daemon connections

@lovesegfault lovesegfault force-pushed the curl-based-s3-v2 branch 2 times, most recently from 33fe27d to 102852e Compare August 21, 2025 18:28
@lovesegfault
Copy link
Member Author

I've implemented reusing of the CredentialsProvider but it's causing some kind of a deadlock in nix copy and I can't quite figure out why. I'll try a bit more, but might end up pushing a broken commit for advice.

This commit replaces the AWS C++ SDK with a lighter curl-based approach
for S3 binary cache operations.

- Removed dependency on the heavy aws-cpp-sdk-s3 and aws-cpp-sdk-transfer
- Added lightweight aws-crt-cpp for credential resolution only
- Leverages curl's native AWS SigV4 authentication (requires curl >= 7.75.0)
- S3BinaryCacheStore now delegates to HttpBinaryCacheStore
- Function s3ToHttpsUrl converts ParsedS3URL to ParsedURL
- Multipart uploads are no longer supported (may be reimplemented later)
- Build now requires curl >= 7.75.0 for AWS SigV4 support

Fixes: NixOS#13084, NixOS#12671, NixOS#11748, NixOS#12403, NixOS#5947
@lovesegfault
Copy link
Member Author

I got it :)

@lovesegfault
Copy link
Member Author

@arianvp I think that with the latest changes we now have one credential provider per profile per daemon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation store Issues and pull requests concerning the Nix store with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants