Skip to content

Do not import boto3 when using PureS3Path #164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
iamthebot opened this issue Apr 1, 2024 · 12 comments
Closed

Do not import boto3 when using PureS3Path #164

iamthebot opened this issue Apr 1, 2024 · 12 comments

Comments

@iamthebot
Copy link

This is different from #122 -- basically, we're using PureS3Path in a CLI. Currently, as written, importing s3path also imports boto3, does some initialization, etc. This is relatively slow which impacts the CLI responsiveness. We've made the import lazy as a workaround but this means not being able to eg; use PureS3Path for type hints.

Rather than making boto3 an optional dependency, can we make the boto3 import/initialization not occur for PureS3Path?

@liormizr
Copy link
Owner

liormizr commented Apr 2, 2024

Hi @iamthebot
So you are fine with boto3 as a requirement/dependency how ever you don't want the initialization
Just for curiosity, how much time it takes?

In any case I'll think when kind of lazy can we do, maybe we can solve issue #122 at the same time
Two questions:

  1. do smart-open import initialization boto3 as well?
  2. can we do the change only for python version +3.12?

@iamthebot
Copy link
Author

iamthebot commented Apr 2, 2024

@liormizr yeah, boto3 is totally fine as a requirement/dependency. That's totally reasonable. I guess if you wanted to address #122 technically boto3 could be an optional requirement (and if it's not available only PureS3Path is usable).

What would be nice to do is delay boto3 import/initialization until it's needed for eg; S3Path. This makes the import of this package very fast and we only pay the initialization cost when it's needed. This way merely importing (but not instantiating) aPureS3Path or S3Path are both fast.

I believe we can do this only for Python 3.12 onwards. I don't think this needs to be backported, tbh.

@liormizr
Copy link
Owner

liormizr commented Apr 2, 2024

@iamthebot how much time boto3 adds?

@iamthebot
Copy link
Author

@liormizr that initialization is variable but between 100-300ms last I tested it. P99 of 500ms but that's fairly rare.

liormizr added a commit that referenced this issue Apr 2, 2024
@liormizr
Copy link
Owner

liormizr commented Apr 2, 2024

@iamthebot I learned something new 😊
In my env I got now 90ms
with boto3 ~100ms
and without ~15ms
In average

BTW smart-open also imported boto3 so it need's to be lazy as well.

PR is ready
I'll try to push a new version this week

You are more then welcome to review

@iamthebot
Copy link
Author

Thank you @liormizr! Happy to test it on Friday (pretty busy today and tomorrow) if that's not too late.

@iamthebot
Copy link
Author

This LGTM!

liormizr added a commit that referenced this issue Apr 11, 2024
@liormizr
Copy link
Owner

Version 0.5.3 deployed

@iamthebot
Copy link
Author

@liormizr can you also merge this PR to release the conda-forge package? Thanks!

@liormizr
Copy link
Owner

liormizr commented Apr 11, 2024

@iamthebot we found an issue with our laze import
new version will be deployed today
PR #167
Issue: #160 (See last comments)

@liormizr liormizr reopened this Apr 11, 2024
@liormizr
Copy link
Owner

liormizr commented Apr 11, 2024

0.5.5 was deployed with the fix.
@iamthebot I'll deploy to conda-forge package version 0.5.5

@liormizr
Copy link
Owner

@iamthebot conda-forge package was deployed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants