exists() returns True when bucket versioning is enabled and object was deleted #194

Open
dbragdonmw opened this issue Apr 30, 2025 · 4 comments

dbragdonmw commented Apr 30, 2025

It appears that if a bucket has object versioning enabled, calling exists() on an S3Path object for a deleted path returns True. Is this expected behavior?

For reference, I am using Python 3.12.9 and s3path 0.6.1.

@dbragdonmw
Author

Looking into this more, it seems like this function is returning False on a path that is indeed versioned:

def _is_versioned_path(path):
    return hasattr(path, "version_id") and bool(path.version_id)

because the S3Path object in question does not have the version_id attribute. I tried to figure out when the S3Path object would have this attribute, and it does not seem like the PureS3Path constructor is adding version_id as an attribute, so I am not sure where it would be coming from.

@liormizr
Owner

liormizr commented May 5, 2025

@dbragdonmw for buckets with versioning enabled, you have to use the VersionedS3Path class instead of the regular S3Path.
If there is demand, we can think of a way to merge them together.

Does the VersionedS3Path class solve your issue?

You can also see the docs here

@dbragdonmw
Author

Hi @liormizr thanks for getting back to me.

We were initially using s3path with the idea that, if an object is versioned but the latest version has a delete marker, then that object is considered to not exist, according to the S3Path.exists() function. I believe at some point this was the way that the logic was working, but I could be wrong.

Our workflow currently looks something like this:

path = S3Path("s3://some-bucket/path/to/some/key")
if path.exists():  # currently returns True if the object is versioned but its current version is a delete marker
    with path.open('r') as fp:  # code crashes here because the file doesn't actually exist in a readable state
        data = fp.read()

If we were to use VersionedS3Path for this, we'd need to know the latest version of the object beforehand in order to get this functionality working again:

path = VersionedS3Path("s3://some-bucket/path/to/some/key", version_id="?")
...

As far as checking whether a versioned object currently "exists", we don't particularly care about any specific version; we only care whether the current version is a delete marker, so VersionedS3Path wouldn't be exactly what we're looking for here. We really just want to check that the latest version of the object isn't marked as deleted.

If this is not in line with the philosophy of how S3Path is meant to be used, I understand. I just wanted to make sure we weren't using it incorrectly. And yes, I'd appreciate the addition of this feature if other people would like to see this behavior as well.
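As a rough illustration of the check described above ("is the latest version a delete marker?"), the sketch below inspects a dictionary shaped like the response of boto3's list_object_versions call, which reports "Versions" and "DeleteMarkers" entries with an "IsLatest" flag. The function names and the sample listing are made up for illustration, and the sketch assumes the whole listing fits in one response page.

```python
def current_version_is_delete_marker(listing, key):
    """Return True if the newest entry for `key` is a delete marker."""
    return any(
        marker["Key"] == key and marker["IsLatest"]
        for marker in listing.get("DeleteMarkers", [])
    )


def key_currently_exists(listing, key):
    """Return True if the newest entry for `key` is a real object version."""
    return any(
        version["Key"] == key and version["IsLatest"]
        for version in listing.get("Versions", [])
    )


# Hand-written sample data (not real output): one older object version,
# followed by a delete marker that is now the current version.
key = "path/to/some/key"
sample_listing = {
    "Versions": [{"Key": key, "VersionId": "v1", "IsLatest": False}],
    "DeleteMarkers": [{"Key": key, "VersionId": "v2", "IsLatest": True}],
}

print(current_version_is_delete_marker(sample_listing, key))  # True
print(key_currently_exists(sample_listing, key))              # False
```

With a real client, the listing would presumably come from something like boto3.client("s3").list_object_versions(Bucket="some-bucket", Prefix="path/to/some/key"). Because Prefix can also match other keys, the functions compare the key exactly.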

@liormizr
Owner

liormizr commented May 7, 2025

Hi @dbragdonmw,
I understand.
How does boto3 natively behave in this scenario?
If you can show me an example with boto3, we can think about how to integrate it into s3path.
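One possible starting point for such a boto3 example, as a sketch rather than a confirmed answer: to my understanding, a plain head_object call (no VersionId) on a key whose current version is a delete marker fails with a 404 error, the same as for a key that never existed. The helper name object_is_readable is hypothetical. It takes an already constructed boto3 S3 client and relies on the client's exceptions.ClientError attribute.

```python
def object_is_readable(client, bucket, key):
    """Return True if a plain HEAD on the key succeeds, False on a 404.

    `client` is expected to be a boto3 S3 client, for example
    boto3.client("s3"). Any error other than "not found" is re-raised
    so that permission problems are not mistaken for a missing object.
    """
    try:
        client.head_object(Bucket=bucket, Key=key)
    except client.exceptions.ClientError as error:
        code = error.response.get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
    return True


# Example usage (needs boto3 and AWS credentials, so it is left commented out):
#
#   import boto3
#   client = boto3.client("s3")
#   if object_is_readable(client, "some-bucket", "path/to/some/key"):
#       ...
```

This does not distinguish a delete marker from a key that was never written. For the workflow described earlier in the thread that distinction may not matter, since both cases mean the object cannot be opened for reading.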
