Skip to content

Optionally force invalidation on cache eviction #853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

eggyal
Copy link

@eggyal eggyal commented May 8, 2025

Adds a force_invalidation_on_cache_eviction option (better naming suggestions most welcome!) to the #[salsa::tracked] attribute macro, which ensures that a query so marked is always deemed to have changed if there is no cached result present for it in the database—irrespective of its inputs' validity. Consequently downstream queries will be recomputed.

This is necessary if such downstream queries might hold pointers into a query's cached result. Such pointers would be invalidated by cache eviction, even if the query's inputs are unchanged (and therefore any recomputed result will be unchanged: a situation which typically does not cause recomputation of downstream queries).

My use-case is in the codegen backend of a JIT compiler. Suppose a query returns the (unlinked) object code for a given translation unit. A program may contain many such translation units, whose object code must be linked together through a relocation step: we therefore might have a second query that returns the now-linked object code for a given translation unit. However, such links into other translation units' object code would be invalidated by the eviction of that object code from the cache. We need to ensure that, in such a case, the second query is re-run for the dependent translation unit so that its object code gets correctly linked to the current revision's freshly recalculated object code for that dependency.

This arose from a brief discussion on Zulip. cc @Veykril

Copy link

netlify bot commented May 8, 2025

Deploy Preview for salsa-rs canceled.

Name Link
🔨 Latest commit b5580ad
🔍 Latest deploy log https://app.netlify.com/sites/salsa-rs/deploys/681d224647c69c00080ae5df

Copy link

codspeed-hq bot commented May 8, 2025

CodSpeed Performance Report

Merging #853 will not alter performance

Comparing eggyal:force_invalidation_on_cache_eviction (b5580ad) with master (af69cc1)

Summary

✅ 12 untouched benchmarks

@eggyal eggyal force-pushed the force_invalidation_on_cache_eviction branch from f2566af to c3d3b11 Compare May 8, 2025 11:09
@MichaReiser
Copy link
Contributor

How is it different from report_untracked_read?

@eggyal
Copy link
Author

eggyal commented May 8, 2025

If a cached result is present, it can be reused - whereas report_untracked_read will force recalculation irrespective. Codegen is pretty expensive, so I'd rather reuse the cached results if I can!

@MichaReiser
Copy link
Contributor

I don't think what you have here will work. Salsa might backdate your query even if you return changed, in which case the old memo is gonne, but the dependent queries won't rerun (they now all point into the void).

Would something similar to our AstNodeRef work for you

https://github.com/astral-sh/ruff/blob/c29f24768c5b17b18409994daa7cbaf4458c31e2/crates/ty_python_semantic/src/ast_node_ref.rs#L35-L45

@eggyal
Copy link
Author

eggyal commented May 8, 2025

I don't think that the AstNodeRef approach solves the problem here. It isn't enough for downstream queries to keep upstream object code into which they link alive through shared ownership. Suppose the following occurs:

  1. in revision R1, downstream object D links to upstream object U;
  2. in revision R2, U is evicted from the cache but is otherwise unchanged and a new downstream object E that links to U is created: now D and E link to different copies of U.

In R2, I still need to recompute D to ensure that it links to the same copy of U as does E.

@eggyal
Copy link
Author

eggyal commented May 8, 2025

Aren't queries only backdated if they are unchanged from their previous/cached value? Here there is no cached value.

@eggyal eggyal force-pushed the force_invalidation_on_cache_eviction branch from c3d3b11 to 283980e Compare May 8, 2025 12:05
@MichaReiser
Copy link
Contributor

Have you tried no_eq, which skips backdating (it always considers the value as changed, forcing all dependent queries to rerun)

@eggyal eggyal force-pushed the force_invalidation_on_cache_eviction branch from 283980e to 255ffd3 Compare May 8, 2025 12:20
@MichaReiser
Copy link
Contributor

attribute macro, which ensures that a query so marked is always deemed to have changed if there is no cached result present for it in the database—irrespective of its inputs' validity. Consequently downstream queries will be recomputed.

I don't think I understand this sentence. A query is always considered changed if there was no cached result before.

@eggyal
Copy link
Author

eggyal commented May 8, 2025

Have you tried no_eq, which skips backdating (it always considers the value as changed, forcing all dependent queries to rerun)

AIUI, that impacts whether a newly computed value is considered equal to a cached one—but I am concerned with the situation where there is no cached value. And while this would (as you note) skip backdating, I again don't think backdating is a concern for me, because AIUI it only occurs when there is a cached value to be compared against the newly computed one.


I don't think I understand this sentence. A query is always considered changed if there was no cached result before.

I had understood from the Zulip discussion (linked in the first comment above) that that isn't the case: a query whose inputs are all unchanged would itself be deemed unchanged even if its previous result is no longer in the cache. Indeed, I think this would also be the case if the cached result of a high durability query has been evicted but only low durability inputs have changed: in that case, the query will (shallow) verify as unchanged despite the absence of its cached result. (It's this latter case that my most recent pushes were, erroneously, attempting to address—I'll take a closer look and add some specific tests for this shortly).

@eggyal eggyal force-pushed the force_invalidation_on_cache_eviction branch from 255ffd3 to 4047309 Compare May 8, 2025 14:04
@eggyal
Copy link
Author

eggyal commented May 8, 2025

Can't reproduce this test failure on my machine. It's also super weird, as it's reported to be a failed assert_eq! at src/storage.rs:189:23 but there's no such assertion anywhere in that file nor any invoked from calls made on that line (which is merely Box::new(self.clone())).

Smells a bit like UB... but unless C::FORCE_INVALIDATION_ON_CACHE_EVICTION is set (which is only the case for the intermediate_result query in the newly added force_invalidation_on_cache_eviction test), none of the changes I've made should have any effect. Indeed, my changes should be entirely optimised away by rustc everywhere else.

But it doesn't appear to be a transient error either, as it has consistently occurred on the last three pushes... and I see that other PRs aren't suffering from it...

Will come back to this once I've had some more coffee.

@eggyal eggyal marked this pull request as draft May 8, 2025 14:43
@eggyal
Copy link
Author

eggyal commented May 8, 2025

I've tried reproducing on aarch64-unknown-linux-gnu (my primary workstation), x86_64-apple-darwin (my laptop) and x86_64-unknown-linux-gnu (a 4-processor VM on the latter, running Ubuntu 24.04.2 per the GitHub runner); all using the same 1.80.0 rust toolchain as the GitHub runner: all three systems successfully pass all 202 tests invoked with cargo nextest run --workspace --all-features --all-targets --no-fail-fast as per the failing GitHub action.

So this definitely seems to be a rather flaky test. Notably the VM that is closest to the GitHub environment completed the 202 tests in just under 22 seconds, versus the failing GitHub action's 1.2 seconds: and given that these are failures in parallel tests, timing could well be significant. Nevertheless, I'm going to push again and see what happens....

@eggyal eggyal force-pushed the force_invalidation_on_cache_eviction branch from 4047309 to b5580ad Compare May 8, 2025 21:29
@eggyal
Copy link
Author

eggyal commented May 8, 2025

Yeah, okay, it's passing CI now despite no changes.

I therefore strongly believe that the failures weren't related to this PR... but it seems likely that there's some preexisting unsoundness somewhere in the project...

@eggyal eggyal marked this pull request as ready for review May 8, 2025 21:36
@MichaReiser
Copy link
Contributor

The need for this feature isn't clear enough to me (why does it have to be in Salsa, what are other approaches). I'll leave this for someone else to review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants