Job service: fix bug in getting incomplete jobs that Ray no longer knows about #1297
What's changing
get_upstream_job_status
get_upstream_job_status
JobStatus which indicates a job is not recoverable (i.e. it was running, but Ray no longer knows of its existence)
get_job to use get_upstream_job_status instead of the Ray client directly
NOTE: This will appear in the Lumigator server log when the job can no longer be found in Ray, and prevents errors from escaping into the UI via the API. No further attempts will be made to contact Ray for status updates of that job.
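For readers skimming the description, the behaviour above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the status name `UNRECOVERABLE`, the `JobNotFoundError` exception, the `FakeRayClient` stand-in and both function signatures are hypothetical, chosen only to show the "job unknown upstream" path.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class JobStatus(str, Enum):
    # Hypothetical subset of statuses; the real enum in Lumigator may differ.
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    UNRECOVERABLE = "unrecoverable"  # assumed name for the new status


class JobNotFoundError(Exception):
    """Stand-in for whatever error the upstream client raises for unknown jobs."""


class FakeRayClient:
    """Hypothetical stand-in for the Ray client; it only knows the jobs it is given."""

    def __init__(self, known_jobs):
        self._known_jobs = dict(known_jobs)

    def get_job_status(self, job_id):
        try:
            return self._known_jobs[job_id]
        except KeyError:
            raise JobNotFoundError(job_id) from None


def get_upstream_job_status(client, job_id):
    """Ask upstream for the status; map 'unknown job' to a status instead of raising."""
    try:
        return JobStatus(client.get_job_status(job_id))
    except JobNotFoundError:
        # Log once on the server side rather than letting the error reach the API.
        logger.warning("Job %s is no longer known upstream; marking it unrecoverable", job_id)
        return JobStatus.UNRECOVERABLE


def get_job(client, stored_status, job_id):
    """Return the stored status if it is final, otherwise refresh it from upstream."""
    final = {JobStatus.SUCCEEDED, JobStatus.FAILED, JobStatus.UNRECOVERABLE}
    if stored_status in final:
        # Final statuses are never refreshed, so upstream is not contacted again.
        return stored_status
    return get_upstream_job_status(client, job_id)
```

In this sketch, treating the new status as final is what stops repeated upstream lookups for a job that has disappeared.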
Closes #1296
How to test it
Steps to test the changes:
For folks who have a database already in this situation:
make local-up
make local-down and make clean-all to remove everything (except the database, it seems, which is good for this test)
make local-up and visit the UI. We shouldn't be getting the wild logs in the server now.
Additional notes for reviewers
Although we're adding a new JobStatus, it will never be surfaced via the API, as the current logic only returns jobs which have a presence in Lumigator's storage AND in Ray (merged).
I already...