[ENH] Consistent 3D output for single-target point predictions in TimeXer v1 #1936
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main    #1936   +/-  ##
=======================================
  Coverage        ?   85.40%
=======================================
  Files           ?      160
  Lines           ?     9462
  Branches        ?        0
=======================================
  Hits            ?     8081
  Misses          ?     1381
  Partials        ?        0
=======================================
```
I am worried that we are removing tests but not adding tests.
If we are removing an API stepout, then we should remove a stepout in the tests to match, no?
It was an unintended removal, it happened by mistake, sorry. The code stays the same.
ok, so it was a bug! Good that we do reviews.
Related question: is there a stepout we can remove now?
…to a single handling block
With that question in mind, I tried tweaking a few of the tensor shapes, and I was able to reduce the number of stepouts. Now, the default return is a 3D tensor.
You mention you reduced the number of stepouts, but I do not see a change in tests - did you forget to push them?
No, I meant stepouts in the actual model code. We were handling quantile and non-quantile predictions separately inside the model. But with this PR, I have unified the handling to a single format for all types of metrics used (quantile or point). The code for the tests ultimately remains the same, since we are testing the output shape of the `predict` API. If there is a misunderstanding, please let me know.
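For illustration, a minimal sketch of the unified shape contract being described (the sizes and variable names here are illustrative; the actual logic lives inside the `TimeXer` model code):

```python
import torch

# Illustrative sketch: with the unified handling, the network always projects
# to (batch_size, prediction_length, n_quantiles). For a point metric such as
# MAE, n_quantiles is 1; for QuantileLoss it is the number of quantiles.
batch_size, prediction_length = 32, 24

point_output = torch.randn(batch_size, prediction_length, 1)
quantile_output = torch.randn(batch_size, prediction_length, 7)

# Both cases satisfy the same 3D contract, so no per-metric branching is needed.
assert point_output.ndim == quantile_output.ndim == 3
```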
Yes, I meant tests. I think that one of the following must be true:
Seems that all the shapes are well managed.
Repeating my review request:
- if there was a stepout for the forecaster in tests, please remove the stepout
- if there was no stepout, please review the tests to see why there was no stepout and/or why they still passed
Again, then the definition of "non-conformance" that we have applied to
So then my proposal would be to add the test in and skip both `TimeXer` and `NBeats`.
Yeah we can do this, it will require just one line change in the test:

```python
if not isinstance(model, (TimeXer, NBeats)):
    # perform the output test for 3D
```

And afterwards, we fix `NBeats`.
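A rough sketch of how such a skip inside a shape test could look (the helper name and argument list here are hypothetical, not the actual test code):

```python
# Hypothetical sketch of the 3D output check with a per-model skip.
# TimeXer and NBeats are assumed to be imported in the test module.

def check_3d_output_contract(model, output, batch_size, prediction_length):
    if isinstance(model, (TimeXer, NBeats)):
        return  # known non-conformant models; tracked in a follow-up issue
    assert output.ndim == 3
    assert output.shape[0] == batch_size
    assert output.shape[1] == prediction_length
```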
@phoeenniixx, that would work! Although I do not like skips per model inside tests, these are easy to forget there. This indicates that maybe there is too much being tested in a single test method.
I agree, we need some refactoring. I think we should raise an issue as a checklist, so that we don't forget anything. In v2 also, there are a lot of things that can be refactored in the test framework. But these things, imo, should have a lower priority; we should first achieve a stable (and complete) API for v2, just note the things along the way that need refactoring, and we can do this afterwards? As we would anyway need a full refactoring round after the first complete v2, like connecting v1 and v2 etc. We can do these things at that time?
For now, we should focus on API conformance for v1 and design and implementation for v2...
Agreed - what is, in your opinion, @phoeenniixx, the quickest, simplest way that we can ensure:
I could think of two ways:

IMO, the best way for now is the Hardcode way, because this "non-conformance" is a special case that happened due to no concrete contract in the past; but from now on, we have one concrete contract - the output format should be 3D for a single target. And we are already aware of the non-conformant models; we should just raise an issue for them (for `NBeats`). But this is mainly because we right now have a lot on our plate; we should avoid such things in future, once we have a stable API for v2.
What do you think about this, @fkiraly, @PranavBhatP?
Ok, @PranavBhatP, I agree with the arguments that we can cut corners, because the skips will only apply to models from the past. For this I think we should open an issue for `NBeats` and explicitly mention that the test-internal skip should be removed. The first option would imply changes to the framework or tests first - they need refactoring, and I think this would be the better option, if we could make a wish - but it would cause scope creep, and feels like a distraction at the moment. What do you think, @PranavBhatP?
Yes, this approach makes sense as a temporary fix; I will try changing the code for this. In this PR, do I only skip tests for `TimeXer` and `NBeats`?
So, from what I can understand, you are solving the issue for `TimeXer` in this PR?
Yes, @phoeenniixx, that is what I had understood.
Then in that case, PR #1951 handles the changes specific to the tests in `test_timexer.py`.
Yeah, I think this makes sense, right @fkiraly?
Yes, sounds like a good plan to me too.
…1936 (#1951)

#### What does this implement/fix? Explain your changes.

Adds the test for the output shapes of the forward method of `TimeXer`. The test checks the output contract for the model, which is being implemented in #1936. Currently skipped under `test_timexer.py` due to obvious causes of failure, i.e., the current version of the model on main is not yet updated to the required output contract.
this commit removes the test skip tag added in PR sktime#1966
This PR has been updated using the merged PR #1966, and I am running tests for `TimeXer`.
👍
```diff
 self.enc_in = len(self.reals)

-self.n_quantiles = None
+self.n_quantiles = 1
```
Can you please clarify why we are doing `n_quantiles = 1` and not `n_quantiles = None` here? From what I have seen so far in the package, we always use `n_quantiles = None` when not using `QuantileLoss`, and making `n_quantiles = 1` may change that contract? Although it won't affect the working, I guess, for the user this can be misleading - that `QuantileLoss` is being used, even when it is not?
No, `self.n_quantiles` is being set to 1 because in the case of a point prediction, the median quantile is what is being predicted. So instead of setting `self.n_quantiles = None` and having step-outs for checking the value of `self.n_quantiles` inside the layer architecture, we simply set it to 1 and let the loss handle the rest.

For some context, setting `self.n_quantiles = 1` simplifies how we handle the distinction between a loss like `MAE` and `QuantileLoss`. Refer to this diff - https://github.com/sktime/pytorch-forecasting/pull/1936/files#diff-8c4bc78319ca93d3c0e93a38a0ee551ac3c4193f717955c0ce3477e4147a9153

TL;DR: it removes unnecessary "if" statements and step-outs in the logic for `TimeXer`'s output.

Maybe a comment would clarify this in the code; I will add it, thanks for highlighting this.
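As a hedged sketch of that idea (the helper `resolve_n_quantiles` is hypothetical; in the PR the equivalent logic sits in the model's initialisation):

```python
from pytorch_forecasting.metrics import MAE, QuantileLoss

def resolve_n_quantiles(loss) -> int:
    """Sketch: derive the size of the last output dimension from the loss."""
    if isinstance(loss, QuantileLoss):
        return len(loss.quantiles)
    # a point loss such as MAE predicts one value (the median) per time step,
    # so the output layer still projects to a single "quantile"
    return 1

print(resolve_n_quantiles(QuantileLoss()))  # 7 with the default quantile set
print(resolve_n_quantiles(MAE()))           # 1
```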
Ohh, I see, thanks!

If this is working, maybe then we should refactor other places as well where this `if/else` block is being used because of `n_quantiles = None`? I am just thinking it would be good if everything were consistent with the rest of the API...
It's just, I am a bit concerned about the meaning here - if a person is using this estimator, they would see `n_quantiles = 1` even for point predictions (I am talking about the user who is just using the estimator and not actually reading the code), and this may confuse the user: "why is this param not `None` like it is for other estimators in `ptf`?"

Also, I don't think this `if/else` block would be much of an overhead, so if using this block keeps the user from getting confused, I think we should prefer that.

But that is my thought; please let me know what your reasoning is here...
> It removes unnecessary "if" statements and step-outs in the logic for TimeXer's output.

The reasoning is as simple as this :). Since we are standardising the output format, it makes more sense if we renamed this attribute to `self.output_dim` or something like that, instead of `self.n_quantiles`. That way we have a more generic meaning and assign different values to it only when there is a step-out to a different case (`QuantileLoss` in our case).
Could you kindly explain what API or parameter change exactly we are talking about?
So, in this PR @PranavBhatP is changing the convention of using `n_quantiles = None` (as currently happens in the package) when the user is not using `QuantileLoss` (@PranavBhatP please correct me if I understood it wrongly); rather, `n_quantiles = 1` is being introduced in the absence of `QuantileLoss` (see comments #1936 (comment) and #1936 (comment)), and I was concerned about the meaning it could convey (see comment #1936 (comment)).

He suggested we could rename `n_quantiles` to `output_dim`, and that would solve the issue of the meaning. But again, this would lead to changes to the code across the whole package. And it would be breaking, as a param name is being changed; so, as you said, we could deprecate it, but I think we should introduce it (if we decide to) in v2 and keep v1 intact, as v1 is already mature enough, and it may not even complete the deprecation cycle, as before that we may completely move to v2. I am assuming the deprecation cycle will complete after 2 releases; hopefully we might release a full-fledged v2 by then. So, I think if we are going to make this change, we should do it in v2.

I am still not very sure if this would be a good idea; that's why I wanted you and @agobbifbk to comment your thoughts on this change.
For me `output_dim` is a good idea, it is more general. I don't think it is necessary to deprecate it if we rename, because it is not a parameter that the user can modify, is it? It just depends on the chosen loss function(s), right?
I agree, but I am assuming `n_quantiles` is right now used in the documentation to define the shape of the output when we use `QuantileLoss`, so I guess this arg is visible to the user in the documentation? If yes, we might need to deprecate it? I am not sure what the process is here 😅
I think a design document is the right process. Write up the vignettes now - and analyze whether `n_quantiles` even appears in them.

Then propose alternative vignettes for future use, and from that develop a change and/or deprecation trajectory.
#### Reference Issues/PRs

#### What does this implement/fix? Explain your changes.
Small change in the code to provide a 3D output tensor for point predictions, with the 3rd dimension set to 1. With this change, the output contract for `TimeXer` is (illustrated in the sketch below):

- `(batch_size, predictions, 1)`, where the 3rd dimension indicates a single target.
- `(batch_size, predictions, num_quantiles)`, where the 3rd dimension indicates the number of quantiles for which the output is generated.
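For a quick illustration of what this contract means in practice (a sketch with assumed example sizes, not the output of an actual run):

```python
import torch

# Assumed example sizes, for illustration only.
batch_size, predictions = 16, 12

# Point prediction: the trailing dimension of size 1 marks a single target ...
point_pred = torch.zeros(batch_size, predictions, 1)
print(point_pred.shape)  # torch.Size([16, 12, 1])

# ... and squeezing it recovers the familiar 2D (batch, horizon) view.
print(point_pred.squeeze(-1).shape)  # torch.Size([16, 12])

# Quantile prediction: the trailing dimension is the number of quantiles.
quantile_pred = torch.zeros(batch_size, predictions, 7)
print(quantile_pred.shape)  # torch.Size([16, 12, 7])
```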
#### What should a reviewer concentrate their feedback on?

#### PR checklist

- Install the hooks via `pre-commit install`. To run hooks independent of commit, execute `pre-commit run --all-files`.