Fix AsyncGroup.create_dataset() dtype handling and optimize tests #3050 #3059

Open
wants to merge 5 commits into
base: main

@dhruvak001 dhruvak001 commented May 14, 2025

Fixes #3050

  1. Core fix in AsyncGroup.create_dataset():
  • When a dataset was created without data, the method did not handle the required dtype parameter properly
  • Added a validation check:
    -- If dtype is not provided in the arguments
    -- and data is None (so there is no data to infer the dtype from)
    -- then raise an error with the message "dtype must be provided if data is None"
  • This ensures that create_array() always receives the required dtype parameter, preventing errors downstream
  2. Test performance improvements in test_properties.py:
  • The tests test_oindex and test_vindex were timing out because they generated too many complex test cases
  • Made several optimizations:
    -- Removed the time limit by setting deadline=None and suppressing the "too slow" health check
    -- Reduced the complexity of test cases by:
      -- Limiting arrays to a maximum of 3 dimensions (down from 4)
      -- Setting the maximum side length to 8 (to prevent very large arrays)
      -- Adding assumptions to prevent repeated indices in test cases
      -- For test_vindex, limiting the result shape to 2 dimensions
  • These changes are intended to maintain test coverage while making the tests run faster
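The validation described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FakeAsyncGroup` is a hypothetical stand-in for `AsyncGroup`, its `create_array()` only echoes its arguments, and the dtype "inference" from `data` is a crude placeholder.

```python
import asyncio
from typing import Any


class FakeAsyncGroup:
    """Hypothetical stand-in for AsyncGroup; not zarr's real implementation."""

    async def create_array(self, name: str, *, shape: Any, dtype: Any, **kwargs: Any) -> dict:
        # Stand-in for the real create_array(), where dtype is required.
        return {"name": name, "shape": shape, "dtype": dtype}

    async def create_dataset(self, name: str, *, shape: Any, data: Any = None, **kwargs: Any) -> dict:
        # The check described above: no dtype and nothing to infer it from.
        if "dtype" not in kwargs and data is None:
            raise ValueError("dtype must be provided if data is None")
        if "dtype" in kwargs:
            dtype = kwargs.pop("dtype")
        else:
            # Assumption for this sketch: use the first element's type name.
            dtype = type(data[0]).__name__
        return await self.create_array(name, shape=shape, dtype=dtype, **kwargs)


group = FakeAsyncGroup()
created = asyncio.run(group.create_dataset("a", shape=(2,), dtype="int32"))
```

With this stand-in, calling `create_dataset("b", shape=(2,))` without `dtype` or `data` raises the `ValueError` before `create_array()` is reached.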

Please let me know if any changes are needed to the approach in the tests or to the issue fix.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@dhruvak001
Author

pre-commit.ci autofix

@dhruvak001
Author

dhruvak001 commented May 17, 2025

Hi @d-v-b, can you please review the PR? Also, do I need to add separate tests for these code changes?

Contributor

@DimitriPapadopoulos DimitriPapadopoulos left a comment

Perhaps two distinct PRs would have been preferable, but I'm not a maintainer, so you can keep it as is for now.

@d-v-b
Contributor

d-v-b commented May 19, 2025

i'll look at this later today

@d-v-b
Contributor

d-v-b commented May 19, 2025

can you explain some of the changes to the hypothesis tests?

@dhruvak001
Author

@d-v-b When running the tests locally, some particular tests were slow or ran longer than the default deadline. I tried some changes along with the fix, but the result was the same, so I tried to work around it by optimizing the test code.
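For reference, the deadline and health-check settings listed in the PR description look roughly like this in Hypothesis. This is only a sketch on plain Python lists: the test name, the strategies, and the property checked are illustrative assumptions, not the contents of test_properties.py.

```python
from hypothesis import HealthCheck, assume, given, settings
from hypothesis import strategies as st


# deadline=None removes the per-example time limit; suppressing
# HealthCheck.too_slow silences the "too slow" data-generation check.
@settings(deadline=None, suppress_health_check=[HealthCheck.too_slow])
@given(
    values=st.lists(st.integers(), min_size=8, max_size=8),  # bounded size
    picks=st.lists(st.integers(min_value=0, max_value=7), max_size=3),
)
def test_unique_index_pick(values, picks):
    # Discard examples with repeated indices, as the description mentions.
    assume(len(set(picks)) == len(picks))
    picked = [values[i] for i in picks]
    assert len(picked) == len(picks)
    assert all(v in values for v in picked)
```

Disabling the deadline hides slow examples rather than speeding them up, which is one reason to check whether CI actually needs it.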

@d-v-b
Contributor

d-v-b commented May 19, 2025

@dhruvak001 can we see what happens in our CI tests if you roll back the changes to the hypothesis tests?

@dhruvak001
Author

dhruvak001 commented May 19, 2025

@d-v-b yup, it passes even after reverting the changes. Thank you.
And for the codecov patch error, do I need to add a specific test case for the changes in this fix?
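A test that exercises the new error branch might look roughly like the sketch below. It runs against a hypothetical stand-in `create_dataset()` that reproduces only the validation from the PR description; a real test would call `AsyncGroup.create_dataset()` on an actual group, and the project's own test style may differ from the `unittest` style used here.

```python
import asyncio
import unittest


async def create_dataset(name, *, shape, data=None, **kwargs):
    """Hypothetical stand-in reproducing only the new validation branch."""
    if "dtype" not in kwargs and data is None:
        raise ValueError("dtype must be provided if data is None")
    return {"name": name, "shape": shape, **kwargs}


class CreateDatasetDtypeTest(unittest.TestCase):
    def test_missing_dtype_without_data_raises(self):
        # Covers the newly added raise statement.
        with self.assertRaisesRegex(ValueError, "dtype must be provided"):
            asyncio.run(create_dataset("x", shape=(4,)))

    def test_explicit_dtype_is_accepted(self):
        result = asyncio.run(create_dataset("x", shape=(4,), dtype="int8"))
        self.assertEqual(result["dtype"], "int8")
```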

Development

Successfully merging this pull request may close these issues.

AsyncGroup.create_dataset() might call AsyncGroup.create_array() without dtype argument
3 participants