Parameterized tests with generator as data source fail on second pytest.main() call #13409

@steadyfirmness

Description:

When using a generator function as the data source for a parameterized test, the second call to pytest.main() fails with an ID mismatch error. The test works correctly on the first run but fails on subsequent runs.

How to reproduce:

Create a test file with the following content:

import pytest

def data_generator():
    yield 1
    yield 2

@pytest.mark.parametrize("bar", data_generator(), ids=lambda x: f"dynamic_{x}")
def test_foo(bar):
    pass

if __name__ == '__main__':
    base_cmd = [
        "-q",
        "--collect-only"
    ]
    pytest.main(base_cmd)  # First run - works
    pytest.main(base_cmd)  # Second run - fails

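Run the test file directly with Python, so that the `__main__` block executes and calls pytest.main() twice in the same process.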

Actual behavior:

First run succeeds:

tests/test_bar.py::test_foo[dynamic_1]
tests/test_bar.py::test_foo[dynamic_2]

Second run fails with:

ERROR collecting tests/test_bar.py
In test_foo: 1 parameter sets specified, with different number of ids: 2

Expected behavior:

Both runs should collect the same two parameterized tests, test_foo[dynamic_1] and test_foo[dynamic_2], without errors.

Versions:

Python: 3.11

pytest: 8.3.5

OS: macOS

Additional context:

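The issue appears to be caused by the generator being exhausted after the first run. The decorator argument data_generator() is evaluated only once, when the test module is first imported, so the parametrize mark appears to hold a single generator object. When collection happens again in the second pytest.main() call, that generator has already been consumed, which leads to the ID mismatch error.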
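The exhaustion itself can be shown in plain Python, without pytest. This is a minimal sketch: `gen`, `first_pass`, and `second_pass` are illustrative names, and it only shows how a generator object behaves when iterated twice, not what pytest does internally.

```python
def data_generator():
    yield 1
    yield 2

# Created once, like the argument passed to the parametrize decorator.
gen = data_generator()

# The first iteration consumes every value.
first_pass = list(gen)

# The second iteration of the same object yields nothing.
second_pass = list(gen)

print(first_pass)   # [1, 2]
print(second_pass)  # []
```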

A workaround is to convert the generator to a list:

@pytest.mark.parametrize("bar", list(data_generator()), ids=lambda x: f"dynamic_{x}")

However, this defeats the purpose of using a generator for memory efficiency with large datasets.
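One possible alternative, sketched here as an assumption and not as a verified pytest workaround, is to wrap the generator function in an object that creates a fresh generator on every iteration. `ReIterable` is a hypothetical helper, not part of pytest, and whether pytest keeps the values lazy after iterating over them is not established by this report.

```python
from typing import Callable, Iterable, Iterator


class ReIterable:
    """Hypothetical helper: calls the factory again on every iteration."""

    def __init__(self, factory: Callable[[], Iterable]) -> None:
        self._factory = factory

    def __iter__(self) -> Iterator:
        # A new generator is created each time, so none is ever reused.
        return iter(self._factory())


def data_generator():
    yield 1
    yield 2


# Pass the function itself, not the result of calling it.
values = ReIterable(data_generator)

first_pass = list(values)
second_pass = list(values)
```

Both passes produce the full sequence here, because each iteration starts from a new generator. The wrapper could then be passed in place of data_generator() in the decorator, assuming pytest accepts an arbitrary iterable there.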
