
Molecule de-duplication lost in Topology roundtrips #2034


Open
mattwthompson opened this issue Mar 18, 2025 · 0 comments
Labels
polymer-performance Runtime of loading and/or parametrizing (bio)polymers


Describe the bug

Molecule de-duplication is needed to make several API points useful but can be very slow, so its results are cached after the first call. This cache, however, does not survive some roundtrips, such as copying a Topology through its constructor.

For better or worse, duplicating a Topology object using its constructor is a supported feature and is commonly used downstream. Losing the cache forces it to be rebuilt, which removes some of the benefit of caching it in the first place.
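The failure mode can be illustrated with a toy class. This is only a sketch of the general pattern, not the toolkit's implementation: `ToyTopology`, `identical_groups`, `_cached_groups`, and `rebuild_count` are hypothetical names, and comparing SMILES strings stands in for the expensive de-duplication step.

```python
class ToyTopology:
    """Hypothetical stand-in for a container with a lazily built grouping cache."""

    def __init__(self, other=None):
        # Copy-constructor path: only the molecule list is carried over.
        self.molecules = list(other.molecules) if other is not None else []
        self._cached_groups = None  # the cache always starts out empty
        self.rebuild_count = 0  # counts how often the expensive step runs

    @property
    def identical_groups(self):
        # Build the grouping on first access, then reuse it.
        if self._cached_groups is None:
            self.rebuild_count += 1
            groups = {}
            for index, smiles in enumerate(self.molecules):
                groups.setdefault(smiles, []).append(index)
            self._cached_groups = groups
        return self._cached_groups


original = ToyTopology()
original.molecules = ["C", "CC", "C"]
_ = original.identical_groups  # builds the cache
_ = original.identical_groups  # served from the cache, no rebuild

duplicate = ToyTopology(original)
_ = duplicate.identical_groups  # the copy started with no cache, so it rebuilds
```

In this sketch the original computes its grouping once, while every copy made through the constructor pays the full cost again on first access.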

To Reproduce

>>> from openff.toolkit import Topology, Molecule
>>> topology = Topology.from_molecules([Molecule.from_smiles(n * "C") for n in range(1, 10)])
>>> _ = topology.identical_molecule_groups
>>> topology._cached_chemically_identical_molecules is None
False
>>> Topology(topology)._cached_chemically_identical_molecules is None
True

Additional context

One could copy the cache over manually

>>> new_topology = Topology(topology)
>>> new_topology._cached_chemically_identical_molecules = topology._cached_chemically_identical_molecules

but this is unstable because it relies on a private attribute
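Until the constructor preserves the cache, the manual workaround above could be wrapped in a small guarded helper. This is a minimal sketch, not part of the toolkit: `copy_topology_with_cache` and `CACHE_ATTR` are hypothetical names, and the helper assumes only that the object can be duplicated by passing it to its own constructor and that the private attribute named in this report may or may not exist.

```python
import copy

# Private attribute name taken from this report; it may change between releases.
CACHE_ATTR = "_cached_chemically_identical_molecules"


def copy_topology_with_cache(topology):
    """Duplicate `topology` through its constructor and carry the cache over.

    Hypothetical helper, not part of the toolkit.
    """
    new_topology = type(topology)(topology)
    cached = getattr(topology, CACHE_ATTR, None)
    # Only transfer when the source has a cache and the attribute still exists
    # on the copy; otherwise keep the plain constructor behaviour.
    if cached is not None and hasattr(new_topology, CACHE_ATTR):
        # Shallow-copy so later changes to one cache do not leak into the other.
        setattr(new_topology, CACHE_ATTR, copy.copy(cached))
    return new_topology
```

With the toolkit installed, `copy_topology_with_cache(topology)` would take the place of `Topology(topology)`. If the private attribute is renamed or removed, the helper silently falls back to an uncached copy instead of raising an error.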

mattwthompson added the polymer-performance (Runtime of loading and/or parametrizing (bio)polymers) label on Mar 18, 2025